LLM Server

商業

$220.00 · 專為 iPad 設計。尚未針對 macOS 驗證。

Your Phone Is Now an AI Server LLM Server turns your iPhone or iPad into a private AI inference server. Run large language models entirely on-device, expose an OpenAI-compatible API to your local network, and chat with your model from any browser on any device — laptop, desktop, tablet, even another phone. No cloud, no subscription, no data leaving your hardware. Chat From Any Browser on Your Network Start the server and open the URL on any device sharing your Wi-Fi. The built-in web interface gives you a clean chat UI instantly — no client app to install, no account to create. Your laptop, your partner's tablet, a colleague's machine: all talking to a model running on your phone. OpenAI-Compatible API, Drop-In Ready A standard OpenAI-compatible API means LLM Server slots into any tool you already use — Continue, Open WebUI, LangChain, custom scripts, the lot. Chat completions, text completions, streaming (SSE), and model listing endpoints all supported. Ollama CLI commands work too. Fully Offline, Fully Yours - Download a GGUF model once and you're done with the cloud. Inference runs locally on Apple Metal GPU with no internet required. Your prompts, your conversations, your data — none of it leaves the device. Airplane mode works fine. Any GGUF Model From Hugging Face - Browse and download directly from Hugging Face with built-in search, or import your own files. LLaMA, Mistral, Phi, Gemma, Qwen, DeepSeek, and every other llama.cpp-supported architecture runs out of the box. Background downloads with progress tracking so you can keep working. Enterprise-Grade Security - TLS/HTTPS encryption — generate self-signed certificates or import your own chain and private key. - API key authentication — Bearer tokens with per-key management. Generate cryptographically secure keys or bring your own. - Bind control — lock to localhost, open to your LAN, or pin to a specific interface. Tune Every Knob Full control over generation: context size up to 32K, temperature, top-p, top-k, repeat penalty, frequency and presence penalties, max tokens, seed, GPU layer offloading, and thread count. Save presets globally or per model. Smart Resource Management Your phone stays responsive under load. Real-time thermal monitoring with automatic thread reduction under pressure and request rejection at critical temperatures. Memory-aware model loading with conservative budgeting. Configurable request queues and per-request timeouts. Built for Developers - Live API docs with copy-paste curl examples - Structured logging (debug, info, warning, error) - One-tap copy for server addresses and API keys What's Inside Dashboard with one-tap server control, model manager with download progress, complete settings hub (server, inference, security, API keys, developer tools), and guided onboarding for first-time setup.

  • 此 App 尚未收到足夠的評分或評論,無法顯示概覽。

- Added the support of vision models - Added support of Llama RPC Nodes (rpc node App coming soon)

開發者Linosec指出App的隱私權實務可能包含下方描述的資料處理。如需更多資訊,請參閱 開發者的隱私權政策

  • 不收集資料

    開發者不會從這個 App 收集任何資料。

    依據你使用的功能或你的年齡等因素,隱私權實務可能有所不同。 更多內容

    開發者尚未表示此 App 支援的輔助使用功能。 進一步瞭解

    供應商
    • Linosec
    大小
    • 569.5 MB
    類別
    • 商業
    相容性
    需要 iOS 16.4 或以上版本。
    • iPhone
      需要 iOS 16.4 或以上版本。
    • iPad
      需要 iPadOS 16.4 或以上版本。
    • Mac
      需要 macOS 13.3(或以上版本)以及配備 Apple M1(或以上版本)晶片的 Mac。
    • Apple Vision
      需要 visionOS 1.0 或以上版本。
    語言
    • 英文
    年齡分級
    18+
    • 18+
    • 偶爾
      現實暴力
      粗話或低俗幽默
      成人或暗示性主題
      驚悚/恐怖題材
      醫療資訊
      使用或提及酒精、菸草或毒品
      色情內容或裸露
      槍支或其他武器

      包含
      使用者生成內容
      健康或保健主題
    版權
    • © Linosec