LLM Server

LLM Server & API on your LAN

$4.99 · Designed for iPad. Not verified for macOS.

Run LLMs privately on iPhone or iPad. Browser chat UI on your Wi-Fi, API, offline Metal GPU inference. LLaMA, Mistral, Qwen, DeepSeek, any GGUF from Huggingface Your Phone Is Now an AI Server LLM Server turns your iPhone or iPad into a private AI inference server. Run large language models entirely on-device, expose an OpenAI-compatible API to your local network, and chat with your model from any browser on any device — laptop, desktop, tablet, even another phone. No cloud, no subscription, no data leaving your hardware. Chat From Any Browser on Your Network Start the server and open the URL on any device sharing your Wi-Fi. The built-in web interface gives you a clean chat UI instantly — no client app to install, no account to create. Your laptop, your partner's tablet, a colleague's machine: all talking to a model running on your phone. OpenAI-Compatible API, Drop-In Ready A standard OpenAI-compatible API means LLM Server slots into any tool you already use — Continue, Open WebUI, LangChain, custom scripts, the lot. Chat completions, text completions, streaming (SSE), and model listing endpoints all supported. Ollama CLI commands work too. Fully Offline, Fully Yours - Download a GGUF model once and you're done with the cloud. Inference runs locally on Apple Metal GPU with no internet required. Your prompts, your conversations, your data — none of it leaves the device. Airplane mode works fine. Any GGUF Model From Hugging Face - Browse and download directly from Hugging Face with built-in search, or import your own files. LLaMA, Mistral, Phi, Gemma, Qwen, DeepSeek, and every other llama.cpp-supported architecture runs out of the box. Background downloads with progress tracking so you can keep working. Enterprise-Grade Security - TLS/HTTPS encryption — generate self-signed certificates or import your own chain and private key. - API key authentication — Bearer tokens with per-key management. Generate cryptographically secure keys or bring your own. - Bind control — lock to localhost, open to your LAN, or pin to a specific interface. Tune Every Knob Full control over generation: context size up to 32K, temperature, top-p, top-k, repeat penalty, frequency and presence penalties, max tokens, seed, GPU layer offloading, and thread count. Save presets globally or per model. Smart Resource Management Your phone stays responsive under load. Real-time thermal monitoring with automatic thread reduction under pressure and request rejection at critical temperatures. Memory-aware model loading with conservative budgeting. Configurable request queues and per-request timeouts. Built for Developers - Live API docs with copy-paste curl examples - Structured logging (debug, info, warning, error) - One-tap copy for server addresses and API keys What's Inside Dashboard with one-tap server control, model manager with download progress, complete settings hub (server, inference, security, API keys, developer tools), and guided onboarding for first-time setup.

  • This app hasn’t received enough ratings or reviews to display an overview.

- Fixed the runtime for Gemma 4 models

The developer, Linosec, indicated that the app’s privacy practices may include handling of data as described below. For more information, see the developer’s privacy policy .

  • Data Not Collected

    The developer does not collect any data from this app.

    Privacy practices may vary, for example, based on the features you use or your age. Learn More

    The developer has not yet indicated which accessibility features this app supports. Learn More

    Seller
    • Linosec
    Size
    • 570.4 MB
    Category
    • Business
    Compatibility
    Requires iOS 16.4 or later.
    • iPhone
      Requires iOS 16.4 or later.
    • iPad
      Requires iPadOS 16.4 or later.
    • Mac
      Requires macOS 13.3 or later and a Mac with Apple M1 chip or later.
    • Apple Vision
      Requires visionOS 1.0 or later.
    Languages
    • English
    Age Rating
    18+
    • 18+
    • Infrequent
      Realistic Violence
      Profanity or Crude Humor
      Mature or Suggestive Themes
      Horror/Fear Themes
      Medical Treatment information
      Alcohol, Tobacco, Drug Use or References
      Sexual Content or Nudity
      Guns or Other Weapons

      Contains
      User-Generated Content
      Health or Wellness Topics
    Copyright
    • © Linosec