
Private LLM - Local AI Chat 13+
Local Offline Private AI Chat
Numen Technologies Limited
$9.99
Description
Meet Private LLM: Your Secure, Offline AI Assistant for iPhone, iPad, and Mac
Private LLM brings advanced AI capabilities directly to your iPhone, iPad, and Mac—all while keeping your data private and offline. With a one-time purchase and no subscriptions, you get a personal AI assistant that works entirely on your device.
Key Features:
- Local AI Functionality: Interact with a sophisticated AI chatbot without needing an internet connection. Your conversations stay on your device, ensuring complete privacy.
- Wide Range of AI Models: Choose from various open-source LLMs such as Llama 3.2, Llama 3.1, Google Gemma 2, Microsoft Phi-3, Mistral 7B, and StableLM 3B. Each model is optimized for iOS and macOS hardware using advanced OmniQuant quantization, which offers superior performance compared to traditional RTN (round-to-nearest) quantization methods.
- Siri and Shortcuts Integration: Create AI-driven workflows without writing code. Use Siri commands and Apple Shortcuts to enhance productivity in tasks like text parsing and generation.
- No Subscriptions or Logins: Enjoy full access with a single purchase. No need for subscriptions, accounts, or API keys. Plus, with Family Sharing, up to six family members can use the app.
- AI Language Services on macOS: Utilize AI-powered tools for grammar correction, summarization, and more across various macOS applications in multiple languages.
- Superior Performance with OmniQuant: Benefit from the advanced OmniQuant quantization process, which preserves the model's weight distribution for faster and more accurate responses, outperforming apps that use standard quantization techniques.
Supported Model Families:
- DeepSeek R1 Distill based models
- Phi-4 14B model
- Llama 3.3 70B based models
- Llama 3.2 based models
- Llama 3.1 based models
- Llama 3.0 based models
- Google Gemma 2 based models
- Qwen 2.5 based models (0.5B to 32B)
- Qwen 2.5 Coder based models (0.5B to 32B)
- Google Gemma 3 1B based models
- Solar 10.7B based models
- Yi 34B based models
For a full list of supported models, including detailed specifications, please visit privatellm.app/models.
Private LLM is a better alternative to generic llama.cpp and MLX wrapper apps like Enchanted, Ollama, LLM Farm, LM Studio, RecurseChat, etc., on three fronts:
1. Private LLM uses a significantly faster mlc-llm based inference engine.
2. All models in Private LLM are quantized with the state-of-the-art OmniQuant quantization algorithm, while competing apps use naive round-to-nearest (RTN) quantization (see the sketch after this list).
3. Private LLM is a fully native app built with C++, Metal, and Swift, while many competing apps are bloated, non-native Electron-based apps.
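For readers curious what the "naive round-to-nearest quantization" in point 2 refers to, here is a minimal, illustrative Swift sketch of plain RTN 4-bit quantization. It is a toy under simplifying assumptions (the function names and the single-row grouping are invented for the example), not Private LLM's actual OmniQuant pipeline, which additionally learns clipping and scaling parameters to reduce reconstruction error.

```swift
import Foundation

// Illustrative round-to-nearest (RTN) 4-bit quantization of one weight row.
// RTN simply maps each weight onto a uniform grid spanning the row's min/max.
func rtnQuantize4Bit(_ weights: [Float]) -> (codes: [UInt8], scale: Float, zeroPoint: Float) {
    guard let minW = weights.min(), let maxW = weights.max(), maxW > minW else {
        return (Array(repeating: UInt8(0), count: weights.count), 1, 0)
    }
    let levels: Float = 15                          // 4 bits -> 16 levels (0...15)
    let scale = (maxW - minW) / levels              // step between adjacent levels
    let zeroPoint = minW
    let codes = weights.map { w in
        UInt8(((w - zeroPoint) / scale).rounded())  // round to the nearest level
    }
    return (codes, scale, zeroPoint)
}

// Dequantize back to floats so the reconstruction error can be inspected.
func dequantize(_ codes: [UInt8], scale: Float, zeroPoint: Float) -> [Float] {
    codes.map { Float($0) * scale + zeroPoint }
}

// A weight row with one outlier: RTN stretches the quantization grid to cover
// the outlier and wastes precision on the small, typical weights.
let row: [Float] = [0.02, -0.01, 0.03, -0.04, 0.01, 0.9]
let (codes, scale, zero) = rtnQuantize4Bit(row)
let recovered = dequantize(codes, scale: scale, zeroPoint: zero)
let maxError = zip(row, recovered).map { abs($0.0 - $0.1) }.max() ?? 0
print("codes: \(codes), max reconstruction error: \(maxError)")
```

The outlier weight forces the RTN grid to stretch across the whole range, so the small, typical weights collapse onto just a few levels; reducing exactly this kind of reconstruction error is what learned-clipping approaches such as OmniQuant target.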
Please note that Private LLM only supports inference with text-based LLMs.
Private LLM has been specifically optimized for Apple Silicon Macs and delivers the best performance on Macs with an Apple M1 chip or newer. Users on older Intel Macs without eGPUs may experience reduced performance. Please note that although the app nominally works on Intel Macs, we've stopped adding support for new models on Intel Macs due to the performance limitations of Intel hardware.
What’s New
Version 1.9.11
- Support for two Qwen3 4B Instruct 2507-based models: Qwen3 4B Instruct 2507 abliterated and Josiefied Qwen3 4B Instruct 2507 (on Apple Silicon Macs with 16 GB or more RAM)
- Fix for a rare crash in the Settings panel on some Macs.
- Minor bug fixes and updates
Ratings and Reviews
Good for the end of Shortcuts
Really interesting app! Thank you!
Just note that like in the various sample shortcuts, it’s best used at the end of your shortcut and then drop the result to the clipboard. I could never convince it to open another app afterwards and put the result in there directly.
Intriguing but
Writes excellent English and responds in roughly the same time as it would take to manually type the response. A long input, eg re-write this text (a whole web page), will often have no response at all.
I don’t know how it manages to work offline but it seems to work in airplane mode.
Generally the answers seem a bit limited and even more factually suspect than with ChatGPT and it will often insist on the same answer after being corrected. (The PM of NZ is Jacinda Ardern)
I’m hoping that with Shortcuts support coming it could be good for writing prettier versions of rough text.
Developer Response
Thanks for the feedback! The app currently contains a quantised 3B parameter decoder-only (aka GPT) LLM that runs on your device, and the app makes no network connections whatsoever. This is how it works in Airplane mode, while apps like ChatGPT cannot. This is also the reason why the app is a fairly large download (1.6 GB), even with data compression.
The context length of the current model in the app is 2048 tokens, or about 1500 words (a token roughly corresponds to ~0.75 words). The context length of an LLM is the maximum amount of text that the model can attend to. For comparison, the baseline GPT-3.5 and GPT-4 models that ChatGPT uses have context lengths of 4096 and 8192 tokens, respectively. In some ways it isn't a fair comparison, because the former (our app) is an app that runs on your iPhone, while the latter needs multiple large servers, and an active internet connection from your phone to those servers, to run. Anyway, improvements to the context length are within the realm of possibility and an active area of research. I can't promise anything, but we might be able to increase the context length a bit soon. We're also experimenting with newer, larger models which have longer context lengths, although they'll only work on newer iPhones and iPads.
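As a rough back-of-the-envelope illustration of that token-to-word conversion, the arithmetic looks like this in Swift (the ~0.75 words-per-token figure is only an approximation and varies by tokenizer, language, and text):

```swift
// Rough token-to-word arithmetic using the ~0.75 words-per-token rule of thumb.
let wordsPerToken = 0.75

func approximateWordCount(tokens: Int) -> Int {
    Int((Double(tokens) * wordsPerToken).rounded())
}

print(approximateWordCount(tokens: 2048))  // ~1536, i.e. "about 1500 words"
print(approximateWordCount(tokens: 4096))  // GPT-3.5 baseline context, ~3072 words
print(approximateWordCount(tokens: 8192))  // GPT-4 baseline context, ~6144 words
```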
WRT the model's factual knowledge: since the app doesn't connect to the internet, its knowledge is limited to what the model's training data contained. Incidentally, I asked ChatGPT (3.5) the same question, and it came up with the same incorrect response that you noted.
Shortcuts integration will indeed ship later this week, we hope you'll like it!
App Privacy
The developer, Numen Technologies Limited, indicated that the app’s privacy practices may include handling of data as described below. For more information, see the developer's privacy policy.
Data Not Collected
The developer does not collect any data from this app.
Privacy practices may vary, for example, based on the features you use or your age.
Information
- Seller
- Numen Technologies Limited
- Size
- 1.3 GB
- Category
- Utilities
- Compatibility
- iPhone: Requires iOS 17.0 or later and a device with the A12 Bionic chip or later.
- iPad: Requires iPadOS 17.0 or later and a device with the A12 Bionic chip or later.
- Mac: Requires macOS 14.0 or later.
- Apple Vision: Requires visionOS 1.0 or later and a device with the A12 Bionic chip or later.
- Languages
- English
- Age Rating
- Infrequent Mature or Suggestive Themes, Horror/Fear Themes, Medical Treatment Information
- Copyright
- © 2024 Numen Technologies Limited
- Price
- $9.99
Supports
- Family Sharing
Up to six family members can use this app with Family Sharing enabled.