Hal Universal

Private AI with Memory

Free · Designed for iPad. Not verified for macOS.

On-device AI with persistent memory, multiple model voices, and full transparency about how it works. Nothing leaves your device when running local models.

Hal Universal

Private. Powerful. Personal.

Hal Universal is an on-device AI assistant for iPhone, iPad, and Mac. A thoughtful, private companion for conversation, reflection, and creativity, and a window into how AI actually works.

Multiple AI Voices

Hal ships with Apple Intelligence built in: always available, no download required. For users who want a fully private on-device experience, Hal offers a curated library of tested local models: Gemma 4 E2B, Llama 3.2 3B, Qwen 3.5 2B, and Dolphin 3.0. Each has a distinct voice. Download any or all. They run entirely on your device, with nothing sent to any server.

Genuinely Private

When you use a local MLX model, your conversations never leave your iPhone. No network calls. No server. When using Apple Intelligence, inference is on-device when offline; when connected, Apple's Private Cloud Compute may be used (encrypted in transit, processed in non-persistent memory on Apple-controlled servers).

Memory That Persists

Hal remembers across conversations. A semantic memory system weighted by recency and relevance retrieves what's useful when it's useful. The optional Nomic retrieval upgrade dramatically improves recall on specific facts (opt-in in the Model Library, 522 MB). When a turn's context would exceed the active model's window, Hal posts a brief explanation in the conversation rather than silently dropping content. When self-knowledge sections are compressed to fit, a small "condensed" badge appears in the message footer.

Self Model

Hal builds a structured self-model over time from patterns it notices in your conversations: values, preferences, themes. You can browse the Self Model viewer to see what Hal has crystallized about you, and toggle each entry between private (visible only to you) and shareable (available for export). Every entry is editable. Nothing leaves your device without your action.

Salon Mode

Put up to four AI voices in conversation. Independent mode (each voice isolated) or context-aware (each builds on what came before). Accessible in Settings → Power User Mode.

Transparency as Architecture

Hal shows you the work. Token counts, memory retrieval, model identity, inference timing, compression events: all visible. When Hal remembers something, it tells you how. When it doesn't know something, it says so.

Power User Controls

Tune memory depth, recency weighting, semantic similarity, half-life decay, temperature, and RAG settings. Per-model settings: each AI voice remembers your preferences independently.

For Users Who Want to Understand AI

Hal is an educational window into how large language models actually work. Not a black box. A transparent system you can inspect, tune, and learn from.

Hardware note: Local MLX models are validated on iPhone 16 family and iPhone 17 family. iPhone 15 Pro should work. Older devices may run slowly or not at all. A hardware disclosure appears on first download attempt.
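
For readers curious how "recency weighting", "semantic similarity", and "half-life decay" might fit together, here is a minimal Python sketch. It is not Hal's actual implementation: the function names, the 30-day half-life, the 0.3 recency mix, and the linear blend of the two signals are all assumptions chosen for illustration.

```python
import math

def recency_weight(age_days, half_life_days):
    """Exponential decay: the weight halves every `half_life_days`."""
    return 0.5 ** (age_days / half_life_days)

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def score_memory(query_vec, memory_vec, age_days,
                 half_life_days=30.0, recency_mix=0.3):
    """Blend relevance and recency; `recency_mix` is a hypothetical tuning knob."""
    relevance = cosine_similarity(query_vec, memory_vec)
    recency = recency_weight(age_days, half_life_days)
    return (1 - recency_mix) * relevance + recency_mix * recency

def retrieve(query_vec, memories, top_k=3, **knobs):
    """Return the `top_k` memories with the highest blended score."""
    ranked = sorted(
        memories,
        key=lambda m: score_memory(query_vec, m["vec"], m["age_days"], **knobs),
        reverse=True,
    )
    return ranked[:top_k]
```

With two equally relevant memories, the more recent one ranks first; raising `half_life_days` or lowering `recency_mix` makes older memories more competitive.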

Version 2.0: significant update.
- Four curated local AI models now available: Gemma 4 E2B, Llama 3.2 3B, Qwen 3.5 2B, and Dolphin 3.0. Each runs entirely on your device.
- Self Model: Hal now builds a structured self-model of you over time from patterns in conversation. Browse and edit it in Settings → Self Model. Each entry is private by default.
- Salon Mode: put up to four AI voices in conversation. Independent and context-aware modes.
- Per-model settings: temperature, memory depth, RAG configuration saved independently for each model.
- Upgraded retrieval (optional): Nomic Embed Text v1.5 available as an opt-in via Model Library, with significantly better recall on specific facts than the default. 522 MB.
- Per-turn memory pre-flight: when a turn would exceed the active model's working memory, Hal explains why in chat rather than crashing.
- Real token streaming: responses stream as they generate.
- Background downloads: model downloads continue while the app is backgrounded or the phone is locked.
- Unified model status dots: one dot, one meaning across all screens.
- Significant performance and stability improvements.
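
The per-turn memory pre-flight described above can be pictured as a token-budget check that runs before the model is called. The Python sketch below is illustrative only and is not taken from the app: `PreflightResult`, `preflight`, the section names, and the 512-token reply reserve are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class PreflightResult:
    fits: bool          # True when the turn fits the model's window
    total_tokens: int   # tokens requested by all prompt sections combined
    message: str        # explanation to show in chat when the turn does not fit

def preflight(section_tokens, context_window, reserve_for_reply=512):
    """Check a turn against the context window before running inference.

    `section_tokens` maps a prompt section name (for example "system",
    "memory", "user") to its token count. `reserve_for_reply` keeps room
    for the model's answer; the default is an assumed value.
    """
    total = sum(section_tokens.values())
    budget = context_window - reserve_for_reply
    if total <= budget:
        return PreflightResult(True, total, "")
    over = total - budget
    largest = max(section_tokens, key=section_tokens.get)
    message = (
        f"This turn needs {total} tokens but only {budget} are available "
        f"({over} over). Largest section: {largest}."
    )
    return PreflightResult(False, total, message)
```

Returning a message instead of raising an error lets the caller post the explanation in the conversation, which matches the behavior the release note describes.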

The developer, Mark Friedlander, indicated that the app’s privacy practices may include handling of data as described below. For more information, see the developer’s privacy policy.

  • Data Not Collected

    The developer does not collect any data from this app.

    Privacy practices may vary based, for example, on the features you use or your age.

    The developer has not yet indicated which accessibility features this app supports.

    Provider
    • Mark Friedlander
    Size
    • 28.9 MB
    Category
    • Productivity
    Compatibility
    Requires iOS 26.0 or later.
    • iPhone
      Requires iOS 26.0 or later.
    • iPad
      Requires iPadOS 26.0 or later.
    • Mac
      Requires macOS 26.0 or later and a Mac with Apple M1 chip or later.
    • Apple Vision
      Requires visionOS 26.0 or later.
    Languages
    • English
    Age Rating
    4+
    Copyright
    • © 2025-2026 Mark Friedlander