NoemaAI

Offline AI with Datasets

Free

Gemma 4 support, better GGUF imports, smarter retrieval, live prompt progress, and new memory plus system prompt customization.

Noema brings large language model intelligence to all your devices, fully offline. Download lightweight models directly from Hugging Face, connect supported remote endpoints, and pair models with curated textbooks and your own PDFs or EPUBs. The privacy-first design means your data never leaves your device when running locally, whether you are on iPhone, Mac, or visionOS.

- Native macOS app: Run the full Noema experience on your desktop with a rebuilt interface that feels at home on macOS.
- visionOS support: Use Noema in spatial computing environments, with windows you can place around your workspace.
- Noema Relay: Connect your iPhone to your Mac via CloudKit, with no local Wi-Fi required, so one device can host a model while another acts as the client.
- Vision support for models: Attach photographs to your prompts and use multimodal models for on-device image understanding and analysis.
- Open Textbook Library integration: Browse and import entire textbooks through the built-in Explore view; Noema indexes them locally so you can search and retrieve relevant passages on demand.
- Bring your own data: Add personal documents in PDF or EPUB format, which are embedded and indexed on-device to power retrieval-augmented generation.
- Integrated Hugging Face search: Discover and install quantized models from the Hugging Face Hub with one-tap installation, automatic dependency management, and real-time download progress.
- Remote model support: Connect to supported remote endpoints including OpenRouter and LM Studio, with updated LM Studio REST v1 compatibility and a smoother model download flow through Explore.
- Expanded model runtime support: Run models through the GGUF, MLX, ExecuTorch, and CoreML runtimes as well as Apple Foundation Models, giving you flexible on-device options across Apple hardware.
- RAM check and model size helper: A built-in advisor estimates each model’s memory footprint and shows whether it fits your device’s budget; it can also estimate the maximum context length that fits in RAM.
- Advanced settings for power users: Fine-tune context length, quantization, and GPU acceleration; enable tool calling for built-in search and other functions; and customize model parameters for optimal performance.
- Built-in tool calling and Python support: Use integrated tools, including Python, to extend model capabilities for more advanced workflows.
- Built-in search and RAG: Use integrated search tools and retrieval-augmented generation to query your data without hitting context limits.
- Localization upgrades: Experience Noema in 10 languages, so international teams can work in the interface that suits them best.
- Private and offline by default: Local models run entirely on-device, and your conversations and files stay on your device unless you choose to use a connected remote provider.
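To make the retrieval-augmented generation idea above concrete, here is a minimal sketch of the general technique: split a document into chunks, score each chunk against a question, and keep the best matches to pass to a model. It is not Noema's implementation. The function names (`chunk`, `retrieve`) are hypothetical, and plain word-count cosine similarity stands in for the learned embeddings a real on-device index would likely use.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def chunk(text, size=40):
    """Split text into consecutive chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Return up to k chunks with nonzero similarity to the query, best first."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(c))), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for score, c in scored[:k] if score > 0]


if __name__ == "__main__":
    passages = [
        "Photosynthesis converts light into chemical energy in plants.",
        "The French Revolution began in 1789.",
        "Mitochondria produce ATP in cells.",
    ]
    print(retrieve("How do plants use light?", passages, k=1))
```

Only the retrieved chunks, rather than the whole document, would then be placed in the prompt, which is how retrieval keeps long PDFs or textbooks within a model's context limit.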

  • 4.2 out of 5 (15 Ratings)

• Updated llama.cpp for Gemma 4 support, including fixes for previously known Gemma 4 issues
• Improved GGUF import reliability, with better detection for chat templates, JSON configs, and multimodal projector files
• Added a clearer download experience for CML models, including visible progress during downloads
• Improved smart retrieval so large-context models make better use of available context with PDFs and long documents
• Added a new Prompt Processing card in chat with live progress feedback
• Fixed prompt processing progress getting stuck at 0% and corrected its placement after tool calls
• Fixed scrolling issues in Model Settings caused by repeated memory fit checks
• Updated VRAM estimates and maximum context recommendations to reflect KV cache quantization changes
• Added support for Memory and system prompt customization
• Refreshed curated models with Gemma 4 and Qwen 3 1.7B support
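As a rough illustration of why KV cache quantization changes maximum context recommendations, the sketch below uses the standard transformer estimate: each token stores one key and one value vector per layer, so cache size grows linearly with context length. This is not Noema's estimator. The model dimensions and memory figures are hypothetical, and the per-block byte counts are the commonly cited llama.cpp block sizes for `f16`, `q8_0`, and `q4_0`, used here as approximations.

```python
# Bytes used to store one block of 32 cache elements (assumed approximations).
BLOCK_BYTES = {"f16": 64, "q8_0": 34, "q4_0": 18}


def kv_bytes_per_token(n_layers, n_kv_heads, head_dim, cache_type="f16"):
    """Bytes of KV cache one token occupies: keys plus values across all layers."""
    elements = 2 * n_layers * n_kv_heads * head_dim  # 2 = one K and one V vector
    return elements * BLOCK_BYTES[cache_type] // 32


def max_context(budget_bytes, weights_bytes, n_layers, n_kv_heads, head_dim,
                cache_type="f16"):
    """Largest token count whose KV cache fits in the memory left after weights."""
    free = budget_bytes - weights_bytes
    if free <= 0:
        return 0  # the weights alone already exceed the budget
    per_token = kv_bytes_per_token(n_layers, n_kv_heads, head_dim, cache_type)
    return free // per_token


if __name__ == "__main__":
    GIB = 1024 ** 3
    # Hypothetical model: 32 layers, 8 KV heads of dimension 128, 4 GiB of
    # weights, running under a 6 GiB memory budget.
    for cache_type in ("f16", "q8_0", "q4_0"):
        tokens = max_context(6 * GIB, 4 * GIB, 32, 8, 128, cache_type)
        print(cache_type, tokens)
```

Under these assumptions, a more compact cache format leaves room for a longer context in the same memory budget. The estimate ignores compute buffers and other runtime overhead, so a real advisor would need to reserve additional headroom.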

The developer, Alexandru Stamate, indicated that the app’s privacy practices may include handling of data as described below. For more information, see the developer’s privacy policy.

  • Data Not Collected

    The developer does not collect any data from this app.

    Privacy practices may vary, for example, based on the features you use or your age.

    The developer indicated that this app supports the following accessibility features.

    • Supported Features

      • VoiceOver

      • Larger Text

      • Dark Interface

      • Reduced Motion

    Seller
    • Alexandru Stamate
    Size
    • 111.5 MB
    Category
    • Productivity
    Compatibility
    • iPhone
      Requires iOS 18 or later.
    • iPad
      Requires iPadOS 18 or later.
    • Mac
      Requires macOS 26.0 or later.
    • Apple Vision
      Requires visionOS 26.0 or later.
    Languages
    • English, Arabic, French, German, Hindi, Japanese, Korean, Romanian, Simplified Chinese, Turkish
    Age Rating
    13+
    • Infrequent Medical Treatment Information
    Copyright
    • © 2026 Noema