NoemaAI

Offline AI with Datasets

Free

Now 100% free. Unlimited web search and every feature. Run bigger local models with FlashAttention + V-cache quantization. Benchmark speed. Cleaner UI. Latest llama.cpp.

Noema brings large‑language‑model intelligence to your iPhone and iPad, fully offline. Download lightweight models directly from Hugging Face and pair them with curated textbooks and your own PDFs. The privacy‑first design means your data never leaves your device.

- Open Textbook Library integration: Browse and import entire textbooks from OTL through the built‑in Explore view; Noema indexes them locally so you can search and retrieve relevant passages on demand.
- Bring your own data: Add personal documents in PDF or EPUB formats, which are embedded and indexed on‑device to power retrieval‑augmented generation.
- Integrated Hugging Face search: Discover and install any quantized model from the Hugging Face hub, no pre‑set list, with one‑tap installation, automatic dependency management and real‑time download progress.
- RAM check and model size helper: A built‑in advisor estimates each model’s memory footprint and shows a badge when it fits your device’s budget; it can also compute the maximum context length that fits in RAM.
- Triple‑backend support: Run models in GGUF, MLX or Liquid AI’s SLM format: a first for mobile LLM apps. This broad compatibility lets you choose between high‑performance quantized models, Apple‑optimised MLX models and Liquid AI’s lightweight SLMs.
- Low‑RAM, high‑knowledge advantage: Noema shifts knowledge into compact datasets rather than bloated weights, allowing bigger knowledge bases on low‑memory devices.
- Advanced settings for power users: Fine‑tune context length, quantization and GPU acceleration; enable tool‑calling for built‑in search and other functions; and customise model parameters for optimal performance.
- Built‑in tool calling and RAG: Use integrated search tools and retrieval‑augmented generation to query your data without hitting context limits.
- Private and offline: All processing happens locally, and your conversations and files never leave the device.
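
As a rough illustration of what a RAM check and context-length helper like the one described above might compute, here is a minimal Python sketch. It is not Noema's actual implementation: the function names, the simple KV-cache formula (one key and one value vector per layer, per KV head, per token) and every number in the example are assumptions chosen for illustration.

```python
GIB = 1024 ** 3


def kv_bytes_per_token(n_layers, n_kv_heads, head_dim, k_bytes=2.0, v_bytes=2.0):
    """Approximate KV-cache bytes added by one token of context.

    k_bytes / v_bytes are bytes per stored element (2.0 corresponds to 16-bit
    floats). Lower values model a quantized cache.
    """
    return n_layers * n_kv_heads * head_dim * (k_bytes + v_bytes)


def max_context(budget_bytes, weights_bytes, per_token_bytes, overhead_bytes=0):
    """Largest context length whose KV cache fits next to the model weights."""
    free = budget_bytes - weights_bytes - overhead_bytes
    if free <= 0:
        return 0  # the weights alone already exceed the budget
    return int(free // per_token_bytes)


def fits(budget_bytes, weights_bytes, per_token_bytes, context_len, overhead_bytes=0):
    """True if weights + overhead + KV cache for context_len stay within budget."""
    total = weights_bytes + overhead_bytes + per_token_bytes * context_len
    return total <= budget_bytes


if __name__ == "__main__":
    # Hypothetical model shape and device budget, not taken from any real model.
    per_token = kv_bytes_per_token(n_layers=32, n_kv_heads=8, head_dim=128)
    limit = max_context(
        budget_bytes=4 * GIB,
        weights_bytes=2 * GIB,
        per_token_bytes=per_token,
        overhead_bytes=GIB // 2,
    )
    print(limit)  # 12288 with these made-up numbers
```

A real advisor would also have to account for compute buffers and for whatever memory limit the operating system actually grants the app, which this sketch folds into a single `overhead_bytes` guess.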

Noema is now completely free: unlimited web search and all features are included. FlashAttention and V-cache quantization cut memory use so larger models fit on more devices. You can benchmark any model or optimization to compare prompt-processing and generation speed in tokens per second. Input fields and remote endpoint forms are redesigned for clarity, and our llama.cpp core is updated to the latest upstream.
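
To make the two benchmark figures concrete, the following Python sketch times a prompt-processing phase and a generation phase separately and reports each as tokens per second. It is only an assumption about how such a benchmark could be structured, not a description of Noema's code; `process_prompt` and `generate` are hypothetical callbacks standing in for a real model runtime.

```python
import time
from dataclasses import dataclass


@dataclass
class BenchResult:
    prompt_tps: float  # prompt tokens processed per second
    gen_tps: float     # new tokens generated per second


def tokens_per_second(n_tokens, seconds):
    """Throughput for a phase that handled n_tokens in the given wall time."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return n_tokens / seconds


def benchmark(process_prompt, generate, n_prompt, n_gen):
    """Time the two phases separately with a monotonic high-resolution clock."""
    t0 = time.perf_counter()
    process_prompt(n_prompt)   # hypothetical: ingest n_prompt prompt tokens
    t1 = time.perf_counter()
    generate(n_gen)            # hypothetical: produce n_gen new tokens
    t2 = time.perf_counter()
    return BenchResult(
        prompt_tps=tokens_per_second(n_prompt, t1 - t0),
        gen_tps=tokens_per_second(n_gen, t2 - t1),
    )


if __name__ == "__main__":
    # Stand-in workloads that just sleep; a real run would call the model.
    result = benchmark(
        process_prompt=lambda n: time.sleep(0.05),
        generate=lambda n: time.sleep(0.10),
        n_prompt=512,
        n_gen=128,
    )
    print(f"prompt: {result.prompt_tps:.1f} tok/s, generation: {result.gen_tps:.1f} tok/s")
```

Reporting the phases separately matters because prompt processing is batched while generation proceeds one token at a time, so an optimization can change one number much more than the other.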

The developer, Alexandru Stamate, indicated that the app’s privacy practices may include handling of data as described below. For more information, see the developer’s privacy policy.

  • Data Not Collected

    The developer does not collect any data from this app.

    Privacy practices may vary based, for example, on the features you use or your age.

    The developer has not yet indicated which accessibility features this app supports.

    • Provider
      • Alexandru Stamate
    • Size
      • 81 MB
    • Category
      • Productivity
    • Compatibility
      • iPhone
        Requires iOS 18.6 or later.
      • iPad
        Requires iPadOS 18.6 or later.
    • Languages
      • English
    • Age Rating
      • 13+
      • Infrequent
        Medical Treatment Information
    • Copyright
      • © 2025 Noema