WhisperShortcuts
Open Source
Only for Mac
$3.99
Key Features
- Accurate speech-to-text and dictation powered by Google Gemini
- AI assistant system-wide via voice or selected text
- Custom keyboard shortcuts for writing, coding, and productivity
Privacy & Security
- Use your own Google API key (stored in macOS Keychain); no key needed for offline Whisper
Offline Speech Recognition
- Offline transcription with local Whisper—dictate without an internet connection
Smart Improvement
- System prompts for dictation and AI modes improve automatically based on your usage
Meeting Transcription
- Live streaming: start "Transcribe Meeting" for a continuously updated transcript (calls, long sessions)
Requirements
- macOS 15.5 or later.
7.10 21h ago
• Reliability: Fixed a rare issue where recording could stay stuck on if you started a new recording immediately after the previous one ended.
• Live Meeting: Starting a meeting from a new chat after a previous one ended now attaches the recording to the current session.
• Google tools: Calendar and task actions show clearer confirmation details in chat when they succeed.
• Gmail: Improved stability when opening large search results.
• Google sign-in & Calendar: More robust handling of special characters in tokens, codes, and event IDs.
• Settings: “Reset all to defaults” now also clears chat sessions, meeting transcripts, and saved system prompts (API keys and Google tokens remain in Keychain). The confirmation text reflects this accurately.
7.9 6d ago
- Unified chat + live meeting: The chat window (formerly “Open Gemini”) is now the main surface; live meeting lives in the same place, with a session sidebar and related fixes.
- Deeper AI + Google in chat: Multi-provider (Gemini + Grok), tool/function calling, Google Calendar/Tasks, read-only Gmail, and many stability fixes.
- Streamlined settings: Renamed modes (“Dictate Prompt” etc.), per-mode system prompts, removed old modes (Prompt Read, chat read-aloud), and UI/privacy defaults (e.g. masked API keys, sidebar, pins, chat management).
7.3 Apr 27
Gemini model selection: Fixed the model picker not prefilling correctly on the first open after a cold launch.
6.7.1 Apr 4
- Feature Removal: Read Aloud, Prompt & Read, and Improve-from-voice have been removed to focus the app on Speech-to-Text, Speech-to-Prompt, and Gemini Chat.
- Gemini Chat Enhancements: Added a prompt queue for sequential processing, prefill from selection with visual chips, and a window toggle shortcut.
- Opt-in Controls: Screenshots in Prompt Mode and auto-paste for dictation are now strictly opt-in to improve privacy and reduce permission prompts.
- Reliability: Fixed crashes related to main-thread rules and focus-loss bugs, alongside general internal code cleanup.
6.7 Apr 2
- Gemini Chat: Attachments, paste handling, markdown tables, and session memory (/remember, /context); recent releases also add syntax-highlighted code blocks, LaTeX-style math, and inline images, with steadier layout and window behavior.
- Meeting mode: Split window, rolling summary, generated summaries, past meetings in one place, and refinements to controls, titles, and the transcript experience.
- Elsewhere: Screenshot context in prompt mode, Whisper glossary and model tuning for dictation/prompt/improvement, a central Stop control, no automatic app termination, optional close-Settings-on-focus-loss, and a fix for heavy disk writes that could get the app terminated by macOS.
6.6.3 Mar 22
- Gemini Chat offers a dedicated window with a global shortcut and persistent sessions as well as slash commands with autocomplete for efficient command execution.
- Grounding sources and citations are displayed inline in a FlowLayout and supplemented by an integrated screenshot function for context enrichment.
6.4.5 Mar 5
- Smart Improvement & AI: System prompts now improve automatically based on usage, powered by Gemini for more precise analysis and history management.
- User Context & History: Introduced configurable limits for user context and system prompts, plus tab-specific history for AI generation.
- UI/UX & Settings: Optimized prompt editor and direct access to interaction and transcript folders within the app.
- Live Meeting & Stability: New safeguards for recording duration, consistent timestamps, and improved transcription control, including a cancel shortcut.
6.0.0 Feb 20
- Introduced live meeting recording and transcription, including dedicated settings and direct folder access.
- Implemented a global rate limit with UI notifications for transcription and TTS services.
- Optimized the TTS workflow and simplified configuration by consolidating service logic.
5.3.6 Feb 5
Read Aloud and Prompt & Read now support an adjustable playback speed so you can listen faster or slower.
5.3.4 Feb 2
- Introduced conversation history for prompt modes to enable more coherent, context-aware AI interactions.
- Enhanced application reliability through optimized TTS processing, updated API endpoints, and improved audio validation.
- Added a configurable timeout for transcription history to prevent long-running requests from blocking the app.
5.3.3 Feb 1
- Added configurable recording safeguards to prevent unintended API usage
5.3.2 Jan 25
Improved TTS and transcription performance and user experience with intelligent chunking, progress notifications, and increased API concurrency.
5.3.1 Jan 6
Merged a PR that enables chunked transcription of long recordings via parallel processing. Thank you, Unity2Blender :)
5.3 Jan 5
- New Read Aloud Feature: Text-to-speech with voice settings.
- Enhanced Settings UI: Refactored interface, improved model messaging, streamlined UX.
- Improved TTS & Prompt Settings: Enhanced settings, new model support, UI improvements.
5.2.6 12/28/2025
Added the newest Gemini 3 Flash model.
5.2.5 12/18/2025
New Feature: Introducing "Read Selected Text," allowing you to select and listen to text from any application with customizable shortcuts, voice options, and playback controls.
Enhanced User Experience: Enjoy a streamlined interface, clearer menu terminology like "Prompt & Read," and improved user feedback.
Increased Reliability: Benefit from smart error handling and better troubleshooting for a smoother overall experience.
5.2.4 12/18/2025
Improved error messages with clear instructions for downloading missing offline models.
Higher code quality for a more stable and maintainable app.
Updated documentation and screenshots for new open-source features.
5.2.2 12/17/2025
Improved transcription reliability with automatic retries during network or service interruptions.
Large audio files are now handled more reliably with improved uploads and clearer error reporting.
5.2.1 12/15/2025
Added a retry button that appears when requests to the Google API fail.
5.2 12/14/2025
Added offline support using local Whisper speech-recognition models.
5.1 12/12/2025
- Removed the OpenAI integration, as the quality of the OpenAI APIs has degraded significantly.
- As a result, removed Voice Response Mode, which the new Gemini APIs did not support.
5 11/28/2025
Improved default system prompts.
4.1.1 11/26/2025
- Improved Gemini integration.
- Added an option to change the position of the notification window.
4.1 11/24/2025
- Added Google Gemini support, enabling Gemini models alongside OpenAI in all modes
- Removed the "Read Aloud" feature
4 11/22/2025
- Code quality and reliability improvements across core systems
Data Not Collected
The developer does not collect any data from this app.