Is RocketWhisper free to use?

Yes, you can try all features for free. After the trial period, a one-time license purchase (Personal: ¥4,800 / ~$32) is required. There are no monthly fees.

Does RocketWhisper work offline without internet?

Yes, RocketWhisper's speech recognition works completely offline. All audio processing happens locally on your Mac, and no audio data is ever sent to external servers.

Does it support Apple Silicon (M1/M2/M3/M4)?

Yes, RocketWhisper runs natively on Apple Silicon. It fully utilizes M1/M2/M3/M4 chip performance for fast speech recognition. It also works on Intel Macs.

Can it create subtitles from video files?

Yes, RocketWhisper can transcribe video files (MP4, MKV, AVI, MOV, WebM) and export subtitles in SRT and VTT formats.

How is it different from other transcription services?

RocketWhisper processes everything locally so your audio data never leaves your Mac, ensuring privacy. It’s also a one-time purchase with no recurring subscription fees.

RocketWhisper for Mac - AI Speech Recognition & Transcription

Why RocketWhisper?

What macOS Built-in Dictation Can't Do

RocketWhisper provides advanced features not available in macOS built-in dictation.

macOS Built-in Dictation

× No custom terminology dictionary
× No auto-correction rules for misrecognition
× No AI-powered text formatting
× Cannot edit selected text with AI
× Limited voice commands
× No per-app processing modes
× No punctuation control
× Cannot launch apps by voice
× No voice search
× May send audio to Apple's servers, depending on language and Mac model

RocketWhisper PRO

✓ Word Dictionary for company names, technical terms
✓ Auto-correction rules with regex support
✓ AI formatting with Apple Intelligence + GPT-4o / Claude / Gemini
✓ AI Commands to edit selected text by voice
✓ Voice commands like "new line", "delete"
✓ App-specific modes with auto-switching
✓ 7-stage auto punctuation engine
✓ Voice Launcher for apps & URLs
✓ Voice Search - "Search for..." triggers Google
✓ Apple SpeechAnalyzer for blazing-fast recognition (macOS 26+)
✓ 100% local speech recognition, no audio data sent

Features

19 Premium Features

Uncompromising features designed for professional demands.

✨

Apple SpeechAnalyzer v2.0 NEW

Native Apple speech recognition on macOS 26. No model download is needed, and it runs 2x faster than WhisperKit. Switch with one click in Settings. WhisperKit is still available on macOS 14-15.

🤖

Apple Intelligence LLM v2.0 NEW

On-device Apple Foundation Models (~3B parameters) on macOS 26. No API keys and no internet needed: AI text formatting works completely offline. Best for grammar fixes and light formatting. Cloud APIs are also available for advanced tasks.

🎤

High-Accuracy Recognition

High-accuracy recognition powered by the Neural Engine and Core ML. Choose from 4 WhisperKit models or Apple SpeechAnalyzer.

🔒

Fully Offline

All speech recognition is processed on-device. No internet required. Safely transcribe confidential information.

⌨

Global Shortcut

Default ⌥Space starts voice input from any app. Customizable. Right Option key also supported.

📚

Word Dictionary Exclusive

Register technical terms, company names, personal names, and acronyms to dramatically improve recognition. Not available in macOS built-in dictation.

✨

AI Text Formatting

Supports 6 providers: Apple Intelligence, OpenAI, Claude, Gemini, Groq, and Local LLM. Apple Intelligence requires no API key. Grammar correction, business style, summarization, and translation are applied automatically.

🤖

AI Commands

Select text, press ⌃⇧Space, and give voice instructions. "Make formal", "Translate to English", "Summarize" - AI instantly edits selected text.

💡

Custom Instructions NEW

Pre-assign AI processing to dedicated shortcuts. Press the shortcut → speak → press again for instant translation, summarization, grammar fixes, and more. 4 presets are included, and you can add up to 20 custom instructions.

🔍

Voice Search

"Search for...", "What is...", "Look up..." - 10 voice command patterns instantly trigger Google search.

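To illustrate how a spoken trigger phrase could be turned into a Google search, here is a minimal Python sketch. The three patterns are the examples quoted above. The function name `to_search_url` and the matching logic are assumptions made for illustration and are not taken from RocketWhisper itself.

```python
import re
from urllib.parse import quote_plus

# Trigger phrases from the examples above (the app lists 10 patterns in total).
SEARCH_PATTERNS = [
    re.compile(r"^search for\s+(.+)$", re.IGNORECASE),
    re.compile(r"^what is\s+(.+)$", re.IGNORECASE),
    re.compile(r"^look up\s+(.+)$", re.IGNORECASE),
]

def to_search_url(utterance):
    """Return a Google search URL if the utterance starts with a trigger, else None."""
    # Drop surrounding whitespace and any trailing sentence punctuation.
    text = utterance.strip().rstrip(".?!")
    for pattern in SEARCH_PATTERNS:
        match = pattern.match(text)
        if match:
            return "https://www.google.com/search?q=" + quote_plus(match.group(1))
    return None  # Not a search command: treat it as ordinary dictation.
```

An utterance that matches no pattern returns `None`, so ordinary dictation would pass through untouched.
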
🚀

Voice Launcher

Launch apps or open URLs by voice. Say a keyword to instantly access your favorite tools.

💻

App-Specific Modes

Automatically apply different processing settings per app. Punctuation for editors, casual style for chat.

💬

Voice Commands

Edit text hands-free with voice commands like "new line", "paragraph", "delete". 7 built-in commands.

✎

Auto Punctuation

7-stage punctuation rules optimized for natural text output, automatically inserting commas, periods, and question marks.

🛠

Auto-Correction Rules

Regex-supported correction rules for automatic misrecognition fixes. 27 hallucination filters built-in.
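
For readers unfamiliar with regex-based correction, the following Python sketch shows the general idea. Both the rules and the single hallucination filter are hypothetical examples, not RocketWhisper's built-in set. The filter targets a phrase that Whisper-family models are known to emit on silent audio.

```python
import re

# Hypothetical correction rules: (regex pattern, replacement).
CORRECTION_RULES = [
    (r"\brocket whisper\b", "RocketWhisper"),
    (r"\bcore ml\b", "Core ML"),
    (r"(\d+)\s+percent\b", r"\1%"),
]

# Hypothetical hallucination filter for a phrase often produced on silence.
HALLUCINATION_FILTERS = [r"^thanks? (you )?for watching[.!]?$"]

def correct(text):
    """Drop known hallucinated lines, then apply each correction rule in order."""
    result = text.strip()
    for pattern in HALLUCINATION_FILTERS:
        if re.match(pattern, result, flags=re.IGNORECASE):
            return ""
    for pattern, replacement in CORRECTION_RULES:
        result = re.sub(pattern, replacement, result, flags=re.IGNORECASE)
    return result
```

Rules are applied in list order, so a later rule sees the output of earlier ones.
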

🎶

Floating Waveform Indicator NEW

Mini equalizer-style waveform bar during recording. Draggable, always-on-top for visual confirmation.

⏬

Right Option Hold Mode NEW

Record while holding Right Option, auto-stop on release. Push-to-Talk style for intuitive voice input.

🌐

Fn Key Push-to-Talk NEW

Push-to-Talk with Fn key (🌐). Double-tap to toggle continuous recording. Same feel as Wispr Flow or macOS Dictation.

📂

Batch Processing

Batch transcribe multiple audio files. Drag & drop to add, export as TXT, SRT, or VTT format.
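
As a rough sketch of what SRT and VTT export involves, the Python below renders timed segments into both formats. The `(start, end, text)` segment shape and the function names are assumptions made for illustration. RocketWhisper's internal data model is not documented here.

```python
def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"

def to_srt(segments):
    """Render (start, end, text) segments as the body of an SRT file."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        timing = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    return "\n".join(blocks)  # A blank line separates consecutive cues.

def to_vtt(segments):
    """Render the same segments as WebVTT, which uses '.' before milliseconds."""
    lines = ["WEBVTT", ""]
    for start, end, text in segments:
        cue = f"{srt_timestamp(start)} --> {srt_timestamp(end)}".replace(",", ".")
        lines += [cue, text, ""]
    return "\n".join(lines)
```

The two formats differ mainly in the `WEBVTT` header, the numbered cues in SRT, and the decimal separator in timestamps.
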

Processing Pipeline

6-Stage Text Processing Pipeline

Intelligent processing flow that automatically transforms recognized text into high-quality output.

Stage 0

🚀 Launcher

Launch apps by keyword

→

Stage 0.5

🔍 Voice Search

"Search for..." triggers Google

→

Stage 1

💬 Voice Commands

Detect new line, delete, etc.

→

Stage 2

📚 Dictionary & Correction

Term replacement & fixes

→

Stage 3

✎ Punctuation

7 rules for natural punctuation

→

Stage 4

✨ AI Formatting

LLM polishes the text
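
The stage order above can be sketched in Python as a chain of small functions. Everything inside each stage is a simplified stand-in: the keyword table, the dictionary entry, the single punctuation rule, and the capitalization step that replaces a real LLM call are all hypothetical. The sketch also assumes that Stages 0 and 0.5 short-circuit the pipeline when they match, which this page does not state explicitly.

```python
def launcher(text):
    # Stage 0: hypothetical keyword table mapping spoken phrases to app names.
    apps = {"open mail": "Mail", "open safari": "Safari"}
    app = apps.get(text.lower())
    return ("launch", app) if app else None

def voice_search(text):
    # Stage 0.5: "Search for ..." becomes a search action.
    prefix = "search for "
    if text.lower().startswith(prefix):
        return ("search", text[len(prefix):])
    return None

def voice_commands(text):
    # Stage 1: replace a spoken "new line" with an actual line break.
    return text.replace(" new line ", "\n")

def dictionary(text):
    # Stage 2: term replacement from a user dictionary (hypothetical entry).
    return text.replace("rocket whisper", "RocketWhisper")

def punctuation(text):
    # Stage 3: placeholder rule that ends the text with a period if needed.
    return text if text.endswith((".", "?", "!")) else text + "."

def ai_formatting(text):
    # Stage 4: stand-in for an LLM call; it only capitalizes the first letter.
    return text[:1].upper() + text[1:]

def process(text):
    """Run the stages in order; Stages 0 and 0.5 return an action and stop."""
    for action_stage in (launcher, voice_search):
        action = action_stage(text)
        if action:
            return action
    for text_stage in (voice_commands, dictionary, punctuation, ai_formatting):
        text = text_stage(text)
    return ("insert", text)
```

Running action stages first means a launch or search phrase is never typed into the active app as text.
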

AI Integration

Integrated with 5 AI Providers

Choose the optimal AI based on your use case, budget, and privacy requirements. These 5 providers are available in addition to on-device Apple Intelligence.

OpenAI

GPT-4o
GPT-4o mini
GPT-4 Turbo

Claude

Sonnet 4.5
Haiku 4.5
Opus 4.5

Gemini

2.5 Pro
2.5 Flash
2.0 Flash

Groq

Llama 3.3 70B
Llama 3.1 8B
Ultra-fast, free tier

Local LLM

LM Studio
Ollama
Fully private

Built-in Templates

💼 Business

🙌 Casual

📑 Summary

🌐 Translation

🔧 Grammar Fix

✏ Custom

Whisper Models

4 AI Models

Select the optimal model based on speed and accuracy balance. All run on-device.

Model	Size	Speed	Use Case
Small	500 MB	⚡⚡⚡⚡	Real-time input
Medium	1.5 GB	⚡⚡⚡	Balanced
Large V3 Turbo (Recommended)	1.6 GB	⚡⚡⚡	High accuracy & speed
Large V3	3.0 GB	⚡⚡	Maximum accuracy

* For Japanese speech recognition, Large V3 Turbo or higher is recommended. Small/Medium may have reduced accuracy for kanji and katakana words.
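
One way to read the table and note above is as a simple selection rule. The Python sketch below picks the most accurate model that fits in the available disk space. The sizes come from the table. The function `pick_model` and the accuracy ordering (Large V3 first, Small last) are assumptions inferred from the "Use Case" column.

```python
# Model names and sizes (GB) from the table, in assumed order of accuracy.
MODELS = [
    ("Large V3", 3.0),
    ("Large V3 Turbo", 1.6),
    ("Medium", 1.5),
    ("Small", 0.5),
]

def pick_model(free_disk_gb, japanese=False):
    """Return the most accurate model that fits on disk, or None if none fits."""
    for name, size_gb in MODELS:
        if japanese and name in ("Medium", "Small"):
            # The note above recommends Large V3 Turbo or higher for Japanese.
            continue
        if size_gb <= free_disk_gb:
            return name
    return None
```

This ignores speed entirely. For real-time input on slower hardware, the smaller models may still be the better choice.
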

Specifications

System Requirements

💻 System Requirements

macOS 14.0 Sonoma or later
Apple Silicon recommended (M1 / M2 / M3 / M4)
RAM 8GB or more (16GB recommended)
Storage 200MB + models (up to 3GB)
Microphone input

🎤 Input/Output

Input: Microphone (real-time recording) / Audio and video files (batch processing)
Output: Direct text input / Clipboard / TXT, SRT, VTT export
Trigger: Shortcut (⌥Space) / Right Option / Fn key (tap & Push-to-Talk)
AI Commands (⌃⇧Space) for selected text editing
Auto-switching based on app detection

🌐 Supported Languages

Japanese (primary target)
English / Chinese / Korean
French / German / Spanish
Portuguese / Italian / Russian

🔐 Privacy

100% local speech recognition
No external transmission of audio data
AI formatting uses your API keys directly
App Sandbox + Hardened Runtime

Frequently Asked Questions

Is RocketWhisper free?

All features are available for free during the trial period. After that, a one-time license purchase (Personal: ¥4,800 / ~$32) is required. No monthly subscription fees ever.

Does it work without internet?

Yes, completely offline. All audio data is processed locally on your Mac and is never sent to any external server.

How accurate is the recognition?

RocketWhisper's recommended model is OpenAI Whisper large-v3-turbo, which balances high accuracy with speed. 73 hallucination countermeasures and custom vocabulary support help keep results usable in practice.

Does it support Apple Silicon?

Yes, runs natively on Apple Silicon (M1/M2/M3/M4) for optimal performance. Also works on Intel Macs.

Can I transcribe meetings?

Yes, both in real time and from recorded audio/video files via batch processing. AI processing can automatically summarize and reformat the text.

Can I create video subtitles?

Yes, transcribes video files (MP4, MKV, AVI, MOV, WebM) and exports subtitles in SRT and VTT formats.

Voice to Text. Instant. Accurate. Fully Offline.