RocketWhisper is distributed as an AppImage. No installation is needed -- just download and run.
# 1. Grant execute permission to the downloaded AppImage
chmod +x RocketWhisper-1.2.0-aarch64.AppImage
# 2. Run it
./RocketWhisper-1.2.0-aarch64.AppImage
AppImages require FUSE to run. If FUSE is not installed on your system, either install it or extract the AppImage and run it directly:
# Install FUSE
sudo apt install fuse libfuse2
# Or extract and run
./RocketWhisper-1.2.0-aarch64.AppImage --appimage-extract
./squashfs-root/AppRun
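The two options above can be combined into a small launcher. The following is a minimal sketch, not part of RocketWhisper: the function names `has_fuse` and `run_rocketwhisper` are hypothetical, and checking for `/dev/fuse` plus a `fusermount` binary is only a heuristic assumption about whether FUSE is usable.

```shell
#!/bin/sh
# Heuristic FUSE check (an assumption, not the AppImage runtime's own test):
# the FUSE device node exists and the fusermount helper is on PATH.
has_fuse() {
    [ -e /dev/fuse ] && command -v fusermount >/dev/null 2>&1
}

# Run the AppImage directly, or extract it first when FUSE seems missing.
run_rocketwhisper() {
    appimage=$1
    if has_fuse; then
        "$appimage"
    else
        # --appimage-extract unpacks the contents into ./squashfs-root
        "$appimage" --appimage-extract >/dev/null && ./squashfs-root/AppRun
    fi
}

# Usage (not executed here):
#   run_rocketwhisper ./RocketWhisper-1.2.0-aarch64.AppImage
```

Extraction leaves a `squashfs-root` directory in the current working directory, so run the launcher from a location where that is acceptable.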
RocketWhisper depends on the following packages (ffmpeg is optional, see the table below):
# Debian / Ubuntu
sudo apt install pulseaudio-utils xdotool xclip ffmpeg
# Fedora
sudo dnf install pulseaudio-utils xdotool xclip ffmpeg
# Arch Linux
sudo pacman -S pulseaudio xdotool xclip ffmpeg
| Package | Purpose | Required |
|---|---|---|
| pulseaudio-utils | Microphone recording (parec command) | Required |
| xdotool | Keyboard automation / window detection | Required |
| xclip | Clipboard access | Required |
| ffmpeg | Audio file conversion | Optional |
which parec xdotool xclip
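The `which` check above prints only the commands it finds. As a slightly more explicit sketch, the loop below reports which of the required commands are missing; the helper name `missing_cmds` is hypothetical.

```shell
#!/bin/sh
# Print the name of every command in the argument list that is not on PATH.
missing_cmds() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# parec, xdotool and xclip are the required tools; ffmpeg is optional.
missing=$(missing_cmds parec xdotool xclip)
if [ -n "$missing" ]; then
    # $missing is intentionally unquoted so the names print on one line.
    echo "Missing required commands:" $missing
else
    echo "All required commands found."
fi
```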
On first launch, RocketWhisper will automatically download the Whisper model.
Models are stored in ~/.local/share/RocketWhisper/Models/.
| Function | Hotkey | Description |
|---|---|---|
| Start/Stop Recording | F8 | Press to talk. Press again to recognize. |
| Cancel | Escape | Cancel recording |
| AI Command | Ctrl + Shift + Space | AI processing on selected text |
Go to Settings (gear icon) → "Hotkeys" tab to change hotkeys.
Click the text box and press the desired key combination to set it.
| Model | Size | Accuracy | Speed | Recommended RAM |
|---|---|---|---|---|
| small | 466MB | Medium | Normal | 8GB |
| medium | 1.5GB | High | Somewhat slow | 8GB |
| large-v3-turbo | 1.6GB | High | Fast | 8GB |
| large-v3 | 2.9GB | Highest | Slow | 16GB |
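As a rough aid for reading the table, the sketch below maps installed RAM (in whole GB) to a model name. The selection policy and the `suggest_model` helper are assumptions for illustration only; RocketWhisper does not necessarily choose models this way, and the table gives no recommendation for machines below 8 GB.

```shell
#!/bin/sh
# Map RAM in whole GB to a model from the table (illustrative policy only).
suggest_model() {
    ram_gb=$1
    if [ "$ram_gb" -ge 16 ]; then
        echo "large-v3"
    elif [ "$ram_gb" -ge 8 ]; then
        echo "large-v3-turbo"
    else
        echo "small (below the recommended 8GB)"
    fi
}

# On Linux, MemTotal is reported in kB; round up to whole GB because the
# kernel reports slightly less than the nominal installed amount.
if [ -r /proc/meminfo ]; then
    ram_gb=$(awk '/^MemTotal:/ {printf "%d", ($2 + 1048575) / 1048576}' /proc/meminfo)
    echo "Detected about ${ram_gb} GB RAM, suggested model: $(suggest_model "$ram_gb")"
fi
```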
Transcribe multiple audio and video files at once. Video files are automatically processed by extracting audio with FFmpeg.
FFmpeg must be installed to process video files: sudo apt install ffmpeg
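To illustrate what extracting audio from a video with FFmpeg can look like, here is a minimal sketch. The 16 kHz mono WAV target and the helper names `wav_name_for` and `extract_audio` are assumptions; the exact conversion settings RocketWhisper uses internally are not documented here.

```shell
#!/bin/sh
# Derive the output name: same path with the extension replaced by .wav.
# (Assumes the file name itself contains an extension.)
wav_name_for() {
    printf '%s\n' "${1%.*}.wav"
}

# Extract the audio track of one video file as 16 kHz mono 16-bit PCM WAV.
# -vn drops the video stream; -ar and -ac set sample rate and channel count.
extract_audio() {
    ffmpeg -nostdin -y -i "$1" -vn -ar 16000 -ac 1 -c:a pcm_s16le "$(wav_name_for "$1")"
}

# Usage (not executed here):
#   for f in *.mp4 *.mkv; do [ -e "$f" ] && extract_audio "$f"; done
```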
Process recognized text with any AI prompt you define. Set up custom instructions for meeting minutes, translation, summarization, and more.
All recognition results are automatically saved and can be searched, copied, and reused at any time.
Click the "History" button in the main window to open the recognition history window.
Give voice instructions to perform AI processing on selected text.
AI Command Mode is triggered with the AI Command hotkey (default: Ctrl + Shift + Space).
To use AI Command Mode, you need to configure an AI provider in Settings.
When specific phrases are recognized, a browser search is automatically performed.
Enable or disable this feature in Settings → "Voice Launcher & Search" tab.
Launch applications by speaking specific keywords. Applications are specified by their executable path (e.g. /usr/bin/gnome-terminal).
Automatically apply different settings for each application.
Some features are limited under Wayland.
- ydotool required, may need root privileges
- wl-clipboard required

sudo apt install ydotool wl-clipboard
echo $XDG_SESSION_TYPE
# "x11" means X11 environment
# "wayland" means Wayland environment
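The session check can be turned into a small script that names the helper tools each session type needs, based on the packages listed in this document. The function name `tools_for_session` is hypothetical.

```shell
#!/bin/sh
# Map a session type to the helper tools this document lists for it.
tools_for_session() {
    case "$1" in
        x11)     echo "xdotool xclip" ;;
        wayland) echo "ydotool wl-clipboard" ;;
        *)       echo "unknown" ;;
    esac
}

# XDG_SESSION_TYPE may be unset, for example over SSH or in a container.
echo "Session: ${XDG_SESSION_TYPE:-unset}, helper tools: $(tools_for_session "${XDG_SESSION_TYPE:-}")"
```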
CUDA acceleration is automatically enabled on systems with NVIDIA GPUs.
# Check if NVIDIA driver is recognized
nvidia-smi
# Check CUDA version
nvcc --version
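The driver check can be wrapped in a function for use in a script. This is only a sketch: `gpu_status` is a hypothetical helper, and whether RocketWhisper detects GPUs the same way is not stated in this document.

```shell
#!/bin/sh
# Report whether the NVIDIA driver responds. nvidia-smi exits non-zero
# when it cannot communicate with the driver.
gpu_status() {
    if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-gpu-detected"
    else
        echo "no-nvidia-gpu"
    fi
}

gpu_status
```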
# List PulseAudio sources (input devices)
pactl list sources short
# Recording test
parec --device=0 --rate=16000 --channels=1 --format=s16le | head -c 160000 > test.raw
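The `head -c 160000` in the recording test limits the capture by size rather than by time. The sketch below shows how that byte count relates to duration for 16-bit samples, and how the raw file could be played back to verify it. The helper names `raw_seconds` and `play_raw` are hypothetical.

```shell
#!/bin/sh
# Duration in whole seconds of raw s16le audio:
# bytes / (sample rate * channels * 2 bytes per sample).
raw_seconds() {
    bytes=$1; rate=$2; channels=$3
    echo $(( bytes / (rate * channels * 2) ))
}

raw_seconds 160000 16000 1   # prints 5

# Play back a raw capture with the same parameters used for recording.
play_raw() {
    paplay --raw --rate=16000 --channels=1 --format=s16le "$1"
}

# Usage (not executed here):
#   play_raw test.raw
```

If playback is silent, the selected source is likely the wrong device; pick another one from the `pactl list sources short` output.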
# Test xdotool
xdotool getactivewindow
# Check if running in X11 session
echo $XDG_SESSION_TYPE
# Install FUSE
sudo apt install fuse libfuse2
# Or extract and run
./RocketWhisper-*.AppImage --appimage-extract
./squashfs-root/AppRun
# Install CJK fonts
sudo apt install fonts-noto-cjk
~/.config/RocketWhisper/
├── settings.json # App settings
├── modes.json # Processing mode settings
├── mappings.json # Per-app mappings
├── voice_launcher.json # Voice launcher settings
└── correction_rules.json # Correction rules
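Because all settings live in one directory, they can be backed up as a single archive. The following is a minimal sketch; the `backup_config` helper and the default archive name are hypothetical, not something RocketWhisper provides.

```shell
#!/bin/sh
# Archive ~/.config/RocketWhisper into a tar.gz file and print its path.
# The optional first argument overrides the destination file.
backup_config() {
    src="$HOME/.config/RocketWhisper"
    dest=${1:-"$HOME/RocketWhisper-config-backup.tar.gz"}
    if [ ! -d "$src" ]; then
        echo "No config directory at $src" >&2
        return 1
    fi
    # -C keeps paths in the archive relative to ~/.config
    tar -czf "$dest" -C "$HOME/.config" RocketWhisper && echo "$dest"
}

# Usage (not executed here):
#   backup_config
#   backup_config /path/to/backup.tar.gz
```

To restore, extract the archive with `tar -xzf` using `-C ~/.config` while RocketWhisper is not running.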