📖 Help

📦 1. Installation

Running the AppImage

RocketWhisper is distributed as an AppImage. No installation is needed: download the file, make it executable, and run it.

# 1. Grant execute permission to the downloaded AppImage
chmod +x RocketWhisper-1.2.0-aarch64.AppImage

# 2. Run it
./RocketWhisper-1.2.0-aarch64.AppImage

Environments Without FUSE

AppImages rely on FUSE to mount themselves at startup. If FUSE is not installed on your system, either install it or extract the AppImage and run the extracted copy:

# Install FUSE
sudo apt install fuse libfuse2

# Or extract and run
./RocketWhisper-1.2.0-aarch64.AppImage --appimage-extract
./squashfs-root/AppRun
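
The choice between the two approaches can be scripted. The following is a minimal sketch, not part of RocketWhisper. It treats the presence of /dev/fuse as a rough signal that FUSE is usable, which is an assumption (the libfuse2 library must also be installed), and the function name launch_mode is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: decide how to start the AppImage.
# Assumption: an existing FUSE device node means FUSE is usable.
launch_mode() {
    fuse_dev="${1:-/dev/fuse}"
    if [ -e "$fuse_dev" ]; then
        echo "fuse"      # run the AppImage directly
    else
        echo "extract"   # unpack with --appimage-extract, then run AppRun
    fi
}

# Usage (adjust the file name to the version you downloaded):
#   if [ "$(launch_mode)" = "fuse" ]; then
#       ./RocketWhisper-1.2.0-aarch64.AppImage
#   else
#       ./RocketWhisper-1.2.0-aarch64.AppImage --appimage-extract
#       ./squashfs-root/AppRun
#   fi
```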

📋 2. Required Packages

RocketWhisper relies on the following packages. ffmpeg is optional (needed only for converting audio and video files); the others are required:

Ubuntu / Debian / Linux Mint / Pop!_OS

sudo apt install pulseaudio-utils xdotool xclip ffmpeg

Fedora

sudo dnf install pulseaudio-utils xdotool xclip ffmpeg

Arch Linux

sudo pacman -S pulseaudio xdotool xclip ffmpeg

Package Details

| Package | Purpose | Required |
|---|---|---|
| pulseaudio-utils | Microphone recording (parec command) | Required |
| xdotool | Keyboard automation / window detection | Required |
| xclip | Clipboard access | Required |
| ffmpeg | Audio file conversion | Optional |

Verify installation: which parec xdotool xclip
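
To see exactly which tools are missing rather than scanning the output of which, a small loop over command -v can help. This is a sketch; the function name missing_tools is hypothetical.

```shell
#!/bin/sh
# Print the name of every listed command that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Usage: check the three required tools plus the optional ffmpeg.
#   missing_tools parec xdotool xclip ffmpeg
```

An empty result means every listed tool was found.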

🚀 3. First Launch

On first launch, RocketWhisper asks you to choose a Whisper model and then downloads it automatically.

  1. Run the AppImage
  2. The initial setup screen appears
  3. Select a model (recommended: large-v3-turbo)
  4. Wait for the model download to complete
  5. Once the download finishes, the main window appears

Models are saved to ~/.local/share/RocketWhisper/Models/.

⌨️ 4. Hotkeys

Default Hotkeys

| Function | Hotkey | Description |
|---|---|---|
| Start/Stop Recording | F8 | Press once to start recording; press again to stop and transcribe |
| Cancel | Escape | Cancel the current recording |
| AI Command | Ctrl + Shift + Space | Run AI processing on the selected text |

Changing Hotkeys

Go to Settings (gear icon) → "Hotkeys" tab to change hotkeys.

Click the text box and press the desired key combination to set it.

🧠 5. Whisper Models

| Model | Size | Accuracy | Speed | Recommended RAM |
|---|---|---|---|---|
| small | 466MB | Medium | Normal | 8GB |
| medium | 1.5GB | High | Somewhat slow | 8GB |
| large-v3-turbo | 1.6GB | High | Fast | 8GB |
| large-v3 | 2.9GB | Highest | Slow | 16GB |

Recommended: large-v3-turbo offers the best balance of accuracy and speed for most environments.
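
To compare your machine against the recommended RAM figures listed for each model, something like the following sketch can be used. The function names are hypothetical, and the RAM check reads /proc/meminfo, so it assumes Linux.

```shell
#!/bin/sh
# Recommended RAM in GB per model, copied from the model list in this section.
recommended_ram_gb() {
    case "$1" in
        small|medium|large-v3-turbo) echo 8 ;;
        large-v3)                    echo 16 ;;
        *) echo "unknown model: $1" >&2; return 1 ;;
    esac
}

# Installed RAM in GB, rounded up because the kernel reports slightly
# less than the physical total (MemTotal is given in kB).
installed_ram_gb() {
    awk '/^MemTotal:/ { print int(($2 + 1048575) / 1048576) }' /proc/meminfo
}

# Usage:
#   need=$(recommended_ram_gb large-v3)
#   have=$(installed_ram_gb)
#   [ "$have" -ge "$need" ] || echo "large-v3 recommends ${need}GB, found ${have}GB"
```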

📁 6. Batch Processing (Video Support)

Transcribe multiple audio and video files at once. For video files, the audio track is extracted automatically with FFmpeg before transcription.

Supported Formats

How to Use

  1. Click the "Batch Processing" button in the main window
  2. Add files (drag and drop supported)
  3. Select output format (Text / SRT subtitles / VTT subtitles)
  4. Click "Start Processing"

Tip: FFmpeg is required for video file transcription. On Ubuntu/Debian, install it with: sudo apt install ffmpeg
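
If you want to check by hand that FFmpeg can read a video's audio track, a manual extraction along these lines may help. This is a sketch under assumptions: the 16 kHz mono 16-bit PCM output is chosen for illustration, and the settings RocketWhisper itself passes to FFmpeg are not documented here. The function names are hypothetical.

```shell
#!/bin/sh
# Derive the .wav output name from an input file name.
wav_name_for() {
    echo "${1%.*}.wav"
}

# Extract the audio track of a video as 16 kHz mono 16-bit PCM.
# Assumption: illustrative format, not necessarily what the app uses.
extract_audio() {
    ffmpeg -nostdin -i "$1" -vn -ac 1 -ar 16000 -c:a pcm_s16le "$(wav_name_for "$1")"
}

# Usage:
#   extract_audio lecture.mp4    # writes lecture.wav
```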

7. Custom Instructions

Process recognized text with any AI prompt you define. Set up custom instructions for meeting minutes, translation, summarization, and more.

How to Use

  1. Configure an AI provider in Settings → "AI Processing" tab
  2. Enter your custom prompt in the "Custom Instructions" field (e.g., "Format as meeting minutes")
  3. Press the Custom Instructions hotkey to record; when you stop, the recognized text is sent to the AI for processing

Use Cases

Tip: Use a local LLM (Ollama, etc.) for completely offline and free custom instructions.

📜 8. Recognition History

All recognition results are automatically saved and can be searched, copied, and reused at any time.

Features

How to Use

Click the "History" button in the main window to open the recognition history window.

🤖 9. AI Command Mode

Give voice instructions to perform AI processing on selected text.

How to Use

  1. Select text in any application
  2. Press the AI Command hotkey (Ctrl + Shift + Space)
  3. Speak your instruction (e.g., "Translate to Japanese", "Summarize this")
  4. Press the hotkey again to execute
  5. Results are displayed in the RocketWhisper window

Setting Up an AI Provider

AI Command Mode requires an API key from an AI provider. Configure the provider in Settings → "AI Processing" tab.

🚀 11. Voice Launcher

Launch applications by speaking specific keywords.

Setup

  1. Go to Settings → "Voice Launcher & Search" tab
  2. Click the "Add" button
  3. Enter a keyword (e.g., "open terminal")
  4. Enter the executable path (e.g., /usr/bin/gnome-terminal)
  5. Click Save

You can also use the "Browse..." button to select a file.
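
Before saving an entry, it can be worth confirming that the path really points to an executable. A minimal sketch (the function name is_launchable is hypothetical):

```shell
#!/bin/sh
# Succeeds only if the path names an existing, executable regular file.
is_launchable() {
    [ -f "$1" ] && [ -x "$1" ]
}

# Usage:
#   is_launchable /usr/bin/gnome-terminal || echo "not executable"
#   command -v gnome-terminal    # prints the full path if it is on PATH
```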

🎯 12. Per-App Processing Modes

Automatically apply different settings for each application.

Preset Modes

App Mapping

  1. Go to Settings → "Per-App Processing" tab
  2. Click "Get Current App" (have the target app in the foreground)
  3. Select the mode to use
  4. Click Save
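
To see from a terminal which application owns the focused window, xdotool can be combined with /proc. This is only a sketch: how RocketWhisper identifies applications internally is not stated in this guide, the function names are hypothetical, and the approach assumes an X11 session on Linux.

```shell
#!/bin/sh
# Name of the process with the given PID, read from /proc (Linux only).
app_name_from_pid() {
    cat "/proc/$1/comm"
}

# Process name of the focused window's owner (X11 only, needs xdotool).
current_app() {
    pid=$(xdotool getactivewindow getwindowpid) || return 1
    app_name_from_pid "$pid"
}

# Usage: run "sleep 3; current_app" and switch to the target window
# during the delay.
```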

⚠️ 13. Wayland Environment

Some features are limited under Wayland.

Limitations

Additional Packages

sudo apt install ydotool wl-clipboard

For full functionality, select an "Xorg" or "X11" session on the login screen.

Checking Your Session

echo $XDG_SESSION_TYPE
# "x11" means X11 environment
# "wayland" means Wayland environment
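
The session type can also be mapped to the helper packages this guide lists for it. A sketch, with the hypothetical function name session_packages:

```shell
#!/bin/sh
# Map a session type to the helper packages this guide lists for it.
session_packages() {
    case "$1" in
        x11)     echo "xdotool xclip" ;;
        wayland) echo "ydotool wl-clipboard" ;;
        *)       echo "unknown session type: ${1:-unset}" >&2; return 1 ;;
    esac
}

# Usage:
#   session_packages "$XDG_SESSION_TYPE"
```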

🎮 14. CUDA/GPU Setup

CUDA acceleration is automatically enabled on systems with NVIDIA GPUs.

Supported GPUs

Verification

# Check if NVIDIA driver is recognized
nvidia-smi

# Check CUDA toolkit version (nvcc ships with the CUDA toolkit)
nvcc --version

CUDA 12.0 or later is required. If no GPU is available, the app automatically falls back to CPU.
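
The version requirement can be checked in a script. The sketch below assumes that nvcc prints a line containing "release MAJOR.MINOR", and both function names are hypothetical.

```shell
#!/bin/sh
# Extract "MAJOR.MINOR" from nvcc --version output read on stdin.
# Assumption: the output contains a line with "release MAJOR.MINOR".
parse_nvcc_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Succeeds if the given MAJOR.MINOR version is 12.0 or later.
cuda_version_ok() {
    major="${1%%.*}"
    [ "$major" -ge 12 ]
}

# Usage:
#   ver=$(nvcc --version | parse_nvcc_release)
#   cuda_version_ok "$ver" && echo "CUDA $ver is new enough"
```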

🔧 15. Troubleshooting

Microphone Not Detected

# List PulseAudio sources (input devices)
pactl list sources short

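# Recording test: capture about 5 seconds of 16 kHz mono 16-bit audio
# (replace 0 with the index or name of your microphone from the list above)
parec --device=0 --rate=16000 --channels=1 --format=s16le | head -c 160000 > test.raw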
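
The length of the test recording follows from the byte count: raw 16-bit PCM uses 2 bytes per sample per channel. A sketch of the arithmetic, with the hypothetical function name raw_seconds, plus a playback command for listening to the result:

```shell
#!/bin/sh
# Duration in whole seconds of raw 16-bit PCM: bytes / (rate * channels * 2).
raw_seconds() {
    echo $(( $1 / ($2 * $3 * 2) ))
}

# Usage:
#   raw_seconds 160000 16000 1    # prints 5
#   pacat --rate=16000 --channels=1 --format=s16le test.raw    # play it back
```

If playback is silent, the selected source is probably not your microphone.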

Hotkey Not Responding

# Test xdotool
xdotool getactivewindow

# Check if running in X11 session
echo $XDG_SESSION_TYPE

AppImage Won't Start

# Install FUSE
sudo apt install fuse libfuse2

# Or extract and run
./RocketWhisper-*.AppImage --appimage-extract
./squashfs-root/AppRun

Garbled Characters

# Install CJK fonts
sudo apt install fonts-noto-cjk

Configuration File Locations

~/.config/RocketWhisper/
├── settings.json # App settings
├── modes.json # Processing mode settings
├── mappings.json # Per-app mappings
├── voice_launcher.json # Voice launcher settings
└── correction_rules.json # Correction rules
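
Before editing these files by hand, it may be worth keeping a copy. A minimal sketch that copies the directory to a timestamped sibling; the function name backup_config and the .bak- suffix are hypothetical.

```shell
#!/bin/sh
# Copy a configuration directory to a dated sibling and print the new path.
backup_config() {
    src="${1:-$HOME/.config/RocketWhisper}"
    dest="${src}.bak-$(date +%Y%m%d-%H%M%S)"
    cp -R "$src" "$dest" && echo "$dest"
}

# Usage:
#   backup_config    # copies ~/.config/RocketWhisper
```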