Troubleshooting

This guide covers common issues you might encounter when using or developing Voicebox, along with solutions.

Installation Issues

macOS: "App is damaged and can't be opened"

This occurs because the app isn't signed with an Apple Developer certificate.

Solution:

# Clear extended attributes, including the quarantine flag
xattr -cr /Applications/Voicebox.app

Windows: SmartScreen Warning

Windows SmartScreen may warn that the app is unrecognized.

Solution:

  • Click "More info"
  • Click "Run anyway"

This is expected for unsigned applications. We're working on code signing for future releases.

Linux: AppImage Won't Run

Solution:

chmod +x voicebox-*.AppImage
./voicebox-*.AppImage

Server Issues

Backend Server Won't Start

Symptoms:

  • Red status indicator in the bottom-left corner
  • "Failed to connect to server" error

Solutions:

Port Already in Use

Check if port 17493 is already in use:

# macOS/Linux
lsof -i :17493

# Windows
powershell -Command "Get-NetTCPConnection -LocalPort 17493 -State Listen"

Kill the process using the port:

# macOS/Linux
kill -9 <pid>

# Windows
taskkill /PID <pid> /F
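As a cross-platform alternative to the commands above, here is a minimal sketch that tests whether anything is accepting connections on the port. The function name `port_in_use` is hypothetical and not part of Voicebox. It only probes the IPv4 loopback address, which is an assumption about where the server listens.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    print("Port 17493 in use:", port_in_use(17493))
```

This only tells you that the port is taken. You still need `lsof` or `Get-NetTCPConnection` to find which process owns it.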

Permission Issues

The server binary might not have execute permissions:

# macOS
chmod +x ~/Library/Application\ Support/sh.voicebox.app/backend/voicebox-server

Check Logs

View server logs for errors:

macOS:

tail -f ~/Library/Application\ Support/sh.voicebox.app/logs/server.log

Windows:

type "%APPDATA%\sh.voicebox.app\logs\server.log"
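Server logs can be long, so it may help to pull out only the recent lines that look like problems. The sketch below is one way to do that. The keyword list is an assumption, because the actual log format isn't documented here, and `recent_problem_lines` is a hypothetical helper name.

```python
from collections import deque

# Assumed markers; adjust them to whatever the real log format uses.
KEYWORDS = ("error", "traceback", "exception")


def recent_problem_lines(log_path, tail=500, keywords=KEYWORDS):
    """Return lines among the last `tail` log lines that contain a keyword."""
    with open(log_path, "r", encoding="utf-8", errors="replace") as fh:
        last_lines = deque(fh, maxlen=tail)  # keeps only the final `tail` lines
    return [
        line.rstrip("\n")
        for line in last_lines
        if any(key in line.lower() for key in keywords)
    ]
```

Pass it the `server.log` path for your platform from the commands above.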

"flash-attn is not installed" Warning in Server Logs

Symptoms:

Warning: flash-attn is not installed. Will only run the manual PyTorch version.
Please install flash-attn for faster inference.

This is harmless. The warning is emitted by our transformer-based engines (Chatterbox / Qwen) on every startup. FlashAttention is an optional acceleration library. When it isn't present, PyTorch's built-in scaled-dot-product attention (SDPA) runs instead, and its throughput is close to FlashAttention-2 (FA2) on modern GPUs. Generation works normally.

Why it shows up on every platform:

  • Windows: flash-attn has no official Windows support. The upstream project (Dao-AILab/flash-attention) still only says it might work, and source builds typically fail on recent CUDA/MSVC combinations.
  • macOS (Apple Silicon): FlashAttention is CUDA-only and doesn't apply here at all. MLX has its own optimized attention kernels.
  • Linux: It's not pinned in our requirements because installing it is fragile and version-sensitive; users who want it install it themselves.

Solutions (all optional):

Ignore it (recommended)

PyTorch SDPA is what actually runs the model, and on Ampere/Ada/Hopper GPUs it's within a few percent of FA2 for our workloads. You won't notice a meaningful speed difference.
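To confirm which case applies to you, the following sketch checks whether `flash_attn` (the import name of the flash-attn package) can be found in the current Python environment without importing it. Run it with the same interpreter the backend uses. `is_importable` is a hypothetical helper, not something Voicebox provides.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if the module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    for name in ("torch", "flash_attn"):
        state = "installed" if is_importable(name) else "not installed"
        print(f"{name}: {state}")
```

If `flash_attn` shows as not installed, the warning is expected and can be ignored.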

Install flash-attn on Linux

pip install flash-attn --no-build-isolation

This requires a matching CUDA toolkit, and the build can take 20+ minutes.

Install flash-attn on Windows (community wheels)

Official builds don't exist, but community maintainers publish prebuilt wheels. Pick the wheel matching your exact CUDA + PyTorch + Python combination. Example:

pip install https://github.com/kingbri1/flash-attention/releases/download/v2.8.3/flash_attn-2.8.3+cu128torch2.8.0cxx11abiFALSE-cp312-cp312-win_amd64.whl

Alternatively, run Voicebox's backend inside WSL2 and use the standard Linux wheels.
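When choosing a community wheel, you can read the CUDA, PyTorch, and Python tags out of the filename and compare them with your environment. The sketch below does that. The pattern is inferred only from the single example filename above, so other builds may be named differently. `parse_wheel_tags` and `local_python_tag` are hypothetical helpers.

```python
import re
import sys

# Pattern inferred from the one example filename; treat it as an assumption.
WHEEL_RE = re.compile(
    r"\+cu(?P<cuda>\d+)torch(?P<torch>[\d.]+)cxx11abi(?:TRUE|FALSE)"
    r"-(?P<python>cp\d+)-cp\d+-(?P<platform>\w+)\.whl$"
)


def parse_wheel_tags(filename: str) -> dict:
    """Extract the cuda, torch, python and platform tags from a wheel name."""
    match = WHEEL_RE.search(filename)
    if match is None:
        raise ValueError(f"unrecognized wheel name: {filename}")
    return match.groupdict()


def local_python_tag() -> str:
    """Return the CPython tag of the running interpreter, e.g. cp312."""
    return f"cp{sys.version_info.major}{sys.version_info.minor}"
```

Compare `parse_wheel_tags(name)["python"]` with `local_python_tag()`. In the backend environment, check the other two tags against `torch.__version__` and `torch.version.cuda`.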

Connection Timeout

Symptoms:

  • Long loading times
  • "Connection timeout" errors

Solution:

  • Restart the app
  • Check your firewall settings
  • Ensure localhost is accessible

Generation Issues

First Generation is Very Slow

Symptoms:

  • First generation takes 2-5 minutes
  • Progress indicator stuck at "Loading model..."

Explanation: This is expected behavior. The first generation downloads the selected TTS engine's model and initializes it. Sizes range from about 300 MB (LuxTTS) to 8 GB (TADA 3B).

Solution:

  • Wait for the initial download to complete (progress is shown in Settings → Models)
  • Subsequent generations reuse the cached model and are much faster
  • Check your internet connection
  • For low-bandwidth setups, start with Kokoro (350 MB) or LuxTTS (300 MB)

Poor Voice Quality

Symptoms:

  • Robotic or unnatural voice
  • Missing emotion or prosody
  • Pronunciation errors

Solutions:

Improve Voice Samples

  • Use 10-30 seconds of clear audio
  • Avoid background noise
  • Ensure consistent speaking tone
  • Add multiple samples from the same speaker
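The 10-30 second guideline above can be checked before you import a sample. This is a minimal sketch that assumes the sample is an uncompressed PCM WAV file, since Python's standard `wave` module cannot read MP3 or other compressed formats. Both function names are hypothetical.

```python
import wave


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def sample_length_ok(path: str, min_s: float = 10.0, max_s: float = 30.0) -> bool:
    """Return True if the file's duration is within the recommended range."""
    return min_s <= wav_duration_seconds(path) <= max_s
```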

Match Speaking Style

The generated voice will mimic the tone and style of your samples. If your sample is monotone, the generation will be too.

Adjust Text Formatting

  • Use proper punctuation
  • Add commas for natural pauses
  • Capitalize proper nouns

Generation Fails with "Out of Memory"

Symptoms:

  • Generation crashes
  • "CUDA out of memory" or "RuntimeError: out of memory"

Solutions:

Free GPU Memory

Close other GPU-intensive applications:

  • Games
  • Video editors
  • Multiple browser tabs with WebGL

Then restart Voicebox.

Use CPU Mode

If your GPU doesn't have enough VRAM (6 GB or more is needed), use CPU mode:

Settings → Generation → Use CPU instead of GPU

CPU generation is 5-10x slower but uses system RAM instead of VRAM.

Reduce Batch Size

For long text, split it into smaller chunks instead of generating all at once.
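One way to split long text is at sentence boundaries, packing sentences into chunks up to a size limit. The sketch below illustrates the idea. The 400-character default is an arbitrary assumption, not a documented Voicebox limit, and `split_text` is a hypothetical helper.

```python
import re


def split_text(text: str, max_chars: int = 400) -> list:
    """Group whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept whole rather than cut.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        if current and len(current) + 1 + len(sentence) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}" if current else sentence
    if current:
        chunks.append(current)
    return chunks
```

Generate each chunk separately and join the resulting audio afterwards.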

MLX "Failed to load the default metallib" (Apple Silicon)

Symptoms:

  • Generation fails with "library not found" or "metallib" errors
  • Server logs reference missing Metal shader libraries

Solutions:

Rebuild the Server Binary

just build-server

The build script bundles MLX Metal shader libraries on Apple Silicon automatically.

Reinstall MLX Dependencies

pip install -r backend/requirements-mlx.txt

Verify Backend Detection

Check Settings → Server Status. On Apple Silicon it should show Backend: MLX. If it shows Backend: PYTORCH, MLX isn't installed correctly.

Audio Issues

No Audio Playback

Symptoms:

  • Generated audio won't play
  • Playback button doesn't respond

Solutions:

  • Check system audio settings
  • Ensure audio output device is connected
  • Try exporting and playing in a media player

Crackling or Distorted Audio

Symptoms:

  • Audio has static or distortion
  • Clipping sounds

Solutions:

  • Check if your input samples have distortion
  • Reduce playback volume
  • Re-generate with cleaner voice samples
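To check an input sample for clipping, you can measure how many of its samples sit at full scale. The sketch below does this for 16-bit PCM WAV files only, which is an assumption about your sample format. `clipped_fraction` is a hypothetical helper.

```python
import array
import sys
import wave


def clipped_fraction(path: str) -> float:
    """Return the fraction of samples at full scale in a 16-bit PCM WAV."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    if not samples:
        return 0.0
    clipped = sum(1 for s in samples if s >= 32767 or s <= -32768)
    return clipped / len(samples)
```

A value clearly above zero suggests the recording was clipped at the source, so re-recording at a lower input level is likely to help more than regenerating.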

Development Issues

Backend Won't Start in Dev Mode

Symptoms:

  • just dev-backend or just dev fails
  • Import errors or module not found

Solutions:

Python Version

Ensure you are running Python 3.11 or higher:

python --version

If not, install Python 3.11+ and recreate the virtual environment.

Virtual Environment

Ensure the virtual environment is activated:

# macOS/Linux
source backend/venv/bin/activate

# Windows
backend\venv\Scripts\activate

You should see (venv) in your prompt.
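The interpreter version and the virtual environment can also be verified from inside Python, which rules out a prompt that shows (venv) while a different `python` is on your PATH. This is a minimal sketch with hypothetical helper names.

```python
import sys

MIN_PYTHON = (3, 11)


def python_ok(version=None, minimum=MIN_PYTHON) -> bool:
    """Return True if the (major, minor) version meets the minimum."""
    current = tuple(version if version is not None else sys.version_info)[:2]
    return current >= minimum


def in_virtualenv() -> bool:
    # Inside a venv, sys.prefix points at the venv while sys.base_prefix
    # points at the interpreter the venv was created from.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Python version OK:", python_ok())
    print("Running inside a venv:", in_virtualenv())
    print("Interpreter path:", sys.executable)
```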

Dependencies

Reinstall dependencies — easiest via just:

just setup

Or manually:

cd backend
pip install -r requirements.txt
pip install --no-deps chatterbox-tts
pip install --no-deps hume-tada
pip install git+https://github.com/QwenLM/Qwen3-TTS.git

Tauri Build Fails

Symptoms:

  • bun run tauri build fails
  • Rust compilation errors

Solutions:

# Clean build artifacts
cd tauri/src-tauri
cargo clean

# Update Rust
rustup update

# Try building again
cd ../..
bun run tauri build

OpenAPI Client Generation Fails

Symptoms:

  • ./scripts/generate-api.sh fails
  • "Failed to fetch schema" error

Solutions:

Ensure Backend is Running

curl http://localhost:17493/openapi.json

This should return JSON. If it doesn't, start the backend.
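If `curl` isn't available, the same check can be sketched with Python's standard library. It assumes the backend serves an OpenAPI 3 document, which has a top-level "openapi" key. `schema_reachable` is a hypothetical helper name.

```python
import http.client
import json
import urllib.error
import urllib.request


def schema_reachable(url: str = "http://localhost:17493/openapi.json",
                     timeout: float = 3.0) -> bool:
    """Return True if the URL serves a JSON object with an "openapi" key."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            doc = json.loads(resp.read().decode("utf-8"))
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Connection refused, timeout, HTTP error, or a non-JSON body
        return False
    return isinstance(doc, dict) and "openapi" in doc


if __name__ == "__main__":
    print("Schema reachable:", schema_reachable())
```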

Check Port

Ensure nothing else is using port 17493.

Regenerate Manually

cd backend
source venv/bin/activate
uvicorn main:app --reload --port 17493

# In another terminal
./scripts/generate-api.sh

Database Issues

"Database is locked" Error

Symptoms:

  • Profile or generation operations fail
  • SQLite lock errors

Solutions:

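  • Close all Voicebox instances
  • If the error persists, delete the leftover SQLite sidecar files (-shm and -wal). Do this only after every instance is closed: the -wal file can hold recent changes that haven't been written to the main database yet, so removing it may discard them.
  # macOS
  rm ~/Library/Application\ Support/sh.voicebox.app/data/voicebox.db-shm
  rm ~/Library/Application\ Support/sh.voicebox.app/data/voicebox.db-wal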

Corrupted Database

Symptoms:

  • App crashes on launch
  • Data missing or corrupted

Solutions:

Delete the database file. Warning: this deletes all your voice profiles and generation history, so export important profiles first if possible.

# macOS
rm ~/Library/Application\ Support/sh.voicebox.app/data/voicebox.db

# Windows
del "%APPDATA%\sh.voicebox.app\data\voicebox.db"

Restart the app to create a fresh database.
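Before running the delete commands above, it may be worth keeping a copy of the file and asking SQLite whether it really is corrupted. The sketch below copies the database aside and runs `PRAGMA integrity_check` on the copy. `backup_and_check` is a hypothetical helper. The copy does not include any -wal file, so very recent changes may be missing from it.

```python
import shutil
import sqlite3
import time
from pathlib import Path


def backup_and_check(db_path: str):
    """Copy the database aside, then run an integrity check on the copy.

    Returns (backup_path, result), where result is "ok" for a healthy file.
    """
    src = Path(db_path)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup = src.with_name(f"{src.name}.{stamp}.bak")
    shutil.copy2(src, backup)
    try:
        conn = sqlite3.connect(backup)
        try:
            result = conn.execute("PRAGMA integrity_check").fetchone()[0]
        finally:
            conn.close()
    except sqlite3.DatabaseError as exc:
        result = f"unreadable: {exc}"
    return backup, result
```

Close Voicebox first so the file isn't being written while it is copied. If the result is "ok", the crash probably has another cause and deleting the database is unlikely to help.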

Model Issues

Model Download Fails

Symptoms:

  • "Failed to download model" error
  • Stuck at "Downloading..."

Solutions:

  • Check your internet connection
  • Check HuggingFace Hub status
  • Try using a VPN if HuggingFace is blocked in your region
  • Manually download the model with the Hugging Face CLI, which saves it into the cache directory automatically:
  pip install huggingface_hub
  huggingface-cli download Qwen/Qwen3-TTS-12Hz-1.7B-Base

Wrong Model Version

Symptoms:

  • Generation quality suddenly degraded
  • Different voice output

Solutions: Clear the model cache and re-download. Replace the Qwen* glob with the engine org prefix for other engines (ResembleAI* for Chatterbox, HumeAI* for TADA, hexgrad* for Kokoro, etc.) or use DELETE /models/{name} via the API.

# macOS / Linux
rm -rf ~/.cache/huggingface/hub/models--Qwen*

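# Windows (Command Prompt; rmdir does not expand wildcards, so loop over the matches)
for /d %d in ("%USERPROFILE%\.cache\huggingface\hub\models--Qwen*") do rmdir /s /q "%d"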
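Before clearing the cache, you can list which model directories a prefix matches and how much space they use. The sketch below assumes the default cache location shown in the commands above; a custom location (for example via the `HF_HOME` environment variable) is not handled. `cached_models` is a hypothetical helper.

```python
from pathlib import Path


def cached_models(prefix: str = "Qwen", hub_dir=None) -> list:
    """Return (directory name, size in bytes) for cached models of an org."""
    hub = Path(hub_dir) if hub_dir else Path.home() / ".cache" / "huggingface" / "hub"
    results = []
    for model_dir in sorted(hub.glob(f"models--{prefix}*")):
        if not model_dir.is_dir():
            continue
        # Skip symlinks so files linked from snapshots are not counted twice
        size = sum(
            f.stat().st_size
            for f in model_dir.rglob("*")
            if f.is_file() and not f.is_symlink()
        )
        results.append((model_dir.name, size))
    return results


if __name__ == "__main__":
    for name, size in cached_models("Qwen"):
        print(f"{name}: {size / 1e9:.2f} GB")
```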

Performance Issues

Slow Generation on GPU

Symptoms:

  • Generation slower than expected
  • GPU not being utilized

Solutions:

Verify CUDA Installation

nvidia-smi

This should list your GPU. If it doesn't, install the NVIDIA drivers.

Check GPU Selection

If you have multiple GPUs, ensure Voicebox is using the right one.

Settings → Generation → GPU Device

Update GPU Drivers

Outdated drivers can cause performance issues. Update to the latest NVIDIA drivers.

Apple Silicon: Confirm MLX Backend

Check Settings → Server Status. On Apple Silicon it should show Backend: MLX, which is 4–5× faster than PyTorch here. If it shows Backend: PYTORCH, reinstall MLX:

pip install -r backend/requirements-mlx.txt

GPU availability should read "Metal (Apple Silicon via MLX)".

High Memory Usage

Symptoms:

  • App uses excessive RAM
  • System becomes sluggish

Solutions:

  • Close unused voice profiles
  • Clear generation history
  • Restart the app periodically

Update Issues

"Update Check Failed"

Solutions:

  • Confirm your internet connection — updates are fetched from GitHub releases.
  • Ensure github.com is accessible and not blocked by a firewall or proxy.
  • As a fallback, download the latest release from GitHub and install manually.

"Invalid Signature" Error

Solutions:

  • Re-download the installer — the signature may have been corrupted in transit.
  • Verify the .sig file matches the installer; if it doesn't, file an issue.

Remote Mode Issues

Can't Connect to Remote Server

Symptoms:

  • "Connection refused" error
  • Remote server not found

Solutions:

Check Server Status

Ensure the remote server is running:

curl http://<server-ip>:17493/health

Check Firewall

Ensure port 17493 is open on the remote server:

# Allow port on Ubuntu/Debian
sudo ufw allow 17493

Verify Network

  • Ensure both machines are on the same network (for local servers)
  • Use IP address instead of hostname
  • Try pinging the server: ping <server-ip>
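The checks above can be narrowed down by looking at how the connection fails. A refused connection usually means the host is reachable but nothing is listening, while a timeout often points to a firewall dropping packets. This is a minimal sketch. `probe` is a hypothetical helper and the interpretation of each outcome is a rule of thumb, not a guarantee.

```python
import socket


def probe(host: str, port: int = 17493, timeout: float = 3.0) -> str:
    """Classify a TCP connection attempt to host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"      # host reachable, nothing listening on the port
    except socket.timeout:
        return "timeout"      # often a firewall silently dropping packets
    except OSError:
        return "unreachable"  # DNS failure, no route to host, and similar


if __name__ == "__main__":
    print(probe("127.0.0.1"))
```

Replace `127.0.0.1` with your server's IP address when running it from the client machine.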

Still Having Issues?

If you're still experiencing problems:

  1. Check GitHub Issues: github.com/jamiepine/voicebox/issues
  2. Open a New Issue: Provide:
    • Operating system and version
    • Voicebox version
    • Steps to reproduce
    • Error messages or logs
  3. Join Discord: discord.gg/voicebox (coming soon)

Diagnostic Information

When reporting issues, include this information:

# Voicebox version
# Check Help → About in the app

# Operating system
uname -a  # macOS/Linux
systeminfo  # Windows

# Python version (for dev issues)
python --version

# GPU info (if generation issues)
nvidia-smi  # NVIDIA GPUs
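If you'd rather collect most of this in one step, here is a sketch that works on all three platforms. It does not read the Voicebox version, which still has to come from Help → About. `collect_diagnostics` is a hypothetical helper, and the GPU query only covers NVIDIA cards.

```python
import platform
import shutil
import subprocess


def collect_diagnostics() -> dict:
    """Gather OS, Python, and NVIDIA GPU details for a bug report."""
    info = {"os": platform.platform(), "python": platform.python_version()}
    smi = shutil.which("nvidia-smi")
    if smi is None:
        info["gpu"] = "nvidia-smi not found"
        return info
    try:
        proc = subprocess.run(
            [smi, "--query-gpu=name,driver_version", "--format=csv,noheader"],
            capture_output=True, text=True, timeout=10,
        )
        info["gpu"] = proc.stdout.strip() or "nvidia-smi reported no GPUs"
    except (OSError, subprocess.TimeoutExpired):
        info["gpu"] = "nvidia-smi failed to run"
    return info


if __name__ == "__main__":
    for key, value in collect_diagnostics().items():
        print(f"{key}: {value}")
```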