This guide covers common issues you might encounter when using or developing Voicebox, along with solutions.
Installation Issues
macOS: "App is damaged and can't be opened"
This occurs because the app isn't signed with an Apple Developer certificate.
Solution:
# Remove the quarantine attribute
xattr -cr /Applications/Voicebox.app
Windows: SmartScreen Warning
Windows SmartScreen may warn that the app is unrecognized.
Solution:
- Click "More info"
- Click "Run anyway"
Linux: AppImage Won't Run
Solution:
chmod +x voicebox-*.AppImage
./voicebox-*.AppImage
Server Issues
Backend Server Won't Start
Symptoms:
- Red status indicator in bottom-left corner
- "Failed to connect to server" error
Solutions:
Port Already in Use
Check if port 17493 is already in use:
# macOS/Linux
lsof -i :17493
# Windows
powershell -Command "Get-NetTCPConnection -LocalPort 17493 -State Listen"
Kill the process using the port:
# macOS/Linux
kill -9 <pid>
# Windows
taskkill /PID <pid> /F
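The two steps above can be combined into a small helper (a sketch for macOS/Linux; assumes `lsof` is installed — the function name is illustrative):

```shell
# Kill whatever process is listening on a given port (macOS/Linux, lsof required).
kill_port() {
  pids=$(lsof -ti :"$1" 2>/dev/null)
  if [ -n "$pids" ]; then
    kill -9 $pids
    echo "killed: $pids"
  else
    echo "nothing listening on port $1"
  fi
}
# Usage: kill_port 17493
```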
Permission Issues
The server binary might not have execute permissions:
# macOS/Linux
chmod +x ~/Library/Application\ Support/sh.voicebox.app/backend/voicebox-server
Check Logs
View server logs for errors:
macOS:
tail -f ~/Library/Application\ Support/sh.voicebox.app/logs/server.log
Windows:
type %APPDATA%\sh.voicebox.app\logs\server.log
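To surface just the failures, filter the log for error-like lines (shown with the macOS path; substitute `%APPDATA%\sh.voicebox.app\logs\server.log` on Windows):

```shell
# Show the most recent error-like lines from the server log.
LOG="$HOME/Library/Application Support/sh.voicebox.app/logs/server.log"
if [ -f "$LOG" ]; then
  grep -iE 'error|exception|traceback' "$LOG" | tail -n 20
fi
```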
"flash-attn is not installed" Warning in Server Logs
Symptoms:
Warning: flash-attn is not installed. Will only run the manual PyTorch version.
Please install flash-attn for faster inference.
This is harmless. The warning is emitted by our transformer-based engines (Chatterbox / Qwen) on every startup. FlashAttention is an optional acceleration library — when it's not present, PyTorch's built-in scaled-dot-product attention (SDPA) runs instead, which is near-FA2 throughput on modern GPUs. Generation works normally.
Why it shows up on every platform:
- Windows: flash-attn has no official Windows support. The upstream project (Dao-AILab/flash-attention) still only says it might work, and source builds typically fail on recent CUDA/MSVC combinations.
- macOS (Apple Silicon): FlashAttention is CUDA-only and doesn't apply here at all. MLX has its own optimized attention kernels.
- Linux: It's not pinned in our requirements because installing it is fragile and version-sensitive; users who want it install it themselves.
Solutions (all optional):
Ignore it (recommended)
PyTorch SDPA is what actually runs the model, and on Ampere/Ada/Hopper GPUs it's within a few percent of FA2 for our workloads. You won't notice a meaningful speed difference.
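If you want to confirm which path you're on, you can test whether flash-attn is importable from the backend's Python environment (assumes `python3` is on PATH):

```shell
# Report whether flash-attn can be imported; if not, PyTorch SDPA is used.
if python3 -c 'import flash_attn' 2>/dev/null; then
  echo "flash-attn available"
else
  echo "flash-attn not installed: falling back to PyTorch SDPA"
fi
```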
Install flash-attn on Linux
pip install flash-attn --no-build-isolation
Requires a matching CUDA toolkit. Build can take 20+ minutes.
Install flash-attn on Windows (community wheels)
Official builds don't exist, but community maintainers publish prebuilt wheels:
Pick the wheel matching your exact CUDA + PyTorch + Python combination. Example:
pip install https://github.com/kingbri1/flash-attention/releases/download/v2.8.3/flash_attn-2.8.3+cu128torch2.8.0cxx11abiFALSE-cp312-cp312-win_amd64.whl
Alternatively, run Voicebox's backend inside WSL2 and use the standard Linux wheels.
Connection Timeout
Symptoms:
- Long loading times
- "Connection timeout" errors
Solution:
- Restart the app
- Check your firewall settings
- Ensure localhost is accessible
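A simple retry loop can distinguish a slow start from a dead server (a sketch; assumes `curl` is installed):

```shell
# Retry a command up to N times with a 1-second pause between attempts.
retry() {
  n=$1; shift
  i=0
  until "$@"; do
    i=$((i+1))
    if [ "$i" -ge "$n" ]; then return 1; fi
    sleep 1
  done
}
# Usage: retry 10 curl -fsS http://127.0.0.1:17493/health
```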
Generation Issues
First Generation is Very Slow
Symptoms:
- First generation takes 2-5 minutes
- Progress indicator stuck at "Loading model..."
Explanation: This is expected behavior. The first generation downloads the selected TTS engine's model and initializes it. Sizes range from 350 MB (Kokoro) to 8 GB (TADA 3B).
Solution:
- Wait for the initial download to complete (progress is shown in Settings → Models)
- Subsequent generations reuse the cached model and are much faster
- Check your internet connection
- For low-bandwidth setups, start with Kokoro (350 MB) or LuxTTS (300 MB)
Poor Voice Quality
Symptoms:
- Robotic or unnatural voice
- Missing emotion or prosody
- Pronunciation errors
Solutions:
Improve Voice Samples
- Use 10-30 seconds of clear audio
- Avoid background noise
- Ensure consistent speaking tone
- Add multiple samples from the same speaker
Match Speaking Style
The generated voice will mimic the tone and style of your samples. If your sample is monotone, the generation will be too.
Adjust Text Formatting
- Use proper punctuation
- Add commas for natural pauses
- Capitalize proper nouns
Generation Fails with "Out of Memory"
Symptoms:
- Generation crashes
- "CUDA out of memory" or "RuntimeError: out of memory"
Solutions:
Free GPU Memory
Close other GPU-intensive applications:
- Games
- Video editors
- Multiple browser tabs with WebGL
Then restart Voicebox.
Use CPU Mode
If your GPU doesn't have enough VRAM (6 GB or more is needed), use CPU mode:
Settings → Generation → Use CPU instead of GPU
Reduce Batch Size
For long text, split it into smaller chunks instead of generating all at once.
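A rough sketch of command-line chunking: `fold` breaks text at word boundaries, so each resulting line can be generated as its own piece (the 500-character limit and function name are illustrative):

```shell
# Split stdin into chunks of at most 500 characters at word boundaries,
# one chunk per line, so each line can be generated separately.
chunk_text() { fold -s -w 500; }
# Usage: chunk_text < script.txt > chunks.txt
```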
MLX "Failed to load the default metallib" (Apple Silicon)
Symptoms:
- Generation fails with "library not found" or "metallib" errors
- Server logs reference missing Metal shader libraries
Solutions:
Rebuild the Server Binary
just build-server
The build script bundles MLX Metal shader libraries on Apple Silicon automatically.
Reinstall MLX Dependencies
pip install -r backend/requirements-mlx.txt
Verify Backend Detection
Check Settings → Server Status. Should show Backend: MLX on Apple Silicon. If it shows Backend: PYTORCH, MLX isn't installed correctly.
Audio Issues
No Audio Playback
Symptoms:
- Generated audio won't play
- Playback button doesn't respond
Solutions:
- Check system audio settings
- Ensure audio output device is connected
- Try exporting and playing in a media player
Crackling or Distorted Audio
Symptoms:
- Audio has static or distortion
- Clipping sounds
Solutions:
- Check if your input samples have distortion
- Reduce playback volume
- Re-generate with cleaner voice samples
Development Issues
Backend Won't Start in Dev Mode
Symptoms:
- just dev-backend or just dev fails
- Import errors or module not found
Solutions:
Python Version
Ensure Python 3.11 or higher:
python --version
If not, install Python 3.11+ and recreate the virtual environment.
Virtual Environment
Ensure venv is activated:
# macOS/Linux
source backend/venv/bin/activate
# Windows
backend\venv\Scripts\activate
You should see (venv) in your prompt.
Dependencies
Reinstall dependencies — easiest via just:
just setup
Or manually:
cd backend
pip install -r requirements.txt
pip install --no-deps chatterbox-tts
pip install --no-deps hume-tada
pip install git+https://github.com/QwenLM/Qwen3-TTS.git
Tauri Build Fails
Symptoms:
- bun run tauri build fails
- Rust compilation errors
Solutions:
# Clean build artifacts
cd tauri/src-tauri
cargo clean
# Update Rust
rustup update
# Try building again
cd ../..
bun run tauri build
OpenAPI Client Generation Fails
Symptoms:
- ./scripts/generate-api.sh fails
- "Failed to fetch schema" error
Solutions:
Ensure Backend is Running
curl http://localhost:17493/openapi.json
Should return JSON. If not, start the backend.
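To verify the response is actually parseable JSON rather than an error page, pipe it through a JSON parser (assumes `python3` is available; the helper name is illustrative):

```shell
# Succeeds only when stdin is valid JSON.
is_json() { python3 -m json.tool > /dev/null 2>&1; }
# Usage: curl -s http://localhost:17493/openapi.json | is_json && echo "schema OK"
```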
Check Port
Ensure nothing else is using port 17493 (see "Port Already in Use" above).
Regenerate Manually
cd backend
source venv/bin/activate
uvicorn main:app --reload --port 17493
# In another terminal
./scripts/generate-api.sh
Database Issues
"Database is locked" Error
Symptoms:
- Profile or generation operations fail
- SQLite lock errors
Solutions:
- Close all Voicebox instances
- Delete the lock file:
# macOS
rm ~/Library/Application\ Support/sh.voicebox.app/data/voicebox.db-shm
rm ~/Library/Application\ Support/sh.voicebox.app/data/voicebox.db-wal
# Windows
del %APPDATA%\sh.voicebox.app\data\voicebox.db-shm
del %APPDATA%\sh.voicebox.app\data\voicebox.db-wal
Corrupted Database
Symptoms:
- App crashes on launch
- Data missing or corrupted
Solutions:
# macOS
rm ~/Library/Application\ Support/sh.voicebox.app/data/voicebox.db
# Windows
del %APPDATA%\sh.voicebox.app\data\voicebox.db
Restart the app to create a fresh database.
Model Issues
Model Download Fails
Symptoms:
- "Failed to download model" error
- Stuck at "Downloading..."
Solutions:
- Check your internet connection
- Check HuggingFace Hub status
- Try using a VPN if HuggingFace is blocked in your region
- Manually download via the HuggingFace CLI and place in the cache directory:
pip install huggingface_hub
huggingface-cli download Qwen/Qwen3-TTS-12Hz-1.7B-Base
Wrong Model Version
Symptoms:
- Generation quality suddenly degraded
- Different voice output
Solutions:
Clear the model cache and re-download. Replace the Qwen* glob with the engine org prefix for other engines (ResembleAI* for Chatterbox, HumeAI* for TADA, hexgrad* for Kokoro, etc.) or use DELETE /models/{name} via the API.
# macOS / Linux
rm -rf ~/.cache/huggingface/hub/models--Qwen*
# Windows (PowerShell; cmd's rmdir doesn't expand wildcards)
powershell -Command "Remove-Item -Recurse -Force $env:USERPROFILE\.cache\huggingface\hub\models--Qwen*"
Performance Issues
Slow Generation on GPU
Symptoms:
- Generation slower than expected
- GPU not being utilized
Solutions:
Verify CUDA Installation
nvidia-smi
Should show your GPU. If not, install CUDA drivers.
Check GPU Selection
If you have multiple GPUs, ensure Voicebox is using the right one.
Settings → Generation → GPU Device
Update GPU Drivers
Outdated drivers can cause performance issues. Update to the latest NVIDIA drivers.
Apple Silicon: Confirm MLX Backend
Check Settings → Server Status. Should show Backend: MLX on Apple Silicon — MLX is 4–5× faster than PyTorch here. If it shows Backend: PYTORCH, reinstall MLX:
pip install -r backend/requirements-mlx.txt
GPU availability should read "Metal (Apple Silicon via MLX)".
High Memory Usage
Symptoms:
- App uses excessive RAM
- System becomes sluggish
Solutions:
- Close unused voice profiles
- Clear generation history
- Restart the app periodically
Update Issues
"Update Check Failed"
Solutions:
- Confirm your internet connection — updates are fetched from GitHub releases.
- Ensure github.com is accessible and not blocked by a firewall or proxy.
- As a fallback, download the latest release from GitHub and install manually.
"Invalid Signature" Error
Solutions:
- Re-download the installer — the signature may have been corrupted in transit.
- Verify the .sig file matches the installer; if it doesn't, file an issue.
Remote Mode Issues
Can't Connect to Remote Server
Symptoms:
- "Connection refused" error
- Remote server not found
Solutions:
Check Server Status
Ensure the remote server is running:
curl http://<server-ip>:17493/health
Check Firewall
Ensure port 17493 is open on the remote server:
# Allow port on Ubuntu/Debian
sudo ufw allow 17493
Verify Network
- Ensure both machines are on the same network (for local servers)
- Use IP address instead of hostname
- Try pinging the server:
ping <server-ip>
Still Having Issues?
If you're still experiencing problems:
- Check GitHub Issues: github.com/jamiepine/voicebox/issues
- Open a New Issue: Provide:
- Operating system and version
- Voicebox version
- Steps to reproduce
- Error messages or logs
- Join Discord: discord.gg/voicebox (coming soon)
Diagnostic Information
When reporting issues, include this information:
# Voicebox version
# Check Help → About in the app
# Operating system
uname -a # macOS/Linux
systeminfo # Windows
# Python version (for dev issues)
python --version
# GPU info (if generation issues)
nvidia-smi # NVIDIA GPUs
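The checks above can be bundled into a single report to attach to an issue (a sketch for macOS/Linux; sections whose tools are missing are simply noted):

```shell
# Collect basic diagnostics into a single file.
OUT=/tmp/voicebox-diagnostics.txt
{
  echo "== OS =="
  uname -a 2>/dev/null || echo "uname unavailable (run systeminfo on Windows)"
  echo "== Python =="
  python3 --version 2>&1 || echo "python3 not found"
  echo "== GPU =="
  nvidia-smi 2>/dev/null || echo "nvidia-smi not found"
} > "$OUT"
echo "wrote $OUT"
```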