## Quick Setup (Recommended)

Get started in two commands:

```bash
# Clone and enter the repository
git clone https://github.com/jamiepine/voicebox.git
cd voicebox

# Set up everything (Python venv, JS deps, dev sidecar)
just setup

# Start development (backend + desktop app)
just dev
```
The `just dev` command automatically starts the Python backend (if not already running) and launches the Tauri desktop app.
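The "if not already running" check amounts to probing the backend's port before spawning a new process. Here is a minimal sketch of that idea in Python — illustrative only, not the actual Justfile logic — assuming the backend listens on port 17493 as documented below:

```python
import socket

def backend_running(host: str = "127.0.0.1", port: int = 17493) -> bool:
    """Return True if something is already listening on the backend port."""
    try:
        with socket.create_connection((host, port), timeout=0.5):
            return True
    except OSError:
        return False

# A launcher can skip starting uvicorn when this returns True
# and reuse the existing backend instead.
```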
## Prerequisites

Ensure you have these installed:

```bash
# Bun (JavaScript runtime)
curl -fsSL https://bun.sh/install | bash

# Python 3.11+
python --version

# Rust toolchain
rustc --version

# Just (command runner)
brew install just   # macOS
cargo install just  # Linux/Windows
```
## Just Commands

Run `just --list` to see all available commands. Highlights:

### Setup

| Command | Description |
|---|---|
| `just setup` | Full setup (Python venv + JS deps + dev sidecar). Detects Apple Silicon for MLX and NVIDIA/Intel Arc on Windows for accelerated PyTorch. |
| `just setup-python` | Python venv + dependencies only |
| `just setup-js` | `bun install` only |
### Development

| Command | Description |
|---|---|
| `just dev` | Start backend + Tauri desktop app (reuses a running backend if one exists) |
| `just dev-web` | Start backend + web app (no Tauri/Rust build) |
| `just dev-backend` | Backend only |
| `just dev-frontend` | Tauri app only (backend must already be running) |
| `just kill` | Stop all dev processes |
### Build

| Command | Description |
|---|---|
| `just build` | CPU server binary + Tauri installer |
| `just build-local` | Windows: CPU + CUDA server binaries + Tauri installer |
| `just build-server` | CPU server binary only |
| `just build-server-cuda` | Windows: CUDA server binary only, placed in `%APPDATA%/sh.voicebox.app/backends/cuda` for local testing |
| `just build-tauri` | Tauri app only |
| `just build-web` | Web app only |
### Quality

| Command | Description |
|---|---|
| `just check` | Lint + format + typecheck (Biome + ruff) |
| `just fix` | Auto-fix lint + format issues |
| `just lint` / `just format` | Lint or format only |
| `just test` | Run Python tests (pytest) |
| `just test-models` | End-to-end generation against every TTS engine using the frozen binary |
### Database

| Command | Description |
|---|---|
| `just db-init` | Initialize SQLite database |
| `just db-reset` | Delete and reinitialize the database |
### Utilities

| Command | Description |
|---|---|
| `just generate-api` | Generate TypeScript API client from the backend's OpenAPI schema |
| `just docs` | Open http://localhost:17493/docs in your browser |
| `just logs` | Tail backend logs |
| `just clean` | Remove build artifacts |
| `just clean-python` | Remove the Python venv + `__pycache__` |
| `just clean-all` | Nuclear clean (includes all `node_modules`) |
## Project Structure

### Request Flow

HTTP request → `routes/` (validate input) → `services/` (business logic) → `backends/` (TTS/STT inference) → `utils/` (audio processing)
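The layering above can be sketched with a few stand-in functions — all names here are illustrative, not the actual voicebox code — to show how a request flows through each layer:

```python
# Hypothetical sketch of the route → service → backend → utils layering.

def validate_input(payload: dict) -> dict:
    """routes/ layer: reject malformed requests before any work happens."""
    if not payload.get("text"):
        raise ValueError("text is required")
    return payload

def run_inference(text: str) -> bytes:
    """backends/ layer: stand-in for the actual TTS engine call."""
    return f"<audio for {text!r}>".encode()

def postprocess_audio(raw: bytes) -> bytes:
    """utils/ layer: e.g. resampling or loudness normalization."""
    return raw

def generate(payload: dict) -> bytes:
    """services/ layer: orchestrates the other layers."""
    payload = validate_input(payload)
    raw = run_inference(payload["text"])
    return postprocess_audio(raw)
```

The point of the split is that each layer can be tested in isolation: routes with fake services, services with fake backends.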
### Key Modules

- `app.py` — FastAPI app factory, CORS, lifecycle events
- `main.py` — Entry point (imports the app, runs uvicorn)
- `server.py` — Tauri sidecar launcher, parent-pid watchdog
- `services/generation.py` — Single function handling all generation modes
- `backends/` — TTS/STT engine implementations (MLX, PyTorch, etc.)
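A parent-pid watchdog of the kind `server.py` provides can be sketched like this — a simplified illustration, not the actual implementation: the sidecar periodically checks whether its parent (the Tauri app) is still alive, and exits when the process gets reparented after the parent dies.

```python
import os
import time

def parent_alive(original_ppid: int) -> bool:
    """True while our parent pid is unchanged; on POSIX systems an
    orphaned process is reparented (typically to pid 1)."""
    return os.getppid() == original_ppid

def watch_parent(original_ppid: int, interval: float = 1.0) -> None:
    """Poll until the parent goes away, then hard-exit the sidecar."""
    while parent_alive(original_ppid):
        time.sleep(interval)
    os._exit(0)  # parent (the desktop app) is gone; stop serving

# Typically started in a background thread at server startup:
# threading.Thread(target=watch_parent, args=(os.getppid(),), daemon=True).start()
```

This keeps orphaned backend processes from lingering after the desktop app is closed, which is also what `just kill` cleans up manually.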
## Model Downloads

Models are automatically downloaded from HuggingFace Hub on first use, with live progress streamed to the UI:

- **Whisper** (transcription) — auto-downloads on first transcription
- **TTS engines** — auto-download on first generation. Sizes range from 82M parameters (Kokoro, ~350 MB) to 3B (TADA, ~8 GB)
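The "live progress" part reduces to reporting bytes received as a download streams. A generic, hedged sketch of that mechanism (not the actual downloader, which goes through HuggingFace Hub):

```python
import io
from typing import BinaryIO, Callable

def stream_with_progress(
    src: BinaryIO,
    dst: BinaryIO,
    total_bytes: int,
    on_progress: Callable[[float], None],
    chunk_size: int = 8192,
) -> None:
    """Copy src to dst in chunks, reporting fractional progress (0.0-1.0)
    after each chunk so a UI can render a live progress bar."""
    copied = 0
    while chunk := src.read(chunk_size):
        dst.write(chunk)
        copied += len(chunk)
        on_progress(copied / total_bytes)

# The UI would subscribe to these progress events, e.g. over a
# WebSocket or server-sent events from the backend.
```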
See Model Management for the full list.
## Generate OpenAPI Client

After starting the backend server, generate the TypeScript API client:

```bash
just generate-api
```

This downloads the OpenAPI schema and generates the TypeScript client in `app/src/lib/api/`.
## Manual Setup (Advanced)

If you prefer not to use Just, follow these manual steps:

### 1. Install JavaScript Dependencies

```bash
bun install
```

This installs dependencies for:

- `app/` — Shared React frontend
- `tauri/` — Tauri desktop wrapper
- `web/` — Web deployment wrapper
### 2. Set Up Python Backend

```bash
cd backend

# Create the virtual environment
python -m venv venv

# Activate it
source venv/bin/activate   # macOS/Linux
venv\Scripts\activate      # Windows

# Install Python dependencies
pip install -r requirements.txt

# Apple Silicon only: install MLX dependencies
pip install -r requirements-mlx.txt

# Chatterbox pins numpy<1.26 / torch==2.6, which break on Python 3.12+,
# so install it without its pinned dependencies
pip install --no-deps chatterbox-tts

# HumeAI TADA pins torch>=2.7,<2.8, which conflicts with our torch>=2.1
pip install --no-deps hume-tada

# Install Qwen3-TTS from source
pip install git+https://github.com/QwenLM/Qwen3-TTS.git

# PyInstaller and linting tools
pip install pyinstaller ruff pytest pytest-asyncio
```
### 3. Start Development

Start the backend:

```bash
cd backend
source venv/bin/activate
uvicorn main:app --reload --port 17493
```

In a new terminal, start the desktop app:

```bash
cd tauri
bun run tauri dev
```
## Next Steps

- Understand the system architecture
- Read the contribution guidelines
- Learn how to build production releases
- Add a new TTS engine end-to-end
## Troubleshooting

### Backend won't start

- Check your Python version (must be 3.11+)
- Ensure the virtual environment is activated: `source backend/venv/bin/activate`
- Verify all dependencies are installed: `pip install -r requirements.txt`
- Check that port 17493 is available

### Tauri build fails

- Ensure Rust is installed: `rustc --version`
- Clean the build: `cd tauri/src-tauri && cargo clean`
- Try rebuilding: `just dev`

### OpenAPI client generation fails

- Ensure the backend is running: `curl http://localhost:17493/openapi.json`
- Check network connectivity
- Verify the backend is accessible at `localhost:17493`

See the full Troubleshooting Guide for more issues and solutions.