Installation

Download

Voicebox is available for macOS and Windows, with Linux builds coming soon.

macOS

Download for Apple Silicon or Intel Macs

Windows

Download the MSI installer or the Setup executable

macOS

Apple Silicon

Download: voicebox_aarch64.app.tar.gz

# Extract the archive
tar -xzf voicebox_aarch64.app.tar.gz

# Move to Applications

mv Voicebox.app /Applications/

Intel

Download: voicebox_x64.app.tar.gz

# Extract the archive
tar -xzf voicebox_x64.app.tar.gz

# Move to Applications

mv Voicebox.app /Applications/
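If you are unsure which macOS archive matches your machine, the choice can be sketched in Python. This is only an illustration: the archive names come from the download list above, while the function names and the mapping from `platform.machine()` values to archives are assumptions of this sketch, not part of Voicebox.

```python
import platform

# Archive names are taken from the download list above. The mapping from
# platform.machine() values to archives is an assumption of this sketch.
MACOS_ARCHIVES = {
    "arm64": "voicebox_aarch64.app.tar.gz",  # Apple Silicon
    "x86_64": "voicebox_x64.app.tar.gz",     # Intel
}


def macos_archive_for(machine: str) -> str:
    """Return the archive name for a macOS machine type string."""
    try:
        return MACOS_ARCHIVES[machine]
    except KeyError:
        raise ValueError(f"unrecognized macOS machine type: {machine!r}")


def archive_for_this_mac() -> str:
    """Look up the archive for the machine running this script."""
    return macos_archive_for(platform.machine())
```

On a Mac, `archive_for_this_mac()` returns the file name to download. Note that a Python interpreter running under Rosetta may report `x86_64` even on Apple Silicon hardware, so treat the result as a hint rather than a guarantee.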

Windows

MSI installer

Download: voicebox_x64_en-US.msi

Double-click the MSI file and follow the installation wizard.

Setup executable

Download: voicebox_x64-setup.exe

Run the executable and follow the installation wizard.

Linux

Linux builds are coming soon. They are currently blocked by disk space limits on the GitHub runners used to build them.

First Launch

When you launch Voicebox for the first time:

  1. Model Download: The first TTS engine you generate with downloads its model automatically. Sizes range from 350 MB (Kokoro) to ~8 GB (TADA 3B). Most users start with Qwen 1.7B (3.5 GB).

  2. Data Directory: Voice profiles and generated audio are stored in:

    • macOS: ~/Library/Application Support/sh.voicebox.app/
    • Windows: %APPDATA%\sh.voicebox.app\
    • Linux: ~/.config/sh.voicebox.app/

  3. Backend Server: The bundled Python server starts automatically.

First generation will be slower due to model downloads. Subsequent runs use cached models.
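To find the data directory from a script (for example, to back up voice profiles), the per-platform locations listed under First Launch can be resolved as in the sketch below. The function names are hypothetical, and the code simply mirrors the documented paths; it does not query Voicebox itself.

```python
import os
import sys
from typing import Optional

APP_ID = "sh.voicebox.app"  # directory name from the First Launch list


def voicebox_data_dir(platform_name: str, home: str,
                      appdata: Optional[str] = None) -> str:
    """Build the documented data directory for a sys.platform-style name."""
    if platform_name == "darwin":
        return f"{home}/Library/Application Support/{APP_ID}"
    if platform_name.startswith("win"):
        if not appdata:
            raise ValueError("APPDATA is not set")
        return f"{appdata}\\{APP_ID}"
    if platform_name.startswith("linux"):
        return f"{home}/.config/{APP_ID}"
    raise ValueError(f"unsupported platform: {platform_name!r}")


def current_data_dir() -> str:
    """Resolve the data directory for the machine running this script."""
    return voicebox_data_dir(
        sys.platform,
        os.path.expanduser("~"),
        os.environ.get("APPDATA"),
    )
```

Keeping the path logic in a function that takes the platform and home directory as arguments makes it easy to check all three cases from a single machine.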

System Requirements

Minimum

  • OS: macOS 11+, Windows 10+, or Linux
  • RAM: 8 GB
  • Storage: 5 GB free space (for models and data)
  • CPU: Modern multi-core processor

Recommended

  • RAM: 16 GB+
  • GPU: CUDA-capable NVIDIA GPU (for faster generation)
  • Storage: 10 GB+ free space

CPU inference is supported but significantly slower than GPU. A CUDA-capable GPU is highly recommended for real-time workflows.
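The storage figures above can be checked before installing with a short script. The sketch below uses only the Python standard library. The thresholds and model sizes are the ones quoted on this page; the function names are hypothetical, and treating GB as decimal gigabytes is an assumption, since the page does not say whether GB or GiB is meant.

```python
import shutil

GB = 1000 ** 3  # assumption: decimal gigabytes

MINIMUM_FREE_GB = 5       # from the Minimum list
RECOMMENDED_FREE_GB = 10  # from the Recommended list

# Model sizes quoted under First Launch (TADA 3B is approximate).
MODEL_SIZES_GB = {"Kokoro": 0.35, "Qwen 1.7B": 3.5, "TADA 3B": 8.0}


def storage_tier(free_bytes: int) -> str:
    """Classify free space against the documented storage figures."""
    if free_bytes >= RECOMMENDED_FREE_GB * GB:
        return "recommended"
    if free_bytes >= MINIMUM_FREE_GB * GB:
        return "minimum"
    return "insufficient"


def models_that_fit(free_bytes: int) -> list:
    """List the models whose download size fits in the free space."""
    return [name for name, size in MODEL_SIZES_GB.items()
            if size * GB <= free_bytes]


def check_path(path: str = ".") -> str:
    """Classify the free space on the disk that holds `path`."""
    return storage_tier(shutil.disk_usage(path).free)
```

For example, with exactly 5 GB free, `models_that_fit` returns Kokoro and Qwen 1.7B but not TADA 3B, which is worth knowing before picking a larger engine on a machine that only meets the minimum.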

Verification

After installation, verify everything works:

  1. Launch Voicebox
  2. Check the server status indicator in the bottom-left corner (should be green)
  3. Navigate to Profiles and create a test profile
  4. Generate a short audio clip to verify the TTS engine works

If you see a green status indicator and can generate audio, you're all set!

Next Steps