The effects pipeline applies DSP audio effects to generated speech using Spotify's Pedalboard library. Each generation can have multiple versions, each with its own effects chain.
Overview
Key concepts:
- Effects Chain — JSON-serializable list of effect configurations applied sequentially
- Generation Version — A processed variant of a generation with its own audio file and effects chain
- Effect Preset — Saved effects chain configuration (built-in or user-created)
- Clean Version — The original unprocessed generation audio
Flow:
- TTS Generation creates clean audio
- Effects Chain processes the audio
- Processed Version is saved as a new generation version
Each generation maintains a clean version (original) plus any number of processed versions with different effect chains applied.
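As a rough illustration of these concepts, a chain can be pictured as a list of plain dictionaries that round-trips through JSON. The `type` and `params` keys and the effect type names below are assumptions made for this sketch; the authoritative schema is whatever /effects/available reports.

```python
import json

# Hypothetical effect objects: the "type"/"params" keys and the type names
# are assumed for illustration and are not confirmed by the API.
effects_chain = [
    {"type": "highpass", "params": {"cutoff_frequency_hz": 80.0}},
    {"type": "reverb", "params": {"room_size": 0.5, "wet_level": 0.33}},
]

# Effects are applied sequentially, so list order is part of the data.
serialized = json.dumps(effects_chain)
restored = json.loads(serialized)
applied_order = [effect["type"] for effect in restored]
```

Because the chain is ordinary JSON, the same structure can be stored with a version, saved as a preset, or sent in a request body.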
Effect Types
The following effect types are available, each with configurable parameters:
Chorus / Flanger
Modulated delay effect. A short centre_delay_ms produces a flanger sound; a longer one produces chorus.
Parameters:
- rate_hz: LFO speed in Hz (range: 0.01 to 20, default: 1.0)
- depth: Modulation depth (range: 0.0 to 1.0, default: 0.5)
- feedback: Feedback amount (range: 0.0 to 0.95, default: 0.0)
- centre_delay_ms: Centre delay in milliseconds (range: 0.5 to 50, default: 7.0)
- mix: Wet/dry mix (range: 0.0 to 1.0, default: 0.5)
Reverb
Room reverb effect.
Parameters:
- room_size: Room size (range: 0.0 to 1.0, default: 0.5)
- damping: High frequency damping (range: 0.0 to 1.0, default: 0.5)
- wet_level: Wet level (range: 0.0 to 1.0, default: 0.33)
- dry_level: Dry level (range: 0.0 to 1.0, default: 0.4)
- width: Stereo width (range: 0.0 to 1.0, default: 1.0)
Delay
Echo / delay line.
Parameters:
- delay_seconds: Delay time in seconds (range: 0.01 to 2.0, default: 0.3)
- feedback: Feedback amount (range: 0.0 to 0.95, default: 0.3)
- mix: Wet/dry mix (range: 0.0 to 1.0, default: 0.3)
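To show how the three parameters interact, here is a minimal feedback delay line in pure Python. It is a textbook sketch, not Pedalboard's implementation: `delay_samples` stands for delay_seconds multiplied by the sample rate, and the linear dry/wet blend is an assumption about how mix behaves.

```python
def feedback_delay(samples, delay_samples, feedback, mix):
    """Apply a basic feedback delay to a list of float samples."""
    buffer = [0.0] * delay_samples  # circular buffer holding the delayed signal
    out = []
    for i, x in enumerate(samples):
        slot = i % delay_samples
        delayed = buffer[slot]                 # signal from delay_samples ago
        buffer[slot] = x + delayed * feedback  # feed part of the echo back in
        out.append((1.0 - mix) * x + mix * delayed)  # assumed linear dry/wet blend
    return out


# A single impulse makes the echoes easy to read off.
impulse = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
echoes = feedback_delay(impulse, delay_samples=2, feedback=0.5, mix=0.5)
# echoes == [0.5, 0.0, 0.5, 0.0, 0.25, 0.0, 0.125]
```

Each repeat arrives `delay_samples` later and is scaled by `feedback`, which is why the feedback range stops short of 1.0: at 1.0 the echoes would never decay.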
Compressor
Dynamic range compression for consistent loudness.
Parameters:
- threshold_db: Threshold in dB (range: -60 to 0, default: -20.0)
- ratio: Compression ratio (range: 1.0 to 20.0, default: 4.0)
- attack_ms: Attack time in ms (range: 0.1 to 100, default: 10.0)
- release_ms: Release time in ms (range: 10 to 1000, default: 100.0)
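The relationship between threshold_db and ratio can be sketched with the standard hard-knee static curve. This ignores attack and release entirely and is not taken from Pedalboard's source; it only illustrates what the two level parameters mean.

```python
def compressed_level_db(input_db, threshold_db=-20.0, ratio=4.0):
    """Steady-state output level of a hard-knee compressor, in dB."""
    if input_db <= threshold_db:
        return input_db  # below the threshold the signal passes unchanged
    # Above the threshold, every `ratio` dB of input yields 1 dB of output.
    return threshold_db + (input_db - threshold_db) / ratio


quiet = compressed_level_db(-30.0)  # -30.0, untouched
loud = compressed_level_db(-8.0)    # -17.0, i.e. 12 dB over becomes 3 dB over
```

With the defaults, a peak 12 dB above the threshold comes out only 3 dB above it, which is what evens out loudness across a generation.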
Gain
Volume adjustment in decibels.
Parameters:
- gain_db: Gain in dB (range: -40 to 40, default: 0.0)
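For reference, a gain in decibels corresponds to a linear amplitude multiplier of 10^(gain_db / 20). The helper name below is illustrative only.

```python
def db_to_linear(gain_db):
    """Convert a gain in dB to a linear amplitude multiplier."""
    return 10.0 ** (gain_db / 20.0)


unity = db_to_linear(0.0)      # 1.0: the default leaves the signal unchanged
boost = db_to_linear(6.0)      # roughly 2x amplitude
minimum = db_to_linear(-40.0)  # roughly 0.01x amplitude
```

So the -40 to 40 dB range spans amplitude factors from about 0.01 to 100.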
High-Pass Filter
Removes frequencies below the cutoff.
Parameters:
- cutoff_frequency_hz: Cutoff frequency in Hz (range: 20 to 8000, default: 80.0)
Low-Pass Filter
Removes frequencies above the cutoff.
Parameters:
- cutoff_frequency_hz: Cutoff frequency in Hz (range: 200 to 20000, default: 8000.0)
Pitch Shift
Shift pitch up or down by semitones.
Parameters:
- semitones: Semitones to shift (range: -12 to 12, default: 0.0)
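In equal temperament, a shift of n semitones multiplies every frequency by 2^(n / 12). The short sketch below shows what the parameter range means in those terms; the function name is illustrative.

```python
def semitone_ratio(semitones):
    """Frequency ratio for a pitch shift of the given number of semitones."""
    return 2.0 ** (semitones / 12.0)


octave_up = semitone_ratio(12)     # 2.0
octave_down = semitone_ratio(-12)  # 0.5
fifth_up = semitone_ratio(7)       # about 1.498
```

The -12 to 12 range therefore covers one octave in each direction.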
Generation Versions
Each generation starts with a clean version (no effects). Users can create processed versions by applying effect chains.
Version properties:
- id — Unique version identifier
- label — User-defined name (e.g., "robotic", "with reverb")
- audio_path — Path to the processed audio file
- effects_chain — JSON array of effect configurations
- source_version_id — Which version this was derived from
- is_default — Whether this is the default audio for the generation
File storage:
Default version behavior:
- One version per generation is marked as default
- The generation's audio_path always points to the default version's audio
- Deleting the default version automatically promotes another version
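The promotion rule can be sketched as follows. The `Version` class here is a trimmed-down stand-in, and promoting the first remaining version is an assumed policy: the documentation only states that another version is promoted, not which one.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Version:
    id: str
    label: str
    is_default: bool = False


def delete_version(versions: List[Version], version_id: str) -> List[Version]:
    """Remove a version and keep exactly one default among the rest."""
    remaining = [v for v in versions if v.id != version_id]
    if remaining and not any(v.is_default for v in remaining):
        # Assumed policy: promote the first remaining version.
        remaining[0].is_default = True
    return remaining


versions = [Version("v1", "clean"), Version("v2", "with reverb", is_default=True)]
after = delete_version(versions, "v2")
```

After the call, the clean version is the only one left and has become the default, so the generation's audio_path would need to be repointed to its file.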
Effect Presets
Presets are saved effects chains that can be reused across generations.
Built-in presets:
- Robotic: Metallic robotic voice using chorus (flanger-style)
- Radio: Thin AM-radio voice with band-pass filtering and light compression
- Echo Chamber: Spacious reverb with trailing echo
- Deep Voice: Lower pitch with added warmth using pitch shift and compression
User presets:
- Created via the effects UI
- Stored in the database (SQLite)
- Built-in presets cannot be modified or deleted; only user presets can
- Used to quickly apply favorite effect combinations
API Endpoints
Effects Management
| Endpoint | Method | Description |
|---|---|---|
| /effects/available | GET | List all effect types with parameter definitions |
| /effects/presets | GET | List all presets (built-in + user) |
| /effects/presets | POST | Create a new user preset |
| /effects/presets/:id | GET | Get a specific preset |
| /effects/presets/:id | PUT | Update a user preset |
| /effects/presets/:id | DELETE | Delete a user preset |
| /effects/preview/:generation_id | POST | Preview effects on a generation (returns audio stream) |
Generation Versions
| Endpoint | Method | Description |
|---|---|---|
| /generations/:id/versions | GET | List all versions for a generation |
| /generations/:id/versions/apply-effects | POST | Apply effects chain, create new version |
| /generations/:id/versions/:version_id/set-default | PUT | Set a version as default |
| /generations/:id/versions/:version_id | DELETE | Delete a version |
Request Body: Apply Effects
POST /generations/:id/versions/apply-effects accepts a JSON body with these fields:
- effects_chain: Array of effect objects
- label: Version label (e.g., "with reverb")
- set_as_default: Whether to set as default
- source_version_id: Source version ID (optional)
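A request with this body could be assembled as shown below, using only the standard library. The base URL, the generation ID, and the shape of each effect object (`type` plus `params`) are assumptions for illustration; the request is built but deliberately not sent.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: adjust to your deployment
generation_id = "gen-123"           # hypothetical generation ID

payload = {
    # Effect object keys are assumed; check /effects/available for the real schema.
    "effects_chain": [{"type": "reverb", "params": {"room_size": 0.7}}],
    "label": "with reverb",
    "set_as_default": True,
    # "source_version_id" is optional and omitted here.
}

request = urllib.request.Request(
    url=f"{BASE_URL}/generations/{generation_id}/versions/apply-effects",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
```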
Implementation
Backend Architecture
Files:
| File | Purpose |
|---|---|
| backend/utils/effects.py | Effect registry, validation, and audio processing |
| backend/services/versions.py | Generation version CRUD operations |
| backend/services/effects.py | Effect preset CRUD operations |
| backend/routes/effects.py | API endpoints for effects and versions |
Effect Registry:
The EFFECT_REGISTRY dict in backend/utils/effects.py defines all available effects with their parameters, defaults, and ranges.
Validation:
Effects chains are validated before application:
- Each effect type must exist in the registry
- Parameters must be numbers within min/max bounds
- Unknown parameters are rejected
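The three rules above can be sketched as a small validator. The registry shape (`min`/`max`/`default` per parameter), the effect type names, and the `type`/`params` keys are assumptions for this example; the ranges are copied from the parameter lists in this document.

```python
# Hypothetical miniature registry; the real one lives in backend/utils/effects.py.
REGISTRY = {
    "gain": {"params": {"gain_db": {"min": -40.0, "max": 40.0, "default": 0.0}}},
    "delay": {"params": {
        "delay_seconds": {"min": 0.01, "max": 2.0, "default": 0.3},
        "feedback": {"min": 0.0, "max": 0.95, "default": 0.3},
        "mix": {"min": 0.0, "max": 1.0, "default": 0.3},
    }},
}


def validate_chain(chain, registry=REGISTRY):
    """Return a list of error strings; an empty list means the chain is valid."""
    errors = []
    for index, effect in enumerate(chain):
        spec = registry.get(effect.get("type"))
        if spec is None:
            errors.append(f"effect {index}: unknown type {effect.get('type')!r}")
            continue
        for name, value in effect.get("params", {}).items():
            bounds = spec["params"].get(name)
            if bounds is None:
                errors.append(f"effect {index}: unknown parameter {name!r}")
            elif isinstance(value, bool) or not isinstance(value, (int, float)):
                errors.append(f"effect {index}: {name} must be a number")
            elif not bounds["min"] <= value <= bounds["max"]:
                errors.append(f"effect {index}: {name}={value} is out of range")
    return errors


ok = validate_chain([{"type": "gain", "params": {"gain_db": 6.0}}])
bad = validate_chain([{"type": "gain", "params": {"gain_db": 99}}])
```

Collecting every error instead of stopping at the first is a design choice of this sketch; it lets a UI highlight all invalid fields at once.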
Audio Processing:
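Processing uses Spotify's Pedalboard library. The chain is turned into a Pedalboard and run in a worker thread:
import asyncio
from pedalboard import Pedalboard
# Build a Pedalboard from the validated effects chain
board = build_pedalboard(effects_chain)
# Processing is blocking, so run it in a worker thread (call from an async function)
processed = await asyncio.to_thread(board, audio, sample_rate)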
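The thread offloading pattern can be tried without Pedalboard installed. In the sketch below, `process_audio` is a stand-in for the blocking `board(audio, sample_rate)` call, and `asyncio.to_thread` requires Python 3.9 or later.

```python
import asyncio


def process_audio(samples, gain):
    """Stand-in for board(audio, sample_rate): any blocking, CPU-bound call."""
    return [s * gain for s in samples]


async def apply_effects(samples):
    # to_thread runs the blocking call in a worker thread and awaits its result,
    # so other coroutines can keep running in the meantime.
    return await asyncio.to_thread(process_audio, samples, 0.5)


processed = asyncio.run(apply_effects([1.0, -1.0, 0.5]))
# processed == [0.5, -0.5, 0.25]
```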
Frontend Integration
Key components:
| Component | Location |
|---|---|
| Effects chain editor | app/src/components/Effects/ |
| Version selector | Generation detail view |
| Preset manager | Effects panel |
| Live preview | Preview button (streams processed audio) |
State management:
- Effects chains are stored as JSON arrays
- Live preview fetches processed audio without saving
- Applied effects create new versions via POST endpoint
Adding New Effects
To add a new effect type:
- Add to the registry (backend/utils/effects.py): import the effect class from Pedalboard, then add an entry to EFFECT_REGISTRY with cls, label, description, and params
- Update frontend types if needed
The new effect automatically appears in /effects/available and the chain editor UI.
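A registry entry with those four keys might look roughly like this. Everything below is a sketch under assumptions: `LimiterStandIn` replaces a real Pedalboard class so the example runs without the library, the parameter spec shape and values are invented for illustration, and `list_available` is a hypothetical helper showing how a listing endpoint could derive its response from the registry.

```python
class LimiterStandIn:
    """Placeholder for a Pedalboard effect class; used only so the sketch runs."""

    def __init__(self, threshold_db=-10.0):
        self.threshold_db = threshold_db


EFFECT_REGISTRY = {}  # stands in for the dict in backend/utils/effects.py

EFFECT_REGISTRY["limiter"] = {
    "cls": LimiterStandIn,
    "label": "Limiter",
    "description": "Caps peaks above the threshold.",
    "params": {
        # Illustrative range and default, not taken from the project.
        "threshold_db": {"min": -60.0, "max": 0.0, "default": -10.0},
    },
}


def list_available(registry):
    """Describe each effect without exposing the class object."""
    return [
        {
            "type": key,
            "label": entry["label"],
            "description": entry["description"],
            "params": entry["params"],
        }
        for key, entry in registry.items()
    ]


available = list_available(EFFECT_REGISTRY)
```

Because the listing is derived from the registry, adding one entry is enough for the effect to show up wherever the registry is enumerated.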
Best Practices
Effect ordering matters. Process effects in this order for best results:
- Pitch shift (if needed)
- High/low-pass filters
- Chorus/flanger (time-based)
- Reverb/delay (spatial)
- Compressor
- Gain (final level adjustment)
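The recommended order could be applied automatically with a stable sort, as sketched below. The effect type names and the `type` key are assumptions; effects that share a stage (the two filters, or reverb and delay) keep the relative order the user gave them.

```python
# Stage numbers follow the recommended order; type names are assumed.
STAGE = {
    "pitch_shift": 0,
    "highpass": 1,
    "lowpass": 1,
    "chorus": 2,
    "reverb": 3,
    "delay": 3,
    "compressor": 4,
    "gain": 5,
}


def sort_chain(chain):
    """Reorder a chain by stage; unknown types go last, ties keep their order."""
    return sorted(chain, key=lambda effect: STAGE.get(effect["type"], len(STAGE)))


chain = [
    {"type": "gain"},
    {"type": "reverb"},
    {"type": "pitch_shift"},
    {"type": "delay"},
    {"type": "lowpass"},
]
ordered = [effect["type"] for effect in sort_chain(chain)]
# ordered == ["pitch_shift", "lowpass", "reverb", "delay", "gain"]
```

This is a suggestion helper rather than a rule to enforce: a user may deliberately place an effect out of order for a particular sound.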
CPU usage:
- Effects are applied in real-time during generation
- Pitch shift and reverb are the most CPU-intensive
- Consider previewing complex chains before applying
Storage:
- Each version creates a new audio file
- Clean version always exists (can be reverted to)
- Processed versions can be deleted to save space