Switching the Text-to-Speech (TTS) Provider
GhostBrain uses OpenAI's tts-1 model by default. If you want more realistic voices (for example, ElevenLabs) or lower-latency models (for example, Cartesia), you can swap the provider in three steps. The steps below use ElevenLabs.
Step 1: Update Dependencies
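Install the Pipecat extra for your provider. For ElevenLabs, that is the elevenlabs extra of the pipecat-ai package, for example pip install "pipecat-ai[elevenlabs]" (adapt the command if the project uses a different package manager).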
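Step 2: Add the API Key
- In .env: add your ElevenLabs API key and the ID of the voice you want to use.
- In src/ghost_brain/config.py: add matching fields to Settings. The factory in Step 3 reads them as settings.elevenlabs_api_key and settings.elevenlabs_voice_id.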
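The two settings might be wired up roughly as follows. This is a minimal sketch, not GhostBrain's actual config: the environment variable names ELEVENLABS_API_KEY and ELEVENLABS_VOICE_ID are assumptions, and the real Settings class may be built on a different base (for example pydantic-settings) than the plain dataclass used here. Only the two attribute names come from the factory in Step 3.

```python
import os
from dataclasses import dataclass, field


@dataclass
class Settings:
    """Hypothetical stand-in for the Settings class in config.py."""

    # Both values are read from the environment when Settings() is created.
    # The variable names are assumptions; match them to your .env file.
    elevenlabs_api_key: str = field(
        default_factory=lambda: os.environ.get("ELEVENLABS_API_KEY", "")
    )
    elevenlabs_voice_id: str = field(
        default_factory=lambda: os.environ.get("ELEVENLABS_VOICE_ID", "")
    )
```

With those two variables set in the environment, Settings() exposes them under the attribute names the TTS factory expects.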
Step 3: Modify the TTS Factory
Open src/ghost_brain/services/tts.py. Swap OpenAITTSService for ElevenLabsTTSService.
# from pipecat.services.openai import OpenAITTSService
from pipecat.services.elevenlabs import ElevenLabsTTSService

from ghost_brain.config import Settings


def create_tts(settings: Settings) -> ElevenLabsTTSService:
    """
    Create ElevenLabs TTS service.
    """
    return ElevenLabsTTSService(
        api_key=settings.elevenlabs_api_key,
        voice_id=settings.elevenlabs_voice_id,
        # ElevenLabs requires the exact expected output sample rate.
        # Make sure this matches your pipeline configuration!
        output_format="pcm_16000",
    )
The rest of the Pipecat pipeline needs no changes: it consumes the audio frames generated by ElevenLabs and streams them back to the user via Twilio or your local audio output.
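Because the output format has to match the pipeline's sample rate, it can help to derive the string from that rate instead of hard-coding it. The helper below is a hypothetical sketch, not part of GhostBrain or Pipecat: the name pcm_output_format and the list of accepted rates are assumptions, so check the rates against the ElevenLabs documentation for your plan.

```python
# PCM sample rates assumed to be accepted by ElevenLabs (not an exhaustive list).
ASSUMED_PCM_RATES = (16000, 22050, 24000, 44100)


def pcm_output_format(pipeline_sample_rate: int) -> str:
    """Return the "pcm_<rate>" format string for the pipeline's sample rate.

    Raises ValueError for a rate outside ASSUMED_PCM_RATES, so a mismatch
    fails at startup instead of producing audio at the wrong speed or pitch.
    """
    if pipeline_sample_rate not in ASSUMED_PCM_RATES:
        raise ValueError(
            f"Unsupported sample rate {pipeline_sample_rate}; "
            f"expected one of {ASSUMED_PCM_RATES}"
        )
    return f"pcm_{pipeline_sample_rate}"
```

In the factory, output_format=pcm_output_format(16000) would then pass "pcm_16000", and changing the pipeline rate in one place keeps the two values in sync.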