Local Testing Guide¶
This guide explains how to test GhostBrain locally using your computer's microphone, without setting up Twilio phone numbers or webhooks.
Overview¶
GhostBrain provides two methods for local testing:
- PyAudio Method (Recommended) - Direct microphone access
- Daily Method - WebRTC-based with shareable rooms
Both methods use the same AI pipeline as production but bypass Twilio's telephony layer.
Prerequisites¶
Required API Keys¶
You'll need API keys from these services (all offer free tiers):
- Deepgram - For speech-to-text
- Sign up at: https://console.deepgram.com/signup
- Get API key from: Dashboard โ API Keys
-
Free tier: $200 credit
-
Groq - For LLM inference
- Sign up at: https://console.groq.com/
- Get API key from: API Keys section
-
Free tier: Generous rate limits
-
OpenAI - For text-to-speech
- Sign up at: https://platform.openai.com/
- Get API key from: API Keys section
- Free tier: $5 credit for new accounts
Environment Setup¶
Create a .env file in the project root:
# Required API Keys
GHOST_BRAIN_DEEPGRAM_API_KEY=your_deepgram_key_here
GHOST_BRAIN_GROQ_API_KEY=your_groq_key_here
GHOST_BRAIN_OPENAI_API_KEY=your_openai_key_here
# Optional: For Daily method
DAILY_API_TOKEN=your_daily_token_here
# Optional: Twilio (not needed for local testing)
GHOST_BRAIN_TWILIO_ACCOUNT_SID=optional
GHOST_BRAIN_TWILIO_AUTH_TOKEN=optional
Method 1: PyAudio (Recommended)¶
Installation¶
- Install system dependencies:
macOS:
Ubuntu/Debian:
Windows:
- Install Python dependencies:
Running the Test¶
-
Ensure your microphone is connected and working
-
Run the test script:
-
You should see:
============================================================ ๐ค Ghost Brain Local Microphone Test ============================================================ Instructions: โข Speak clearly into your microphone โข Wait for the bot to finish speaking before responding โข Press Ctrl+C to stop and save the transcript Starting... Say hello to begin! -
Start talking! The bot will:
- Transcribe your speech in real-time
- Generate intelligent responses
-
Speak back to you through your speakers
-
End the session: Press
Ctrl+Cto stop. The transcript will be: - Saved to
local_mic_transcript.txt - Printed to your console
Troubleshooting PyAudio¶
"Microphone not found" errors:
# List available audio devices
python -c "import pyaudio; p = pyaudio.PyAudio(); print([p.get_device_info_by_index(i) for i in range(p.get_device_count())])"
Permission errors on macOS: - Go to System Settings โ Privacy & Security โ Microphone - Ensure Terminal/IDE has microphone access
Audio feedback/echo: - Use headphones instead of speakers - Reduce speaker volume - Increase VAD threshold in the code
Method 2: Daily WebRTC¶
Overview¶
Daily provides WebRTC-based audio transport that works in browsers. This method is useful for: - Testing with remote team members - Browser-based testing - When PyAudio has compatibility issues
Installation¶
No additional installation needed - Daily is included in the Pipecat dependencies.
Running the Test¶
- Optional: Set Daily API token (for room creation):
Without a token, you can still join existing rooms.
-
Run the test script:
-
The script will output a room URL:
-
Open the URL in a web browser and allow microphone access
-
Start talking through your browser
Joining an Existing Room¶
To join a specific Daily room:
DAILY_ROOM_URL=https://your-domain.daily.co/existing-room hatch run python -m ghost_brain.local_test
How It Works¶
Audio Flow¶
flowchart TD
M["๐ค Your Microphone"] --> C["PyAudio/Daily Capture<br/>(16kHz PCM)"]
C --> P{"Pipecat Pipeline"}
P -->|"speech โ text"| STT["Deepgram STT"]
STT -->|"text โ response"| LLM["Groq LLM"]
LLM -->|"response โ speech"| TTS["OpenAI TTS"]
TTS --> S["๐ Your Speakers"]
Key Differences from Production¶
| Component | Production (Twilio) | Local Testing |
|---|---|---|
| Audio Input | Phone call (8kHz) | Microphone (16kHz) |
| Transport | WebSocket | PyAudio/Daily |
| Latency | ~500-850ms | ~300-500ms |
| Audio Quality | Telephony grade | Full quality |
| Accessibility | Phone number | Local only |
Voice Pipeline Components¶
- Voice Activity Detection (VAD)
- Silero VAD model
- Detects speech vs. silence
- Prevents interruptions
-
Configurable sensitivity
-
Speech-to-Text (STT)
- Deepgram Nova-2 model
- Real-time streaming
- High accuracy
-
Automatic punctuation
-
Language Model (LLM)
- Llama 3.1 70B via Groq
- Conversational context
- Interview-style responses
-
Customizable personality
-
Text-to-Speech (TTS)
- OpenAI TTS-1 model
- "Alloy" voice
- Natural intonation
- Low latency
Customization¶
Changing the System Prompt¶
Edit local_mic_test.py or local_test.py:
llm = GroqLLMService(
api_key=self.settings.groq_api_key,
model="llama-3.3-70b-versatile",
system_instruction=(
"You are a helpful assistant specializing in technical interviews. "
"Ask probing questions about the candidate's experience. "
"Keep responses concise and professional."
),
)
Adjusting Audio Settings¶
VAD Sensitivity:
vad_analyzer = SileroVADAnalyzer(
sample_rate=16000,
params=VADParams(
stop_secs=0.5, # How long to wait after speech stops
min_volume=0.1, # Minimum volume threshold
),
)
Audio Quality:
# In LocalAudioTransport.__init__
self.sample_rate = 16000 # Can increase to 24000 or 48000
self.chunk_size = 1024 # Larger = more latency, smaller = more CPU
Using Different Models¶
Different LLM:
# Example: Use Mixtral instead
llm = GroqLLMService(
api_key=self.settings.groq_api_key,
model="mixtral-8x7b-32768", # Faster, smaller model
# ... rest of config
)
Different Voice:
# OpenAI voices: alloy, echo, fable, onyx, nova, shimmer
tts = OpenAITTSService(
api_key=self.settings.openai_api_key,
voice="nova", # Different voice
model="tts-1-hd", # Higher quality version
)
Performance Tips¶
Reducing Latency¶
- Use wired internet instead of WiFi
- Close unnecessary applications to free CPU
- Use a good quality microphone to reduce STT errors
- Adjust VAD settings to be less conservative:
Improving Accuracy¶
- Speak clearly and at normal pace
- Minimize background noise
- Use headphones to prevent echo
- Position microphone correctly (6-12 inches from mouth)
Common Issues¶
"No module named 'pyaudio'"¶
"API key not found"¶
Ensure your .env file has all required keys:
"Connection refused" or timeout errors¶
Check your firewall isn't blocking:
- Deepgram API: api.deepgram.com:443
- Groq API: api.groq.com:443
- OpenAI API: api.openai.com:443
High CPU usage¶
- Reduce sample rate to 8000 Hz
- Increase chunk size to 2048
- Check for other CPU-intensive processes
Audio drops or stuttering¶
- Check your internet connection stability
- Try reducing concurrent applications
- Ensure audio drivers are up to date
Testing Scenarios¶
Basic Conversation Test¶
- Start the bot
- Say: "Hello, can you hear me?"
- Wait for response
- Have a brief conversation
- Check transcript for accuracy
Interruption Handling¶
- Start speaking
- While bot is responding, try to interrupt
- VAD should handle this gracefully
Long Form Response¶
- Ask a complex question requiring detailed answer
- Verify bot can speak for extended periods
- Check audio doesn't cut off mid-sentence
Silence Handling¶
- Start the bot
- Stay silent for 10+ seconds
- Bot should wait patiently
- Resume speaking - bot should respond normally
Next Steps¶
After successful local testing:
- Deploy to Cloud Run - See Self-Hosting Setup Guide
- Set up Twilio - Configure phone numbers and webhooks
- Add monitoring - Set up logging and alerts
- Customize personality - Adjust system prompts for your use case
- Implement analytics - Track conversation metrics