Voice
Voice calling with GNETiX
GNETiX supports outbound voice calls, allowing users to transition from a chat conversation to a phone call while preserving full conversation context. The voice system is powered by LiveKit for real-time audio, ElevenLabs for speech-to-text and text-to-speech, and SIP trunking via Twilio for PSTN connectivity.
How It Works
- A user types "call me" (or similar) in any chat platform
- The Director recognizes the intent and initiates a voice call
- GNETiX creates a LiveKit room with an AI agent participant
- The call is placed to the user's registered phone number via SIP/Twilio
- The user picks up and speaks naturally with the AI agent
- The full chat conversation context carries over into the voice session
Conversation context flows seamlessly from chat to voice. If a user was troubleshooting a network issue in Webex and says "call me," the voice agent already knows the full context of the conversation.
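The chat-to-voice hand-off above can be illustrated with a short Python sketch. It rests on assumptions and is not GNETiX's actual implementation: `ChatMessage`, `wants_call`, and `build_voice_context` are hypothetical names, and the fixed phrase pattern only stands in for whatever intent recognition the Director performs.

```python
import re
from dataclasses import dataclass

# Hypothetical phrase list; a stand-in for the Director's intent recognition.
CALL_PATTERN = re.compile(r"\b(call me|give me a call|phone me)\b", re.IGNORECASE)


@dataclass
class ChatMessage:
    role: str  # "user" or "assistant"
    text: str


def wants_call(message: str) -> bool:
    """Return True if the message looks like a request for a phone call."""
    return CALL_PATTERN.search(message) is not None


def build_voice_context(history: list[ChatMessage]) -> list[dict[str, str]]:
    """Copy the chat history into the message list that seeds the voice session."""
    return [{"role": m.role, "content": m.text} for m in history]


history = [
    ChatMessage("user", "My branch router keeps dropping its VPN tunnel."),
    ChatMessage("assistant", "Let's check the tunnel logs first."),
    ChatMessage("user", "This is getting long, can you call me?"),
]

# When the latest user message asks for a call, carry the whole history over.
if wants_call(history[-1].text):
    context = build_voice_context(history)
    print(f"Starting voice session with {len(context)} prior messages")
```

The point of the sketch is that the voice session is seeded with the same message history the chat used, so the agent does not need the user to repeat the problem.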
Architecture
User's Phone <--SIP/PSTN--> Twilio SIP Trunk <--SIP--> LiveKit Server
                                                             |
                                                       LiveKit Agent
                                                        |          |
                                                   ElevenLabs   Director
                                                  (STT + TTS)    (LLM)

| Component | Role |
|---|---|
| LiveKit Cloud | Real-time audio infrastructure, room management, SIP bridge |
| ElevenLabs Scribe v2 | Speech-to-text (STT) -- transcribes user speech |
| ElevenLabs TTS | Text-to-speech -- generates natural voice responses |
| Twilio SIP Trunk | PSTN connectivity -- routes calls to/from phone numbers |
| Director | LLM orchestration -- same Director that handles chat messages |
Requirements
To enable voice calling, you need:
- LiveKit Cloud account with a project and API key/secret
- ElevenLabs API key with access to the Conversational AI or standard TTS/STT APIs
- Twilio account with a SIP trunk and phone number configured
- User phone numbers registered in GNETiX (configured per user in the portal)
Configuration
Voice settings are configured through environment variables on the backend:
# LiveKit
LIVEKIT_URL=wss://your-project.livekit.cloud
LIVEKIT_API_KEY=APIxxxxxxx
LIVEKIT_API_SECRET=xxxxxxxxxxxxxxxx
# ElevenLabs
ELEVENLABS_API_KEY=sk_xxxxxxxxxxxxxxxx
# Voice API authentication
VOICE_API_KEY=<random-secret-for-voice-endpoints>

SIP trunk configuration (Twilio dispatch rules, inbound/outbound trunk setup) is managed in the LiveKit dashboard.
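A backend could check the environment variables listed above at startup so that a missing value fails fast instead of surfacing mid-call. The following is a minimal sketch, assuming a hypothetical `VoiceSettings` container and `load_voice_settings` helper; neither name comes from GNETiX, and the values in `example_env` are placeholders.

```python
import os
from dataclasses import dataclass

# Variable names taken from the configuration listing above.
REQUIRED_VARS = (
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
    "ELEVENLABS_API_KEY",
    "VOICE_API_KEY",
)


@dataclass(frozen=True)
class VoiceSettings:
    """Hypothetical container for the voice-related settings."""
    livekit_url: str
    livekit_api_key: str
    livekit_api_secret: str
    elevenlabs_api_key: str
    voice_api_key: str


def load_voice_settings(env=None) -> VoiceSettings:
    """Read the voice variables, raising if any are missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing voice settings: " + ", ".join(missing))
    # Field names are the lower-cased variable names.
    return VoiceSettings(**{name.lower(): env[name] for name in REQUIRED_VARS})


# Placeholder values for illustration; a real deployment reads os.environ.
example_env = {
    "LIVEKIT_URL": "wss://your-project.livekit.cloud",
    "LIVEKIT_API_KEY": "placeholder-key",
    "LIVEKIT_API_SECRET": "placeholder-secret",
    "ELEVENLABS_API_KEY": "placeholder-elevenlabs-key",
    "VOICE_API_KEY": "placeholder-voice-key",
}
settings = load_voice_settings(example_env)
print(settings.livekit_url)
```

Calling `load_voice_settings()` with no argument reads from the process environment instead of the example dictionary.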
User Phone Numbers
Each user who wants to receive voice calls must have a phone number registered in GNETiX. Admins can set this in the user profile, or users can configure it themselves (if permitted).
Phone numbers should be in E.164 format (e.g., +15551234567).
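An E.164 number is a leading "+" followed by at most 15 digits, the first of which is non-zero. A registration form could check this before saving a number; the sketch below shows one way to do it, with `is_e164` and `strip_formatting` as hypothetical helper names rather than part of GNETiX.

```python
import re

# E.164: "+", a non-zero first digit, and no more than 15 digits in total.
E164_PATTERN = re.compile(r"\+[1-9]\d{1,14}")


def strip_formatting(raw: str) -> str:
    """Remove spaces, dashes, dots, and parentheses that users often type."""
    return re.sub(r"[\s\-.()]", "", raw)


def is_e164(number: str) -> bool:
    """Return True if the whole string is a well-formed E.164 number."""
    return E164_PATTERN.fullmatch(number) is not None


for candidate in ("+15551234567", "15551234567", "+1 (555) 123-4567"):
    cleaned = strip_formatting(candidate)
    print(candidate, "->", cleaned, is_e164(cleaned))
```

This checks format only. It does not confirm that the number exists or that the SIP trunk can reach it.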
Limitations
- Voice calls are outbound only -- the system calls the user, not the other way around
- Requires PSTN connectivity through a SIP trunk provider
- Voice quality depends on the user's phone connection and the SIP trunk provider
- Tool calls during voice sessions use the same MCP infrastructure as chat