Voice Channel
The voice channel enables inbound and outbound phone-based interactions powered by Telnyx. Users call a dedicated phone number and speak with your bot using natural language, DTMF input, or IVR menu navigation.
Prerequisites
- A Telnyx account with voice API credentials
- A provisioned phone number with voice capability
- Telnyx integration configured in Settings > Integrations
Configure Telnyx Integration
Go to Settings > Integrations > Add Integration and select Telnyx:
| Field | Description |
|---|---|
| API Key | Your Telnyx API v2 key |
| Connection ID | The Telnyx SIP connection for voice traffic |
| Public Key | Telnyx public key for webhook verification |
Provision a Phone Number
- Navigate to your bot's Channels tab
- Select Voice and click Configure
- Choose Provision New Number or Use Existing Number
- Select a number with voice capability (filter by area code or country)
- Click Assign to Bot
TIP
You can assign the same phone number to both the SMS and Voice channels. OmniBots routes incoming traffic to the correct handler based on whether the inbound event is a call or a text message.
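As a rough illustration of the routing described in the tip, the sketch below classifies a Telnyx-style webhook body by its event type. It assumes the Telnyx v2 convention of nesting the type under `data.event_type` (for example `call.initiated` or `message.received`). The function name `classify_event` is hypothetical, and this is not OmniBots' actual routing code.

```python
from typing import Any, Dict


def classify_event(event: Dict[str, Any]) -> str:
    """Return 'voice', 'sms', or 'unknown' for a Telnyx-style webhook body.

    Hypothetical helper: assumes the event type lives at data.event_type.
    """
    data = event.get("data") or {}
    event_type = str(data.get("event_type") or "")
    if event_type.startswith("call."):
        return "voice"
    if event_type.startswith("message."):
        return "sms"
    return "unknown"


# Example: the same number receives a call and a text message.
inbound_call = {"data": {"event_type": "call.initiated"}}
inbound_text = {"data": {"event_type": "message.received"}}
print(classify_event(inbound_call), classify_event(inbound_text))
```

Matching on the event-type prefix rather than an exact string keeps the sketch tolerant of later call events such as hangups, which share the `call.` prefix.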
TTS and STT Configuration
OmniBots uses text-to-speech (TTS) and speech-to-text (STT) engines to convert between audio and text. Configure these in the voice channel settings:
| Setting | Options | Default |
|---|---|---|
| TTS Engine | Google Cloud TTS, Amazon Polly, Azure Cognitive Services | Google Cloud TTS |
| TTS Voice | Select from available voices per engine | en-US-Standard-C |
| TTS Speed | 0.5x to 2.0x | 1.0x |
| STT Engine | Google Cloud STT, Deepgram, Azure Speech | Google Cloud STT |
| STT Language | Language code (e.g., en-US, es-MX) | en-US |
| Silence Timeout | Seconds of silence before processing | 3 seconds |
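The table can be read as a small settings object with defaults and allowed values. The sketch below models it that way for illustration only. The class name `VoiceSettings` and its field names are assumptions rather than an OmniBots API, and the positive-timeout check is an assumption not stated in the table.

```python
from dataclasses import dataclass

# Engine names copied from the settings table above.
TTS_ENGINES = {"Google Cloud TTS", "Amazon Polly", "Azure Cognitive Services"}
STT_ENGINES = {"Google Cloud STT", "Deepgram", "Azure Speech"}


@dataclass(frozen=True)
class VoiceSettings:
    """Hypothetical container mirroring the documented defaults."""

    tts_engine: str = "Google Cloud TTS"
    tts_voice: str = "en-US-Standard-C"
    tts_speed: float = 1.0
    stt_engine: str = "Google Cloud STT"
    stt_language: str = "en-US"
    silence_timeout_s: float = 3.0

    def __post_init__(self) -> None:
        if self.tts_engine not in TTS_ENGINES:
            raise ValueError(f"unsupported TTS engine: {self.tts_engine}")
        if self.stt_engine not in STT_ENGINES:
            raise ValueError(f"unsupported STT engine: {self.stt_engine}")
        if not 0.5 <= self.tts_speed <= 2.0:
            raise ValueError("TTS speed must be between 0.5 and 2.0")
        if self.silence_timeout_s <= 0:
            # Assumption: a non-positive timeout is treated as invalid.
            raise ValueError("silence timeout must be positive")


# Defaults match the table; overrides are validated on construction.
spanish = VoiceSettings(stt_engine="Deepgram", stt_language="es-MX")
print(spanish)
```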
Toggle the voice channel to Active and OmniBots automatically configures the Telnyx webhook to route inbound calls to your bot.
Image: Voice channel configuration page showing the TTS engine dropdown, TTS voice selector, STT engine dropdown, STT language setting, and silence timeout slider.
Voice-Specific Flow Nodes
IVR Menu Node
The IVR Menu node presents callers with spoken options and captures DTMF (keypad) input:
- Define menu options mapped to dial-pad keys (1-9, 0, *, #)
- Each key maps to a separate output handle in the flow
- A No Match handle catches unrecognized or timed-out input
- Configure retries (default: 3 attempts) before routing to the No Match path
Image: IVR menu flow diagram showing a greeting prompt branching into DTMF key options (1-Billing, 2-Support, 3-Sales) with a No Match retry loop.
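The retry behaviour described above can be sketched as a small loop. This is a simplified model, not the node's implementation. The names `run_ivr_menu`, `read_key`, and `NO_MATCH` are hypothetical, and a timeout is modelled as `read_key` returning `None`.

```python
from typing import Callable, Dict, Optional

NO_MATCH = "no_match"          # hypothetical name for the No Match handle
VALID_KEYS = set("0123456789*#")


def run_ivr_menu(
    options: Dict[str, str],
    read_key: Callable[[], Optional[str]],
    max_attempts: int = 3,
) -> str:
    """Return the output handle for the first valid key press.

    Falls through to NO_MATCH after max_attempts unrecognized or
    timed-out inputs (read_key returning None models a timeout).
    """
    invalid = set(options) - VALID_KEYS
    if invalid:
        raise ValueError(f"not dial-pad keys: {sorted(invalid)}")
    for _ in range(max_attempts):
        key = read_key()
        if key is not None and key in options:
            return options[key]
    return NO_MATCH


menu = {"1": "billing", "2": "support", "3": "sales"}

# Caller presses an unmapped key, then times out, then presses 2.
presses = iter(["7", None, "2"])
print(run_ivr_menu(menu, lambda: next(presses)))  # support
```

In a real flow the prompt would be replayed between attempts. The sketch leaves that out to keep the control flow visible.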
Audio Prompts
Message nodes in voice flows are spoken aloud via TTS. You can also upload pre-recorded audio files (WAV or MP3) for consistent branding or regulatory disclosures.
TIP
Use the {{sys_channel}} variable in condition nodes to branch logic for voice vs text. Keep voice prompts concise, because callers cannot scan long text the way chat users can.
Call Recording
Enable call recording in the voice channel settings to capture full call audio for quality assurance and compliance.
| Setting | Description |
|---|---|
| Record Calls | Toggle on/off per bot |
| Recording Format | WAV or MP3 |
| Storage | Recordings are stored in your configured Cloud Storage bucket |
| Retention | Set retention period (30, 60, 90, 365 days, or indefinite) |
WARNING
Call recording may require caller consent depending on your jurisdiction. Some jurisdictions require consent from every party on the call (often called two-party or all-party consent). Configure a consent prompt at the start of your voice flow to comply with local regulations.
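To reason about the retention setting, it can help to compute when a given recording would become eligible for deletion. The sketch below is illustrative only. The function name `recording_expiry` is hypothetical, `None` stands in for the "indefinite" option, and how OmniBots actually schedules deletion is not described here.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Retention periods listed in the settings table (days).
ALLOWED_RETENTION_DAYS = {30, 60, 90, 365}


def recording_expiry(
    recorded_at: datetime, retention_days: Optional[int]
) -> Optional[datetime]:
    """Return the expiry timestamp, or None for indefinite retention."""
    if retention_days is None:
        return None
    if retention_days not in ALLOWED_RETENTION_DAYS:
        raise ValueError(f"unsupported retention period: {retention_days}")
    return recorded_at + timedelta(days=retention_days)


recorded = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(recording_expiry(recorded, 30))    # 2024-01-31 00:00:00+00:00
print(recording_expiry(recorded, None))  # None
```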
Voice Authentication
Voice authentication allows callers to verify their identity using a voiceprint. Add a Voice Auth node to your flow to enable this feature:
- On first contact, the node prompts the caller to record three voice samples
- A voiceprint is generated from the samples using cosine similarity analysis
- On subsequent calls, the caller speaks a verification phrase and is matched against their stored voiceprint
- Anti-spoofing and liveness checks detect recorded or synthetic audio
- Failed verification attempts are logged and can trigger anomaly detection rules
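The enrollment and matching steps above mention cosine similarity. One common way to use it is to compare fixed-length embedding vectors, which the sketch below illustrates. Everything beyond the three-sample enrollment is an assumption: the embeddings are made-up toy vectors (a real system would derive them from audio with a speaker model), averaging the samples is one possible enrollment strategy, the 0.8 threshold is arbitrary, and anti-spoofing and liveness checks are not shown.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("vectors must be non-zero")
    return dot / (norm_a * norm_b)


def enroll(samples: Sequence[Sequence[float]]) -> List[float]:
    """Build a voiceprint by averaging sample embeddings (assumed strategy)."""
    if len(samples) < 3:
        raise ValueError("enrollment expects at least three samples")
    count = len(samples)
    return [sum(column) / count for column in zip(*samples)]


def verify(
    voiceprint: Sequence[float],
    candidate: Sequence[float],
    threshold: float = 0.8,  # arbitrary illustrative threshold
) -> bool:
    """Accept the caller if the candidate is close enough to the voiceprint."""
    return cosine_similarity(voiceprint, candidate) >= threshold


# Toy 2-dimensional embeddings standing in for real speaker embeddings.
voiceprint = enroll([[1.0, 0.0], [0.9, 0.1], [1.0, 0.1]])
print(verify(voiceprint, [1.0, 0.05]))  # same direction: accepted
print(verify(voiceprint, [0.0, 1.0]))   # nearly orthogonal: rejected
```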
Limitations
- Rich content (cards, carousels, forms) is converted to spoken descriptions with numbered options
- The maximum number of concurrent calls depends on your Telnyx connection capacity
- STT accuracy varies by language, accent, and background noise levels
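To make the first limitation concrete, the sketch below shows one way a list of choices from a card or carousel could be flattened into a spoken prompt with numbered options. The function name `speak_options` and the exact wording are assumptions. The phrasing OmniBots actually generates is not documented here.

```python
from typing import List


def speak_options(title: str, options: List[str]) -> str:
    """Flatten a titled list of choices into one spoken sentence sequence."""
    parts = [f"{title}."]
    for number, label in enumerate(options, start=1):
        parts.append(f"Option {number}: {label}.")
    return " ".join(parts)


print(speak_options("Choose a plan", ["Basic", "Pro"]))
# Choose a plan. Option 1: Basic. Option 2: Pro.
```

Numbering from 1 lines the spoken options up with dial-pad keys, so the same prompt could feed an IVR Menu node.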
