

Voice chat

Visitors can switch to a spoken conversation with your agent at any point. A single tap on the headphone chip starts a real-time voice call through your chosen voice provider (xAI Grok, OpenAI Realtime, or Google Gemini Live). Interruption, transcript capture, and graceful fallbacks to text are all handled for you.

Turn it on

  1. Open Settings → Voice in the dashboard.
  2. Pick your provider (xAI Grok, OpenAI Realtime, or Google Gemini Live) and how to authenticate. The simplest option is Reuse my text-chat key — if text chat already works with that provider, voice just works too. Otherwise paste a voice-enabled key (console.x.ai for xAI, platform.openai.com for OpenAI, aistudio.google.com for Gemini).
  3. Pick a silence timeout (default 10s) and max call length (default 15 min).
  4. Open Billing → Voice and pick a plan (see below). Metered plans come with minute bundles; the BYOK flat plan gives unlimited minutes for a $49/mo platform fee.
  5. In Customize & Deploy → Appearance, pick a voice UI layout and flip the Enable voice chat switch for each widget you want the chip to appear on. Voice stays hidden on widgets where the toggle is off.
  6. Click Try voice live on the same page to test the selected layout and voice end-to-end in an embedded preview before visitors ever see it.

You don’t need your own xAI key to try voice — Quincer AI ships a platform-default key that works out of the box on every plan. Pasting your own only matters if you want to bill xAI directly (BYOK) or use a region-specific key.
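
As a rough illustration of the choices made in the steps above, the sketch below models them as a small Python dataclass. The class name, field names, provider identifiers, and validation rules are hypothetical and not part of any Quincer AI API; only the provider list and the two defaults (10 s silence timeout, 15 min max call length) come from this page.

```python
from dataclasses import dataclass

# Providers named on this page; the short identifiers are illustrative only.
PROVIDERS = ("xai", "openai", "gemini")


@dataclass(frozen=True)
class VoiceSettings:
    """Hypothetical container for the Settings -> Voice choices."""

    provider: str = "xai"              # assumed default for this sketch
    reuse_text_chat_key: bool = True   # the "Reuse my text-chat key" option
    silence_timeout_s: int = 10        # dashboard default: 10 seconds
    max_call_minutes: int = 15         # dashboard default: 15 minutes

    def __post_init__(self) -> None:
        # Reject values the dashboard would not offer.
        if self.provider not in PROVIDERS:
            raise ValueError(f"unknown provider: {self.provider!r}")
        if self.silence_timeout_s <= 0 or self.max_call_minutes <= 0:
            raise ValueError("timeouts must be positive")


settings = VoiceSettings(provider="openai")
print(settings)
```

Keeping the defaults in one place like this makes it easy to see which values you changed from the dashboard defaults when reviewing a configuration.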

Pricing

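Two billing models: metered minute bundles, which can run on the platform-default key, and flat-fee BYOK, where you bring your own xAI API key and pay xAI directly for usage.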

Model | Minutes | Price | Best for
Voice 60 Monthly | 60/mo | $26/mo | Trying voice on a single widget
Voice 180 Monthly | 180/mo | $59/mo | Moderate use across a couple of widgets
Voice 960 Monthly | 960/mo | $249/mo | High-volume support / lead teams
Voice 60 / 180 / 960 Add-on | one-time | $29 / $69 / $299 | Top up a busy period without changing your plan
Voice BYOK (unlimited) | Unlimited | $49/mo | You pay xAI directly for usage; we charge a flat platform fee

Monthly plans are ~10–17% cheaper than the equivalent one-time add-on.
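
The savings range can be checked directly from the table prices. The short Python sketch below is only a worked calculation; the dictionary and function names are made up for illustration. It gives roughly 10.3%, 14.5%, and 16.7% for the 60, 180, and 960 minute tiers.

```python
# Prices in USD, copied from the pricing table above.
MONTHLY = {60: 26, 180: 59, 960: 249}
ADD_ON = {60: 29, 180: 69, 960: 299}


def monthly_saving_pct(minutes: int) -> float:
    """Percentage saved by the monthly plan versus the one-time add-on."""
    return (ADD_ON[minutes] - MONTHLY[minutes]) / ADD_ON[minutes] * 100


for minutes in MONTHLY:
    per_minute = MONTHLY[minutes] / minutes
    print(
        f"{minutes} min: {monthly_saving_pct(minutes):.1f}% cheaper, "
        f"${per_minute:.2f}/min on the monthly plan"
    )
```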

UI layouts

Voice is visually distinct from text, so Quincer AI offers four layouts for how the call is presented inside the widget. Pick one under Customize & Deploy → Appearance → Voice layout. A live preview shows exactly how your chosen layout looks before you save.

Layout | What it looks like | Best for
Hybrid (default) | A compact voice strip slides in above the chat transcript. Transcripts for both sides render inline in the message list as italic bubbles with a mic glyph, so visitors can scroll back through what was said. | Most widgets. Works on desktop and mobile; auto-promotes to takeover under 480px.
Takeover | The whole chat panel swaps to an orb-centric voice UI while the call is live. Text history is hidden until the call ends. | Maximum focus. Support flows where voice is the primary channel, or mobile-first deployments.
Inline | No dedicated strip or orb; transcripts simply appear in the normal message list while a mic-state dot in the header indicates listening / thinking / speaking. | Minimal UIs that want voice to feel like dictation rather than a “call”.
Strip | A slim always-visible control strip at the top of the chat when a call is active. Transcripts render in the message list like Inline, but with richer on-strip controls. | Teams that want the call state obvious without taking over the whole panel.

On screens narrower than 480px, Hybrid and Strip automatically promote to Takeover — there’s not enough room to show a transcript and voice controls side-by-side. The dashboard preview has a Desktop/Mobile toggle so you can see both states.
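
The promotion rule is simple enough to state as code. The function below is a minimal sketch of the behaviour described in the note, not the widget's actual implementation; the function name and lowercase layout identifiers are assumptions.

```python
MOBILE_BREAKPOINT_PX = 480             # threshold quoted in the note above
LAYOUTS = {"hybrid", "takeover", "inline", "strip"}
PROMOTED_ON_MOBILE = {"hybrid", "strip"}


def effective_layout(configured: str, viewport_width_px: int) -> str:
    """Return the layout a visitor would see at the given viewport width."""
    if configured not in LAYOUTS:
        raise ValueError(f"unknown layout: {configured!r}")
    # Hybrid and Strip need room for transcript and controls side by side.
    if configured in PROMOTED_ON_MOBILE and viewport_width_px < MOBILE_BREAKPOINT_PX:
        return "takeover"
    return configured


print(effective_layout("hybrid", 390))   # narrow phone viewport
print(effective_layout("hybrid", 1280))  # desktop viewport
print(effective_layout("inline", 390))   # Inline is never promoted
```

Under these assumptions, Inline and Takeover render the same way at every width, so only Hybrid and Strip need checking in both states of the Desktop/Mobile preview toggle.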

How it works

Live voice takeover

When an AI voice call needs a human, any licensed teammate can take it over mid-conversation. Visitor voice stays on Quincer AI’s relay while the AI is talking; the moment an agent clicks Take Over on the Live Conversations page, the widget bridges the visitor onto Cloudflare’s WebRTC SFU and the agent joins the same call. No app install for the visitor, no phone number, no third-party dial-in.

  1. On Live Conversations the active voice call shows a Voice call badge. Open the conversation.
  2. Click Take Over. If you have a live-voice seat (see below), you’re redirected to the call console; if not, you get a banner suggesting the visitor be asked to switch to text.
  3. The AI speaks a short hand-off line (“Stand by — Mo is joining the call now”) while your console auto-requests mic access.
  4. Within a few seconds, your console and the visitor’s widget are bridged directly via Cloudflare Realtime. The AI steps aside. You and the visitor hear each other in real time.
  5. Either side clicking Hang up (or the visitor clicking End call) ends the bridge and closes the call for the other party.

If the WebRTC bridge fails (for example, firewalled networks), the AI stays on the call and tells the visitor: “I’m getting an error transferring you — let me stay on and help out, or switch to text and my colleague will chat with you there.” The session continues; no dropped call.
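
The takeover steps and the fallback above reduce to two checks: whether the agent holds a live-voice seat, and whether the WebRTC bridge comes up. The sketch below captures that decision in Python. It is an illustrative model only; the enum, function, and parameter names are hypothetical and do not correspond to a Quincer AI or Cloudflare API.

```python
from enum import Enum


class Outcome(Enum):
    HUMAN_ON_CALL = "agent and visitor bridged"
    SUGGEST_TEXT = "banner: ask the visitor to switch to text"
    AI_CONTINUES = "bridge failed, AI stays on the call"


def take_over(has_voice_seat: bool, bridge_ok: bool) -> Outcome:
    """Model what happens when an agent clicks Take Over on a voice call."""
    if not has_voice_seat:
        # No live-voice seat: the agent only sees a banner.
        return Outcome.SUGGEST_TEXT
    if not bridge_ok:
        # For example a firewalled network: the session is not dropped.
        return Outcome.AI_CONTINUES
    return Outcome.HUMAN_ON_CALL


print(take_over(has_voice_seat=True, bridge_ok=True).value)
print(take_over(has_voice_seat=True, bridge_ok=False).value)
print(take_over(has_voice_seat=False, bridge_ok=True).value)
```

In every branch the visitor keeps an active conversation, which is the property the fallback is designed to preserve.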

Seats & licensing

Live voice takeover is a per-seat SKU at $19/user/month. Assign seats with the Team → Voice seat toggle for each teammate. Super-admins can also grant seats directly from Admin → Users for free while you’re setting up. Visitor voice minutes continue to bill from the org’s voice plan; the $19 seat covers only the human handover side.
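
Because seats and voice minutes are billed separately, a monthly estimate is the plan price plus $19 per paid seat. The sketch below works one example using prices from this page; the plan keys and function name are invented for illustration, and the estimate ignores one-time add-ons, free seats granted by super-admins, taxes, and any provider usage you pay directly on BYOK.

```python
SEAT_PRICE_USD = 19  # per user per month, from the paragraph above

# Monthly plan prices in USD from the Pricing section; keys are illustrative.
PLAN_PRICE_USD = {
    "voice-60": 26,
    "voice-180": 59,
    "voice-960": 249,
    "byok": 49,
}


def monthly_total(plan: str, paid_seats: int) -> int:
    """Estimate the monthly platform bill for one voice plan plus seats."""
    if paid_seats < 0:
        raise ValueError("paid_seats must be zero or more")
    return PLAN_PRICE_USD[plan] + paid_seats * SEAT_PRICE_USD


# Voice 180 Monthly with three live-voice seats: 59 + 3 * 19 = 116
print(monthly_total("voice-180", paid_seats=3))
```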

Voice providers

Quincer AI ships with three production voice providers — xAI Grok, OpenAI Realtime, and Google Gemini Live. Pick a default provider under Settings → Voice → Default voice provider; individual personas can select any voice from any enabled provider. All three run over the same relay pipeline, so everything else (layouts, billing, transcripts, knowledge grounding, integration tools) behaves identically.

Provider | Model | Good for
xAI Grok | grok-voice-think-fast-1.0 (default) · grok-voice-fast-1.0 | Fast, conversational, five curated voices. Grok Think Fast is the flagship: background reasoning, stronger tool-calling, and 20+ languages including Arabic (Egypt, Saudi Arabia, UAE), Bengali, Hindi, Korean, Turkish, Vietnamese, and more. Legacy Grok Fast is still selectable in Settings → Voice while xAI completes its migration.
OpenAI Realtime | gpt-realtime | Ten voices including the expressive marin and cedar. Strongest when you need nuanced delivery.
Google Gemini Live | gemini-3.1-flash-live | Eight voices (Aoede, Charon, Fenrir, Kore, Leda, Orus, Puck, Zephyr). 70+ languages out of the box, strong multilingual delivery.

Voices

Each persona picks one voice from a dropdown under Personas → [Persona] → Voice. A Preview button next to the dropdown plays a short sample so you can compare tone before saving. The dropdown groups voices by provider.

xAI Grok

Voice | Tone
Eve (xAI default) | Warm, balanced female voice. Good all-rounder.
Ara | Softer female, calmer cadence.
Leo | Friendly male, conversational.
Rex | Deeper male, more confident delivery.
Sal | Neutral, professional; works well for support.

OpenAI Realtime

Voice | Tone
alloy (OpenAI default) | Balanced, versatile. OpenAI’s neutral-tone pick.
marin (recommended) | Most natural delivery. Recommended for nuanced conversations.
cedar (recommended) | Most natural delivery. Slightly richer than marin.
ash | Expressive, natural male.
ballad | Warm, storytelling.
coral | Friendly, upbeat female.
echo | Clear, confident male.
sage | Calm, thoughtful.
shimmer | Bright, energetic female.
verse | Smooth, musical.

Google Gemini Live

Voice | Tone
Aoede (Gemini default) | Breezy, conversational.
Charon | Informative, measured.
Fenrir | Excitable, energetic.
Kore | Firm, assertive.
Leda | Youthful, bright.
Orus | Firm, deeper register.
Puck | Upbeat, playful.
Zephyr | Bright, clear.

Leaving the dropdown on Provider default falls back to your org’s default provider’s default voice (Eve for xAI, alloy for OpenAI, Aoede for Gemini).
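
The fallback can be read as a two-step lookup: use the persona's own voice if one is set, otherwise use the default voice of the org's default provider. The sketch below illustrates that rule. The function name and provider keys are assumptions, and `None` stands in for the Provider default dropdown entry; only the three default voice names come from this page.

```python
from typing import Optional

# Default voice per provider, as listed above; keys are illustrative.
PROVIDER_DEFAULT_VOICE = {
    "xai": "Eve",
    "openai": "alloy",
    "gemini": "Aoede",
}


def resolve_voice(persona_voice: Optional[str], org_default_provider: str) -> str:
    """Pick the voice a persona speaks with.

    persona_voice is None when the dropdown is left on "Provider default".
    """
    if persona_voice is not None:
        return persona_voice
    return PROVIDER_DEFAULT_VOICE[org_default_provider]


print(resolve_voice(None, "openai"))     # falls back to the OpenAI default
print(resolve_voice("Kore", "openai"))   # an explicit choice always wins
```

One consequence of this rule, if it holds as sketched, is that changing the org's default provider also changes the voice of every persona left on Provider default.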

Troubleshooting