Voice AI Leaders Dinner
An intimate dinner for founders and technical leaders building voice-powered products. Learn why 2026 is poised to be the year of speech-native models, and why they are key to building truly agentic voice AI experiences.
When
April 9, 2026
5:30 – 8:30 PM
Where
Prospect
300 Spear Street
San Francisco, CA 94105
Format
8–12 Guests
Curated · Off the record
What to Expect
- 5:30 — Arrival and networking
- 6:00 — The thesis: Why speech-native voice AI changes reliability and UX at scale. What the shift from the component model to speech-to-speech means for production systems.
- 6:15 — Interactive demo: Bring your edge cases. See how a speech-native approach changes the way you tackle the latency, interruption, and context challenges you're solving today.
- 6:35 — Open discussion: What's working in your voice stack. What's breaking. Requirements and blockers you're facing. No sales pressure — peer conversation.
- 7:45 — Next steps: For those interested, we'll walk through a pilot framework and book follow-up meetings for attendees who want more information.
The problem we're solving
Most voice AI stacks are stitched together — separate speech-to-text, LLM, and text-to-speech models chained in a pipeline. Every hop adds latency. At production scale, you're fighting 1000ms+ response times, lost conversational context, and a user experience that breaks the moment a human speaker goes off-script.
Ultravox takes a different approach. A single multimodal model that understands speech directly — no transcription step, no pipeline. The result is voice AI infrastructure that delivers natural conversational experiences, even at scale.
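To make the latency argument concrete, here is a minimal back-of-the-envelope sketch. Every per-hop figure is a hypothetical round number chosen only to illustrate how delays compound across a component pipeline; none is a measured benchmark from any specific stack.

```python
# Back-of-the-envelope latency budget: component pipeline vs. speech-native.
# All per-hop numbers are assumed placeholders for illustration only.

pipeline_hops_ms = {
    "speech_to_text": 300,  # assumed: streaming ASR endpointing + finalization
    "llm": 450,             # assumed: time to first token for a short reply
    "text_to_speech": 300,  # assumed: first audio chunk synthesized
}

speech_native_ms = 700      # assumed: single model, audio in -> audio out

pipeline_total = sum(pipeline_hops_ms.values())
print(f"component pipeline: ~{pipeline_total} ms")   # ~1050 ms: the 1000ms+ regime
print(f"speech-native:      ~{speech_native_ms} ms")
```

Whatever the exact numbers, the structural point holds: a pipeline's response-time floor is the sum of its hops, while a speech-native model has only one hop to optimize.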
- <800ms end-to-end latency
- Native speech understanding
- 1 platform, not a pipeline