Gemini 3.1 Flash Live
audio • Google • Released on 2026-03-26
Google's highest-quality audio and voice model for real-time dialogue. Released March 26, 2026. Delivers natural rhythm and low latency for voice-first AI applications. Supports 70+ languages with SynthID audio watermarking.
77
Overall Score
Voice of the community
Sample: 20
“Gemini 3.1 Flash TTS achieved an Elo score of 1,211 on Artificial Analysis TTS leaderboard, positioned in the 'most attractive quadrant' for quality + cost.”
Core Specs
0K
Context Window
0K
Max Output
Reasoning • Open Source • audio • voice • text
Scenario Scores
Pros & Cons
Sentiment: 70% positive, 25% neutral, 5% negative
Pros
- Real-time voice dialogue
- Natural rhythm and intonation
- Low latency audio processing
- SynthID audio watermarking
- Available in 200+ countries via Gemini Live
Cons
- Preview only; API pricing TBA
- Specialized for audio/voice (not text)
- Limited to Google ecosystem for full features
Pricing
Input (per 1M tokens): $0.00
Output (per 1M tokens): $0.00
Free trial available
Updated on 2026-04-21
Get Started
1. Visit the provider's website
2. Create an account
3. Start using the model
Benchmarks
ComplexFuncBench-Audio: 90.8%
AudioMultiChallenge: 36.1%
Note: ComplexFuncBench-Audio measures multi-step function calling with constraints. AudioMultiChallenge is a Scale AI benchmark, run with thinking enabled.
Reliability
Incidents (30d): 0