Google

Gemini 3.1 Flash Live

audio

Google, released on 2026-03-26

Google's highest-quality audio and voice model for real-time dialogue. Released March 26, 2026. Delivers natural rhythm and low latency for voice-first AI applications. Supports 70+ languages with SynthID audio watermarking.

Overall Score: 77

Voice of the community

Sample size: 20

Gemini 3.1 Flash TTS achieved an Elo score of 1,211 on the Artificial Analysis TTS leaderboard, placing it in the 'most attractive quadrant' for quality and cost.

Artificial Analysis, 2026-04-15

Core Specs

Context Window: 0K
Max Output: 0K
Reasoning · Open Source · audio · voice · text

Pros & Cons

Sentiment: 70% positive, 25% neutral, 5% negative

Pros

  • Real-time voice dialogue
  • Natural rhythm and intonation
  • Low latency audio processing
  • SynthID audio watermarking
  • Available in 200+ countries via Gemini Live

Cons

  • Preview only; API pricing TBA
  • Specialized for audio/voice (not text)
  • Limited to Google ecosystem for full features

Pricing

Input (per 1M tokens): $0.00
Output (per 1M tokens): $0.00
Free trial available
Updated on 2026-04-21
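
Since rates are quoted per 1M tokens, a per-request estimate is the token count divided by one million, multiplied by the rate. The sketch below illustrates that arithmetic only; the function name `estimate_cost` and the token counts are hypothetical, and the $0.00 rates are the preview figures listed above, not final API pricing.

```python
from decimal import Decimal

# Listed rates are quoted per 1M tokens.
TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int,
                  input_rate: Decimal, output_rate: Decimal) -> Decimal:
    """Estimate the USD cost of one request from per-1M-token rates.

    Hypothetical helper for illustration; not part of any Google SDK.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (Decimal(input_tokens) * input_rate
             + Decimal(output_tokens) * output_rate)
    return total / TOKENS_PER_UNIT


# Rates as listed during the preview (assumed to change once pricing is announced).
LISTED_INPUT_RATE = Decimal("0.00")
LISTED_OUTPUT_RATE = Decimal("0.00")

# Example token counts are made up for illustration.
preview_cost = estimate_cost(120_000, 30_000,
                             LISTED_INPUT_RATE, LISTED_OUTPUT_RATE)
print(f"Estimated cost at listed rates: ${preview_cost:.2f}")
```

Using `Decimal` rather than floats avoids rounding drift when many small per-request amounts are summed.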

Get Started

1. Visit the provider's website
2. Create an account
3. Start using the model

Benchmarks

ComplexFuncBench-Audio: 90.8%
AudioMultiChallenge: 36.1%
Note: ComplexFuncBench-Audio measures multi-step function calling with constraints. AudioMultiChallenge is a Scale AI benchmark, run here with thinking enabled.

Reliability

Incidents (30d): 0