DeepSeek V4
Open SourceDeepSeek•Released on 2026-04-24
DeepSeek V4 (released 2026-04-24) ships two MIT-licensed MoE variants: V4-Pro (1.6T/49B active) and V4-Flash (284B/13B active), both with 1M-token context and hybrid Compressed Sparse Attention + Heavily Compressed Attention. Three reasoning modes (Non-think / Think High / Think Max). V4-Pro uses only 27% of V3.2's FLOPs and 10% of its KV cache at 1M context. Priced well below GPT-5.5 / Opus 4.7 while matching them on most benchmarks.
85
Overall Score
Voice of the community
“The pricing is the most notable aspect — these represent a very, very inexpensive model compared to frontier alternatives.”
“V4-Pro sets new SOTA for open models on SimpleQA-Verified at 57.9%, a 20-point jump over the best previous open model.”
Core Specs
1000K
Context Window
66K
Max Output
ReasoningOpen Sourcetext
Scenario Scores
Pros & Cons
Sentiment75% +20% ·5% −
Pros
- +1M token context window with aggressive KV-cache compression
- +MIT license — fully open-source, self-hostable
- +V4-Pro $1.74/$3.48 per MTok — far cheaper than GPT-5.5 and Opus 4.7
- +New SOTA for open models on SimpleQA-Verified (57.9)
- +OpenAI + Anthropic API-compatible endpoints
- +Three reasoning modes tunable per request
Cons
- −Still trails GPT-5.4 / Gemini 3.1 Pro by 3-6 months on frontier benchmarks
- −Servers in China (overseas latency, geopolitical concerns)
- −Text-only — V3's multimodal (image/video) capability not confirmed for V4
- −V4-Pro self-hosting needs substantial hardware (49B active × FP4/FP8)
Pricing
Input (per 1M tokens)$1.74
Output (per 1M tokens)$3.48
Free trial available
Updated on 2026-04-24
Get Started
1Visit the provider's website
2Create an account
3Start using the model
Benchmarks
sweBench80.6%
mmlu87.5%
gpqaDiamond90.1%
humanEval76.8%
math64.5%
simpleQA57.9%
hle48.2%
gsm8k92.6%
longBenchV251.5%
noteV4-Pro scores. SimpleQA-Verified 57.9 is a new SOTA for open models (+20pt over prior best). Simon Willison: 'falls marginally short of GPT-5.4 and Gemini-3.1-Pro, trailing frontier by 3-6 months.'%
Reliability
SLA99.0%
Incidents (30d)1
Last Incident2026-03-30
Self-hosting available. API servers in China.
View Status Page →