Nemotron 3 Ultra
Open Source • NVIDIA • Released on 2026-06-04
NVIDIA's largest open-weights frontier model, with 550B total parameters and 55B active (MoE) on a hybrid Mamba-Transformer architecture. It was announced at Computex 2026 and released on June 4, 2026. It supports a 1M-token context and leads US open-weights models on the Artificial Analysis Intelligence Index with a score of 48, though it trails Chinese frontier models such as Kimi K2.6 (54). It is optimized for high-throughput reasoning and agent orchestration, serving 300+ tokens/sec.
Overall Score: 82
Core Specs
Context Window: 1000K tokens
Max Output: 40K tokens
Tags: Reasoning · Open Source · text
Pros & Cons
Pros
- Highest-scoring US open-weights model on the AA Intelligence Index (48)
- 1M-token context window for long agentic workflows
- Very high throughput (300+ tokens/sec, several times faster than GLM-5.1 / Kimi K2.6)
- Open weights under the permissive NVIDIA Open Model License
- Competitive open pricing ($0.50 / $2.50 per MTok)
Cons
- Trails Chinese frontier open models (e.g. Kimi K2.6 at 54)
- Text-only (no multimodal support)
- 550B weights demand high-end hardware to self-host
- New release, limited independent community feedback
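To put the self-hosting con in rough numbers, the sketch below estimates the memory needed for the weights alone at a few common precisions. The precision names and bytes-per-parameter values are illustrative assumptions, not published release formats for this model, and the estimate ignores KV cache, activations, and runtime overhead. It uses the total parameter count because, in a typical MoE deployment, all expert weights stay resident even though only 55B parameters are active per token.

```python
# Rough weight-memory estimate for a 550B-parameter model.
# Assumption: the precisions below are illustrative, not confirmed
# release formats for Nemotron 3 Ultra.
TOTAL_PARAMS = 550e9

BYTES_PER_PARAM = {
    "bf16": 2.0,  # 16-bit weights
    "fp8": 1.0,   # 8-bit weights
    "int4": 0.5,  # 4-bit quantized weights
}


def weight_memory_gb(total_params: float, bytes_per_param: float) -> float:
    """Return memory for the weights only, in decimal gigabytes."""
    return total_params * bytes_per_param / 1e9


for precision, size in BYTES_PER_PARAM.items():
    print(f"{precision}: {weight_memory_gb(TOTAL_PARAMS, size):,.0f} GB")
```

Under these assumptions, 16-bit weights alone come to roughly 1,100 GB, which is why a multi-GPU node is the realistic starting point for self-hosting.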
Pricing
Input (per 1M tokens): $0.50
Output (per 1M tokens): $2.50
Free trial available
Updated on 2026-06-05
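As a quick illustration of how the listed rates translate into per-request cost, here is a minimal sketch. The function name and the example token counts are hypothetical; the rates are the ones listed above, and actual billing may differ by hosting provider.

```python
# Listed rates in USD per 1M tokens.
INPUT_USD_PER_MTOK = 0.50
OUTPUT_USD_PER_MTOK = 2.50


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request at the listed per-MTok rates."""
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


# Hypothetical worst case: a full 1M-token prompt plus the 40K max output.
cost = estimate_cost_usd(1_000_000, 40_000)
print(f"${cost:.2f}")  # prints $0.60
```

Input tokens dominate in this example: the full-context prompt accounts for $0.50 of the $0.60 total.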
Get Started
1. Visit the provider's website
2. Create an account
3. Start using the model
Benchmarks
Artificial Analysis Intelligence Index: 48
Reliability
Incidents (30d): 0
Open-source; reliability depends on the hosting provider