
Nemotron 3 Ultra

Open Source

NVIDIA
Released on 2026-06-04

NVIDIA's largest open-weights frontier model: a mixture-of-experts (MoE) design with 550B total parameters, 55B of them active, built on a hybrid Mamba-Transformer architecture. It was announced at Computex 2026 and released on June 4, 2026. It supports a 1M-token context window and leads US open-weights models on the Artificial Analysis Intelligence Index with a score of 48, though it trails Chinese frontier models such as Kimi K2.6 (54). It is optimized for high-throughput reasoning and agent orchestration, serving more than 300 tokens/sec.

82
Overall Score

Core Specs

1000K
Context Window
40K
Max Output
Reasoning · Open Source · Text

Pros & Cons


Pros

  • Highest-scoring US open-weights model on the Artificial Analysis (AA) Intelligence Index (48)
  • 1M-token context window for long agentic workflows
  • Very high throughput (300+ tokens/sec, several times faster than GLM-5.1 and Kimi K2.6)
  • Open weights under the permissive NVIDIA Open Model License
  • Competitive pricing for an open model ($0.50 input / $2.50 output per MTok)

Cons

  • Trails Chinese frontier open models on the AA Intelligence Index (e.g. Kimi K2.6 at 54)
  • Text-only (no multimodal support)
  • 550B weights demand high-end hardware to self-host
  • New release, limited independent community feedback
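
To put the self-hosting point in perspective, here is a rough sketch of how much memory the 550B weights alone would occupy at a few storage precisions. The precisions (16, 8, and 4 bits per parameter) are illustrative assumptions, not formats this listing confirms are available, and the estimate ignores KV cache, recurrent state, activations, and runtime overhead.

```python
# Rough weight-memory estimate for self-hosting.
# TOTAL_PARAMS comes from the listing; the precisions below are assumptions.
TOTAL_PARAMS = 550e9  # 550B total parameters (all experts must be resident)


def weight_memory_gb(params: float, bits_per_param: int) -> float:
    """Return the size of the weights in decimal gigabytes (1 GB = 1e9 bytes)."""
    if params < 0 or bits_per_param <= 0:
        raise ValueError("params must be >= 0 and bits_per_param must be > 0")
    return params * bits_per_param / 8 / 1e9


if __name__ == "__main__":
    for bits in (16, 8, 4):  # hypothetical storage precisions
        print(f"{bits:>2}-bit weights: ~{weight_memory_gb(TOTAL_PARAMS, bits):,.0f} GB")
```

Although only 55B parameters are active at a time, the estimate uses the full 550B because an MoE model generally needs every expert loaded to serve arbitrary inputs.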

Pricing

Input (per 1M tokens): $0.50
Output (per 1M tokens): $2.50
Free trial available
Updated on 2026-06-05
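
As a worked example of the listed rates, the following minimal sketch estimates the cost of a single request. The function name and the token counts are made up for illustration; real bills may differ if a hosting provider applies its own rates, caching discounts, or minimum charges.

```python
# Per-request cost estimate from the listed rates.
# The rates are taken from the pricing above; the example token counts are hypothetical.
INPUT_USD_PER_MTOK = 0.50
OUTPUT_USD_PER_MTOK = 2.50


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one request."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


if __name__ == "__main__":
    # A long-context request: 200K tokens in, 4K tokens out.
    print(f"${request_cost_usd(200_000, 4_000):.2f}")
    # Upper bound for one request: full 1M context in, 40K max output.
    print(f"${request_cost_usd(1_000_000, 40_000):.2f}")
```

At these rates, output tokens cost five times as much as input tokens, so long generations dominate the bill faster than long prompts do.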

Get Started

1. Visit the provider's website
2. Create an account
3. Start using the model

Benchmarks

Artificial Analysis Intelligence Index: 48

Reliability

Incidents (30d): 0
Open source; reliability depends on the hosting provider