MiniMax M3
Frontier • MiniMax • Released on 2026-05-31
MiniMax's next-generation multimodal foundation model, succeeding M2.7. It accepts text, image, and video inputs, produces text output, and has a 1M-token context window. It is built for long-horizon agentic work, coding, and long-context reasoning. It introduces 'MiniMax Sparse Attention' (MSA), for which MiniMax reports 9.7x faster prefill and 15.6x faster decoding at 1M tokens versus M2.7. Priced at $0.30 per 1M input tokens and $1.20 per 1M output tokens. As of launch, no independent third-party benchmark results are available.
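To make the pricing concrete, here is a minimal cost-estimation sketch using the per-token prices listed on this page ($0.30 per 1M input, $1.20 per 1M output, and the $0.06 per 1M cached-input read rate listed under Pros). The function name `estimate_cost_usd` and the billing model are assumptions for illustration: it assumes cached input tokens are billed at the cached-read rate instead of the regular input rate, with no cache-write fee and no long-context pricing tiers. Actual billing rules may differ.

```python
from decimal import Decimal

# Prices in USD per 1M tokens, as listed on this page.
INPUT_PER_MTOK = Decimal("0.30")
OUTPUT_PER_MTOK = Decimal("1.20")
CACHED_READ_PER_MTOK = Decimal("0.06")
TOKENS_PER_MTOK = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> Decimal:
    """Estimate the cost of one request (hypothetical helper, not an official API).

    cached_tokens is the portion of input_tokens served from the prompt cache.
    Assumption: cached tokens are billed at the cached-read rate only.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")

    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * INPUT_PER_MTOK
        + cached_tokens * CACHED_READ_PER_MTOK
        + output_tokens * OUTPUT_PER_MTOK
    )
    return total / TOKENS_PER_MTOK


if __name__ == "__main__":
    # A full 1M-token prompt with a 2,000-token answer, nothing cached.
    print(f"{estimate_cost_usd(1_000_000, 2_000):.4f}")
    # The same request with 900,000 of the input tokens read from cache.
    print(f"{estimate_cost_usd(1_000_000, 2_000, cached_tokens=900_000):.4f}")
```

Under these assumptions, filling the whole context window costs about $0.30 per request, and caching 90% of the prompt brings the total below $0.09, so cache hit rate matters more than output length for long-context workloads.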
Voice of the community
“MiniMax teases upcoming M3 model with a new sparse attention mechanism and a 15.6x long-context response speed boost.”
“M3 introduces 'MiniMax Sparse Attention' (MSA) — reintroducing sparse attention, an architecture MiniMax explicitly moved away from in its M2 generation.”
Pros & Cons
Pros
- 1M-token context window with native multimodal (text + image + video) input
- MiniMax-reported 15.6x faster decoding and 9.7x faster prefill at 1M tokens vs M2.7 (MiniMax Sparse Attention)
- Low pricing: $0.30 per 1M input tokens, $1.20 per 1M output tokens, and $0.06 per 1M tokens for cached-input reads
- Built for long-horizon agentic and coding workflows
Cons
- No independent third-party benchmarks at launch; speedup figures are MiniMax-supplied only
- Proprietary model (weights not open source)
- Documentation and community are mainly Chinese-centric
- Smaller Western ecosystem than Claude/GPT/Gemini