Nvidia's New Open-Weights Nemotron 3 Super Combines Three Architectures to Beat gpt-oss and Qwen in Throughput
Nemotron 3 Super boosts throughput by up to 2.2x over competitors with a triple hybrid architecture optimized for Nvidia Blackwell GPUs to reduce enterprise AI costs.
- On Wednesday, Nvidia launched Nemotron 3 Super, a 120-billion-parameter model with a 1-million-token context window, publishing weights on Hugging Face and listing it on build.nvidia.com, Perplexity, and OpenRouter.
- Nvidia's Kari Briski says companies face "context explosion," which prompted the new model; Nvidia frames it as reducing the enterprise "thinking tax" by merging architectures for agentic workflows.
- Technically, the model uses a hybrid Mamba-Transformer backbone that interleaves Mamba-2 layers with Transformer attention, and was pretrained in NVFP4 precision for Blackwell GPUs.
- Availability spans Nvidia's site, Hugging Face, and major cloud platforms, with enterprises like CodeRabbit and Greptile integrating Nemotron 3 Super as an Nvidia NIM microservice on-prem and across clouds including Google Cloud and Oracle, with AWS and Azure coming shortly.
- Benchmarks report up to 2.2x higher throughput versus gpt-oss-120B and 7.5x versus Qwen3.5-122B, with Artificial Analysis measuring 478 output tokens per second, while Nvidia says the Nvidia Open Model License permits commercial use with termination triggers.
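The hybrid backbone described above interleaves linear-cost Mamba-2 layers with periodic Transformer attention layers. The sketch below illustrates that interleaving pattern in the abstract; the layer names, interleave ratio, and depth are illustrative assumptions, not Nvidia's actual Nemotron 3 Super configuration.

```python
# Illustrative sketch of a hybrid layer schedule: mostly Mamba-2-style
# state-space layers, with a full attention layer inserted periodically.
# All numbers here are hypothetical, not the published architecture.

def build_layer_schedule(n_layers: int, attention_every: int = 4) -> list:
    """Return a layer-type schedule with an 'attention' layer every
    `attention_every` layers and 'mamba2' layers everywhere else."""
    schedule = []
    for i in range(n_layers):
        if (i + 1) % attention_every == 0:
            schedule.append("attention")  # quadratic-cost global mixing
        else:
            schedule.append("mamba2")     # linear-cost sequence mixing
    return schedule

schedule = build_layer_schedule(12, attention_every=4)
print(schedule)
```

The design intuition for such hybrids is that state-space layers keep long-context cost close to linear, which is why the reported throughput gains grow with context length, while the sparse attention layers preserve global token interactions.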
11 Articles
Multi-agent systems, designed to handle long-horizon tasks like software engineering or cybersecurity triage, can generate up to 15 times the token volume of standard chats, threatening their cost-effectiveness on enterprise tasks. But today, Nvidia sought to help solve this problem with the release of Nemotron 3 Super, a 120-billion-parameter hybrid model, with weights posted on Hugging Face. By merging disparate architectural philos…
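The 15x token-volume figure above maps directly onto inference spend, which is why throughput is the headline metric in this launch. A back-of-the-envelope sketch, where the per-token price and token counts are hypothetical placeholders rather than any provider's actual rates:

```python
# Hypothetical cost comparison: a single chat turn vs. a multi-agent
# long-horizon task generating 15x the tokens (per the article).
# The price and token counts are illustrative assumptions only.
PRICE_PER_1M_OUTPUT_TOKENS = 5.00   # USD, placeholder rate

chat_tokens = 2_000                  # assumed single-turn response size
agent_tokens = chat_tokens * 15      # 15x token volume for agentic work

chat_cost = chat_tokens / 1_000_000 * PRICE_PER_1M_OUTPUT_TOKENS
agent_cost = agent_tokens / 1_000_000 * PRICE_PER_1M_OUTPUT_TOKENS
print(f"chat: ${chat_cost:.4f}, agent task: ${agent_cost:.4f}")
```

At fixed hardware, a 2.2x throughput gain lowers the per-token serving cost roughly proportionally, so the same agentic workload becomes correspondingly cheaper to run.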
Nvidia launches Nemotron 3 Super, a 120B open model for large-scale AI systems
Ahead of its flagship GTC conference next week, Nvidia on Wednesday launched the second model in its open-weight Nemotron 3 family: Nemotron 3 Super, a 120-billion-parameter model with a 1-million-token context window, tuned for speed and efficiency. Last December, Nvidia debuted Nemotron 3 Nano, a smaller 30-billion-parameter model that used many of the same optimization techniques as the larger Super model. But where that smaller model was esp…
NVIDIA Nemotron 3 Super Launch Targets Enterprise AI Agent Market
The post NVIDIA Nemotron 3 Super Launch Targets Enterprise AI Agent Market appeared on BitcoinEthereumNews.com. Zach Anderson, Mar 11, 2026, 22:27. NVIDIA releases 120B-parameter Nemotron 3 Super with 5x throughput gains for agentic AI; major enterprises including Siemens and Palantir are already deploying it. NVIDIA dropped its Nemotron 3 Super model on March 11, 2026, a 120-billion-parameter open-source AI system that claims 5x higher throughput than …
Nvidia Nemotron: Much needed open-source model champion in US
Larry Dignan, Thu, 12 Mar 2026, 03:36. Larry Dignan is Editor in Chief of Constellation Insights at Constellation Research, where he leads editorial coverage focused on enterprise technology, digital transformation, and emerging trends shaping the future of business. He oversees research-driven news, analysi…
NVIDIA Model Addresses Context And Cost Challenges In Autonomous Agents
NVIDIA Nemotron 3 Super is a 120-billion-parameter model designed to address the increasing costs and context limitations impacting the development of complex, autonomous agent systems. The model aims to improve efficiency and accuracy for agents facing challenges with long workflows and maintaining alignment with original objectives.
Coverage Details
Bias Distribution
- 100% of the sources are Center