Nvidia's New Open-Weights Nemotron 3 Super Combines Three Architectures to Beat gpt-oss and Qwen in Throughput
Nemotron 3 Super boosts throughput by up to 2.2x over competitors with a triple hybrid architecture optimized for Nvidia Blackwell GPUs to reduce enterprise AI costs.
- On Wednesday, Nvidia launched Nemotron 3 Super, a 120-billion-parameter model with a 1-million-token context window, publishing weights on Hugging Face and listing it on build.nvidia.com, Perplexity, and OpenRouter.
- Nvidia's Kari Briski says companies face "context explosion," which prompted the new model; Nvidia frames it as reducing the enterprise "thinking tax" by merging architectures for agentic workflows.
- Technically, the model uses a hybrid Mamba-Transformer backbone that interleaves Mamba-2 layers with Transformer attention, and was pretrained in NVFP4 precision for Blackwell GPUs.
- Availability spans Nvidia's site, Hugging Face, and major cloud platforms, with enterprises like CodeRabbit and Greptile integrating Nemotron 3 Super as an Nvidia NIM microservice on-prem and across clouds including Google Cloud and Oracle, with AWS and Azure coming shortly.
- Benchmarks report up to 2.2x higher throughput versus gpt-oss-120B and 7.5x versus Qwen3.5-122B, with Artificial Analysis measuring 478 output tokens per second, while Nvidia says the Nvidia Open Model License permits commercial use with termination triggers.
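The hybrid backbone described above interleaves linear-cost Mamba-2 layers with periodic Transformer attention layers. The sketch below illustrates that interleaving pattern in the abstract; the layer names, interleave ratio, and depth are illustrative assumptions, not Nvidia's actual Nemotron 3 Super configuration.

```python
# Illustrative sketch of a hybrid layer schedule: mostly Mamba-2-style
# state-space layers, with a full attention layer inserted periodically.
# All numbers here are hypothetical, not the published architecture.

def build_layer_schedule(n_layers: int, attention_every: int = 4) -> list:
    """Return a layer-type schedule with an 'attention' layer every
    `attention_every` layers and 'mamba2' layers everywhere else."""
    schedule = []
    for i in range(n_layers):
        if (i + 1) % attention_every == 0:
            schedule.append("attention")  # quadratic-cost global mixing
        else:
            schedule.append("mamba2")     # linear-cost sequence mixing
    return schedule

schedule = build_layer_schedule(12, attention_every=4)
print(schedule)
```

The design intuition for such hybrids is that state-space layers keep long-context cost close to linear, which is why the reported throughput gains grow with context length, while the sparse attention layers preserve global token interactions.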
11 Articles
Multi-agent systems, designed to handle long-horizon tasks like software engineering or cybersecurity triage, can generate up to 15 times the token volume of standard chats, threatening their cost-effectiveness on enterprise tasks. But today, Nvidia sought to help solve this problem with the release of Nemotron 3 Super, a 120-billion-parameter hybrid model, with weights posted on Hugging Face. By merging disparate architectural philos…
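The 15x token-volume figure above maps directly onto inference spend, which is why throughput is the headline metric in this launch. A back-of-the-envelope sketch, where the per-token price and token counts are hypothetical placeholders rather than any provider's actual rates:

```python
# Hypothetical cost comparison: a single chat turn vs. a multi-agent
# long-horizon task generating 15x the tokens (per the article).
# The price and token counts are illustrative assumptions only.
PRICE_PER_1M_OUTPUT_TOKENS = 5.00   # USD, placeholder rate

chat_tokens = 2_000                  # assumed single-turn response size
agent_tokens = chat_tokens * 15      # 15x token volume for agentic work

chat_cost = chat_tokens / 1_000_000 * PRICE_PER_1M_OUTPUT_TOKENS
agent_cost = agent_tokens / 1_000_000 * PRICE_PER_1M_OUTPUT_TOKENS
print(f"chat: ${chat_cost:.4f}, agent task: ${agent_cost:.4f}")
```

At fixed hardware, a 2.2x throughput gain lowers the per-token serving cost roughly proportionally, so the same agentic workload becomes correspondingly cheaper to run.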
Nvidia launches Nemotron 3 Super, a 120B open model for large-scale AI systems
Ahead of its flagship GTC conference next week, Nvidia on Wednesday launched the second model in its open-weight Nemotron 3 family: Nemotron 3 Super, a 120-billion-parameter model with a 1-million-token context window, tuned for speed and efficiency. Last December, Nvidia debuted Nemotron 3 Nano, a smaller 30-billion-parameter model that used many of the same optimization techniques as the larger Super model. But where that smaller model was esp…
NVIDIA Nemotron 3 Super Launch Targets Enterprise AI Agent Market
The post NVIDIA Nemotron 3 Super Launch Targets Enterprise AI Agent Market appeared on BitcoinEthereumNews.com. Zach Anderson, Mar 11, 2026, 22:27. NVIDIA releases 120B-parameter Nemotron 3 Super with 5x throughput gains for agentic AI; major enterprises including Siemens and Palantir are already deploying it. NVIDIA dropped its Nemotron 3 Super model on March 11, 2026, a 120-billion-parameter open-source AI system that claims 5x higher throughput than …
Nvidia Nemotron: Much needed open-source model champion in US
Larry Dignan, Thu, 12 Mar 2026, 03:36. Larry Dignan is Editor in Chief of Constellation Insights at Constellation Research, where he leads editorial coverage focused on enterprise technology, digital transformation, and emerging trends shaping the future of business. He oversees research-driven news, analysi…
NVIDIA Model Addresses Context And Cost Challenges In Autonomous Agents
NVIDIA Nemotron 3 Super is a 120-billion-parameter model designed to address the increasing costs and context limitations impacting the development of complex, autonomous agent systems. The model aims to improve efficiency and accuracy for agents facing challenges with long workflows and maintaining alignment with original objectives.
Coverage Details
Bias Distribution
- 100% of the sources are Center