Published 6 days ago • Updated 5 days ago
MOREH Demonstrates Production-Ready LLM Inference on Tenstorrent Galaxy, Achieving DGX A100-Class Performance with Improved Cost Efficiency
Moreh said its MoAI Inference Framework matched or surpassed NVIDIA DGX A100-class performance while cutting infrastructure costs with Tenstorrent processors.
On Friday, May 1, 2026, AI infrastructure company Moreh, led by CEO Gangwon Jo, announced validated LLM inference performance on the Tenstorrent Galaxy Wormhole system using its proprietary MoAI Inference Framework at the TT-Deploy launch event in San Francisco.
The MoAI Inference Framework enables unified operation of heterogeneous GPUs and NPUs—including NVIDIA, AMD, and Tenstorrent chips—within a single cluster, allowing enterprises to build flexible AI infrastructure strategies without vendor lock-in.
Tests across leading Mixture-of-Experts models—including GPT-OSS, Qwen, GLM, and DeepSeek—showed the system matches or surpasses NVIDIA DGX A100-class performance, demonstrating a viable alternative to conventional GPU-centric infrastructure.
By deploying Tenstorrent processors as dedicated prefill accelerators, the company reduced its reliance on high-cost HBM, improving overall cost efficiency while maintaining production-grade stability for real-world data centers.
"Achieving production-grade LLM inference performance and stability on Tenstorrent-based systems marks a significant milestone," Jo stated, adding that the company intends to pursue deeper optimization across heterogeneous architectures and closer integration with Tenstorrent NPUs.