Unbiased News Awaits.
Published loading...Updated

Meta unleashes Llama API running 18x faster than OpenAI: Cerebras partnership delivers 2,600 tokens per second

  • Meta unveiled the Llama API at its inaugural LlamaCon developer conference on April 29, 2025, in Menlo Park, California.
  • Meta developed the API to offer faster, cost-efficient AI model inference amid growing competition from OpenAI, Google, and emerging rivals like DeepSeek.
  • The Llama API runs on partner Cerebras’ specialized hardware, delivering speeds up to 2,648 tokens per second, approximately 18 times faster than OpenAI’s ChatGPT.
  • Meta emphasized its open-model approach by allowing customers to transfer custom models and pledging not to use customer data for training its own models.
  • This launch marks Meta's shift from solely providing open models to selling AI services, aiming to create new revenue streams and compete in a fast-growing AI inference market.
Insights by Ground AI
Does this summary seem wrong?

20 Articles

All
Left
1
Center
3
Right
Think freely.Subscribe and get full access to Ground NewsSubscriptions start at $9.99/yearSubscribe

Bias Distribution

  • 75% of the sources are Center
75% Center
Factuality

To view factuality data please Upgrade to Premium

Ownership

To view ownership data please Upgrade to Vantage

TechCrunch broke the news in United States on Tuesday, April 29, 2025.
Sources are mostly out of (0)