Published 2 days ago • loading... • Updated 24 hours ago

Google Reveals Dev-Focused Gemini 3.1 Flash Lite, Promises 'Best-in-Class Intelligence for Your Highest-Volume Workloads'

On Tuesday, Google launched Gemini 3.1 Flash-Lite, two weeks after Gemini 3.1 Pro, making it available in preview via the Gemini API in Google AI Studio and Vertex AI.
Designed for high-volume developer workloads, Flash-Lite serves low-latency tasks like email summarizing and data extraction with adjustable 'Thinking' levels for accuracy trade-offs.
Benchmarking shows Gemini 3.1 Flash-Lite can produce up to 363 tokens per second and supports a 1-million token context window for multimodal vision and video processing.
Google set pricing at $0.25 per 1 million input tokens and $1.50 per 1 million output tokens, making Flash-Lite 12x–16x cheaper for high-context workloads.
In a departure from prior launches, Google surprised some observers by releasing a Flash-Lite variant first instead of a more capable flagship, and despite being three tokens slower than Gemini 2.5 Flash-Lite, Flash-Lite still outpaces competitors as Google withheld agent benchmarks.

Insights by Ground AI

Podcasts & Opinions

Podcast Mention

Center

Techmeme Ride Home

Daily Tech and News podcast

Center

Daily Tech and News podcast

Anyone Want To Give Me A Betting Market Tip?

Techmeme Ride Home discuss Google’s Gemini 3.1 Flash‑Lite launch, pricing, and developer preview rollout

2 days ago

Listen to Full Episode Full Episode Unlock Timestamp

Get Vantage — Podcasts, Ratings, Timestamps

Podcasts & Opinions

21 Articles

Tech Radar

Center

Google reveals dev-focused Gemini 3.1 Flash Lite, promises 'best-in-class intelligence for your highest-volume workloads'

Gemini 3.1 Flash Lite offers outstanding cost efficiency, beating rival models in more than half of these benchmarks.

1 day ago·United Kingdom

Read Full Article

VentureBeat

Center

Google releases Gemini 3.1 Flash Lite at 1/8th the cost of Pro

Google's newest AI model is here: Gemini 3.1 Flash-Lite, and the biggest improvements this time around come in cost and speed, especially for enterprises and developers seeking to leverage powerful reasoning and multimodal capabilities from the U.S. search and cloud giant.Positioning it as the most cost-efficient and responsive model in the Gemini 3 series, Google is offering a solution built specifically for intelligence at scale. This launch a…

2 days ago·San Francisco, United States

Read Full Article

Tom's Guide

Center

Google just launched Gemini 3.1 Flash-Lite — 7 prompts to test its new 'Thinking' mode

Gemini 3.1 Flash-Lite has arrived — Google’s fastest AI model yet. Here are 7 powerful prompts to test its speed, “Thinking Levels” and multimodal skills right now.

2 days ago

Read Full Article

The New Stack

Center

Google launches Gemini 3.1 Flash-Lite, its fastest Gemini 3 model yet

Two weeks after launching Gemini 3.1 Pro, its most capable AI model yet, Google on Tuesday launched Gemini 3.1 Flash-Lite, the fastest model in the Gemini 3 family so far. At $0.25/$1.50 per million input/output tokens, it’s also Google’s most affordable Gemini 3 model yet. The model is meant for what Google describes as “high-volume developer workloads at scale” and is now available in preview in the Gemini API in Google AI Studio and Vertex AI…

2 days ago

Read Full Article

Israel Hayom

Even Better: Gemini's Fastest and Cheapest Model

Google announced a new artificial intelligence model in the Gemini 3 series • The model will be the cheapest yet and will provide an initial answer 2.5 times faster than existing models • Everything we know so far

24 hours ago

Read Full Article