AI search agents often confirm what they already know instead of actually researching the web
2 Articles
Leading AI search agents like GPT-5.4 and Kimi K2.6 don't appear to do much actual research on established benchmarks. They mostly just use the web to confirm what they already learned during training. Researchers at the Harbin Institute of Technology found this using a new time-based benchmark called LiveBrowseComp, which only asks about events from the last 90 days. Once the models can't fall back on memory, performance falls apart and the exi…
Leading AI search agents such as GPT-5.4 or Kimi K2.6 appear to do little real research on established benchmarks and instead use the web mainly to confirm knowledge they acquired during training. Researchers at the Harbin Institute of Technology show this with a new time-bound benchmark called LiveBrowseComp, which only asks questions about events from the last 90 days. As soon as the models can no longer rely on their memory, performance bre…