xAI's Grok 4 Performs Poorly On A Dynamic Strategic Challenge, Shows Improvements In Reasoning Abilities
4 Articles
When ChatGPT began to gain popularity in late 2022 and early 2023, it became clear that AI was here to stay for a long time. There was a lot of criticism at the beginning, but that prediction has held up, and artificial intelligence has not stopped improving. ChatGPT remains in first place in terms of popularity, but there are other AIs that are almost at the same level and surprisingly one of…
xAI Grok 4 Scoring Poorly in Some Real-World Tests
Overfitting to benchmarks is a common problem for all AI companies. xAI's Grok 4 has some problems with prompt adherence. The overfitting could have resulted from the reinforcement learning used to train the reasoning model. Kimi K2 is doing well on real-world tests. xAI will likely improve Grok 4 with new versions ...
Elon Musk's AI company xAI has introduced a new version of its AI assistant Grok. Grok 4 is expected to offer native tool use and real-time search, and to handle more complex requests. However, the new SuperGrok Heavy subscription tier comes at a price.
xAI's Grok 4 Performs Poorly On A Dynamic Strategic Challenge, Shows Improvements In Reasoning Abilities
xAI's Grok 4 AI model is all the rage these days, helped along by the incessant hype created by Elon Musk himself. Yet, under the hood, the model appears to be tuned specifically to ace AI benchmark tests, and it falls flat when it encounters a dynamic, strategic challenge. xAI's Grok 4 has already managed to embroil itself in a number of controversies, despite debuting just a few days ago. For instance, Grok 4 attracted eyeballs a few…