Date: January 15th, 2026 10:39 PM
Author: chopped unc
Claude is the closest and GPT is absolutely blowing it away.
Abstract Reasoning (ARC-AGI-2)
The substantial lead is most visible in abstract reasoning, which allows a model to solve novel engineering problems it hasn't seen in its training data:
The Gap: GPT 5.2 Thinking scores 52.9%–54.2% on the ARC-AGI-2 benchmark.
Opus Lag: Claude Opus 4.5 scores only 37.6%.
Impact: This means in 2026, GPT 5.2 is roughly 40% better in relative terms (a gap of about 15 to 17 percentage points) at handling "first-of-their-kind" software architecture challenges that require true first-principles thinking rather than pattern matching.
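For anyone checking the math: the "roughly 40%" figure is a relative gain over the Opus score, not a percentage-point gap. A quick sketch of the arithmetic, using only the scores quoted above (the helper name relative_gain is just illustrative, not from any benchmark tooling):

```python
def relative_gain(score: float, baseline: float) -> float:
    """Relative improvement of `score` over `baseline`, in percent."""
    return (score - baseline) / baseline * 100.0


# Scores as quoted in the post (ARC-AGI-2, percent correct).
opus = 37.6
gpt_low, gpt_high = 52.9, 54.2

# Absolute gap in percentage points.
abs_low = gpt_low - opus
abs_high = gpt_high - opus

# Relative gain over the Opus score.
rel_low = relative_gain(gpt_low, opus)
rel_high = relative_gain(gpt_high, opus)

print(f"Absolute gap: {abs_low:.1f} to {abs_high:.1f} points")
print(f"Relative gain: {rel_low:.1f}% to {rel_high:.1f}%")
```

So the low end of the GPT range is where the "roughly 40%" comes from; the high end works out a few points above that.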
(http://www.autoadmit.com/thread.php?thread_id=5822697&forum_id=2...#49593065)