Frontier AI
Tracked metric

Abstract-reasoning score (ARC-AGI)

Score on ARC-AGI-1, a set of puzzles that are easy for humans (~85%) but long resisted AI systems. In Dec 2024 OpenAI's o3 reached 76–88% depending on the compute setting, the first AI result widely read as moving beyond memorization on this test. ARC-AGI-2 is the harder successor, where frontier models still score low. (A third-party benchmark, not our score.)

88%: reached Dec 2024
~85% (human): average human performance on ARC-AGI-1, the bar AI crossed in late 2024.
  1. 88% · Dec 2024 · OpenAI · Press, 2024-12
  2. 0% · 2020 · Press, 2024-12

More on this

Explainer Apr 10, 2026

Why we don't score "AGI"

There is no agreed test for general intelligence, so a single "AGI %" would be our opinion dressed as data. Instead we track objective, third-party numbers: training compute, public benchmark scores, and investment.