Summarised by Centrist
OpenAI’s new artificial intelligence (AI) model, o3, scored 85% on a test designed to measure “general intelligence.” This marks “a notable improvement” over the previous AI best of 55%, and is on par with the average human score.
This breakthrough has raised the possibility of achieving Artificial General Intelligence (AGI), which is the goal of many major AI research labs.
If achieved, AGI would match human intelligence: it would be capable of reasoning, adapting, and solving problems in practically any area. In contrast, Large Language Models (LLMs) like ChatGPT specialise in understanding and generating text but lack generalisation and true adaptability. For example, an LLM can write a recipe using learned patterns, whereas an AGI could invent a recipe from scratch with unfamiliar ingredients.
The ARC-AGI test evaluates an AI’s ability to generalise, or adapt to new situations with minimal examples—an essential aspect of intelligence.
Unlike earlier models, o3 has demonstrated impressive “sample efficiency,” learning from just a few examples and successfully solving problems that it has never encountered before.
This ability to generalise is considered a vital step toward AGI, where AI can apply learned knowledge to a wide range of tasks.