Predictions for 2026
2025 was a year of stunningly fast AI progress.
In December 2024, the best reasoning model was OpenAI’s o1, a toy model that wasn’t even particularly proficient at using tools. By September 2025, OpenAI’s unreleased general reasoning models had won gold medals at the 2025 International Mathematical Olympiad (IMO), the 2025 International Olympiad in Informatics (IOI), and the 2025 International Collegiate Programming Contest (ICPC) World Finals. Another unreleased OpenAI model took second place at the AtCoder World Tour Finals, working fully autonomously for the entire 10 hours of the competition. And coding agents - Claude Code in particular - have taken the world of coding by storm, while also meaningfully accelerating the pace of AI research at the frontier labs.
We have also begun to see glimpses of AI meaningfully contributing to work in fields other than coding. In late Q3 2025, I began using GPT-5.x Pro for legal research and analysis, and I now find it absolutely essential to my work. I am also increasingly seeing reports that Google’s NotebookLM is fantastic at generating presentations and data tables, another important enterprise use case. And even non-technical people (yes, including yours truly) are discovering “Claude Code for things that are not coding”.
Where does this lead us in 2026? Here are some predictions:
Automation of AI research
Earlier this year, roon [1] played with Codex for the first time and “realiz[ed] we’re in the takeoff”. In 2026, agentic coding tools like Codex and Claude Code will continue accelerating frontier lab researchers. By September 2026, OpenAI intends for this effort to culminate in an automated AI research intern running on hundreds of thousands of GPUs, which will be able to automatically handle the implementation and debugging of research ideas proposed by OpenAI’s human researchers. [2]
Continual learning
In Q3 2025, public consensus suddenly decreed that continual learning is required to achieve AGI. Andrej Karpathy said that current LLMs are “cognitively lacking” due to their lack of continual learning, and that “it’s not working” - placing AGI about a decade away. Later, Ilya Sutskever added fuel to the fire when he revealed that SSI is working on developing AI capable of continual learning - which he said is “5 to 20 years” away.
Relatively unnoticed among all the hoopla were comments on continual learning from Anthropic’s CEO, Dario Amodei:
One thing we learned in AI is whenever it feels like there’s some fundamental obstacle - like two years ago we thought there was this fundamental obstacle around reasoning - turned out just to be RL, you just train with RL and you let the model write things down to try and figure out objective math problems… Without being too specific, we already have maybe some evidence to suggest that [continual learning] is another of those problems that is not as difficult as it seems that will fall to scale plus a slightly different way of thinking about things.
And just a few days ago, Sholto Douglas, an Anthropic employee, dropped a bombshell with his prediction that “continual learning [will get] solved in a satisfying way” in 2026.
Does this mean that Anthropic already knows how to achieve continual learning? We’ll find out next year.
Recursive self-improvement
Mark Chen recently mentioned that OpenAI is aggressively scaling up several bets, including one related to synthetic data. This was a reference to Sebastien Bubeck’s brief cameo during the GPT-5 launch livestream, in which he revealed that OpenAI has developed “new training techniques” whereby o3 generated synthetic data to train GPT-5 in a way that “raw web data just never could”. “This interaction between models foreshadows a recursive self-improvement loop”, Bubeck said.
Google DeepMind is also working in the same direction, according to Sebastian Borgeaud, pre-training lead for Gemini 3:
One really interesting question is whether you can actually generate synthetic data to make a model that you want to train in the future better than the model that generated the synthetic data in the first place. We spend a lot of time thinking about this and doing research in this direction.
It is unclear where these efforts will lead in 2026, but needless to say, this is an area of ML research that is well worth monitoring.
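To make the idea concrete, here is a deliberately simplified sketch of the kind of loop these quotes gesture at. It is purely illustrative - the names below are placeholders I invented, not anything resembling OpenAI’s or DeepMind’s actual pipelines:

```python
# Toy illustration of a synthetic-data "self-improvement" loop.
# Every name here (teacher, verifier, trainer, etc.) is a placeholder
# invented for this sketch -- it is not any lab's actual pipeline.

def self_improvement_round(teacher, prompts, verifier, trainer):
    """One round: the current model writes synthetic training examples,
    a verifier keeps only those it can check, and a successor model is
    trained on the survivors."""
    synthetic_data = []
    for prompt in prompts:
        candidate = teacher.generate(prompt)       # e.g. a worked solution or proof
        if verifier.accepts(prompt, candidate):    # e.g. unit tests or a proof checker
            synthetic_data.append((prompt, candidate))
    return trainer.train(base=teacher, data=synthetic_data)

# The open question raised in the quotes above: can the returned successor
# actually be *better* than the teacher, so that the loop can be iterated?
```

The filtering step is what makes the data worth training on; without some external check, the successor would mostly just inherit the teacher’s mistakes.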
AI is coming to the workplace (not just for coders)
The most striking thing about next year is that the other forms of knowledge work are going to experience what software engineers are feeling right now, where they went from typing most of their lines of code at the beginning of the year to typing barely any of them at the end of the year. I think of this as the Claude Code experience, but for all forms of knowledge work.
Those who follow me on X know that I have been crying out for an interface that would enable even a non-technical lawyer to “vibe-code” a stock purchase agreement (see, e.g., point 3 here). It looks as though my wish may finally come true in 2026.
But it will be more than that, of course. Anthropic’s goal for 2026 is to develop and sell to enterprises a “virtual co-worker that is in all your Slack channels and can join your meetings and can work alongside you”. Some of us will be seeing these “virtual co-workers” join our companies next year.
And as for coding…
I am not a coder, but, as an outside observer, I can easily tell that several significant “vibe shifts” occurred in 2025 around using agentic tools like Claude Code and Codex for coding tasks. Claude Opus 4.5 in particular smashed the METR 50% time-horizon benchmark, and appears to be a huge step change compared to the previous generation of models - to the point where there is now some debate over whether Opus 4.5 in Claude Code is “basically AGI” (by OpenAI’s definition: a highly autonomous system that outperforms humans at most economically valuable work).
It seems clear that models will continue to improve rapidly at coding from here. Expect software engineering to “go[] utterly wild next year”.
AI for science
Since late this year, there has been an increasing cadence of reports that models like GPT-5 Pro can be leveraged effectively as a tool by human mathematicians to help make relatively minor advances in mathematics. As models continue to improve next year, OpenAI expects that its AI systems “may be able to make small new [scientific] discoveries” in 2026. Indeed, work on these initiatives is ongoing at multiple frontier labs: for example, Anthropic has begun hiring “wet lab wizards” for its life sciences team.
But can LLMs actually generate novel scientific hypotheses autonomously? In my view, the answer is almost certainly “yes”. We have already seen that even Gemini 2.0 Pro, when equipped with a great harness, can propose a novel scientific hypothesis pertaining to a complex gene transfer mechanism. [3] The general rule of thumb that I think makes sense to follow is that anything an LLM can do with a harness will eventually also be achievable by a more powerful LLM without any harness whatsoever; the only (important!) open question is how soon that becomes possible.
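For readers unfamiliar with the term, a “harness” here just means the scaffolding wrapped around the model: an outer loop that gives it tools, feeds results back, and lets it take many steps toward a goal. A minimal, hypothetical sketch (the model API and tool names are placeholders, not the system behind the gene transfer result mentioned above):

```python
# Minimal, hypothetical sketch of an LLM "harness": an outer loop that gives
# the model tools and feedback over many steps. The `llm` and `tools` objects
# are placeholders, not the system behind the gene transfer result above.

def run_with_harness(llm, tools, task, max_steps=20):
    """Let the model work on `task`, calling tools and observing results,
    until it returns a final answer or the step budget runs out."""
    transcript = [f"TASK: {task}"]
    for _ in range(max_steps):
        action = llm.next_action("\n".join(transcript))  # model decides what to do next
        if action.kind == "final_answer":
            return action.content                        # e.g. a proposed hypothesis
        observation = tools[action.tool](action.args)    # e.g. a literature search
        transcript.append(f"{action.tool} -> {observation}")
    return None  # step budget exhausted without a final answer
```

The rule of thumb above amounts to a bet that, over time, more and more of this outer loop gets absorbed into the model itself.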
OpenAI has declared that 2026 will be the “Year of AI and Science”. Let’s hope that the year can live up to this lofty title!
The robots are coming?
There’s been a lot of hoopla around robots - humanoid or otherwise - over the past few years, but very few of these advances have thus far made it out into the real world. I remain somewhat unconvinced that 2026 will be the year when robots truly proliferate in the real world at scale, but it’s possible that I am too pessimistic in this regard. Google DeepMind apparently projects that 2026 will be “a huge year” for embodied AI, and that there will be “a lot more robots in the real world soon”. Other knowledgeable commentators expect that 2026 will see at least “the first test deployments of home robots”.
* * *
I will end here with an overall observation. Over the past few months, it has become decidedly fashionable to update one’s views towards longer timelines for “AGI” (whatever that term might mean). If significant progress is made on the automation of AI research and/or continual learning in 2026, these longer timelines will likely begin to feel extremely - maybe even needlessly - conservative by the end of the year. In particular, OpenAI’s stated goal of fully automating AI research in just slightly more than two years’ time still has not been - but should be - fully internalized by most industry observers and commentators. Should OpenAI successfully develop and deploy an automated AI research “intern” during 2026, many may suddenly realize that the long-expected promise of machines taking over the building of other, yet more powerful, machines is on the verge of being fulfilled.
[1] A famous semi-anon OpenAI employee, @tszzl on X.
[2] Just 18 months later, by March 2028, OpenAI expects to develop a fully end-to-end automated AI researcher.
[3] The related paper by Penades et al. is available here: https://www.sciencedirect.com/science/article/pii/S0092867425009730.


* * *

Thanks for this post. I've been following your posts on X, and I didn't know you had a Substack! Great to know.
By the way, as a non-coder, have you experimented with the coding tools from the frontier labs for non-coding purposes? I know the coders have found ways to get these tools to do amazing things for non-coding tasks, but it seems hard to find a way into the hashtag-and-ampersand thicket. A guide based on your own experience would be amazing.
An AI intern during 2026 would imply a significant speedup over current trends though, yes?
Take Opus 4.5 on METR: it can do ~5-hour tasks, and the doubling time is 7 months. A reasonable estimate for an easy intern project is… oh, let’s say 2-4 weeks. A hard project might take 3-6 months. Thus, with *exponential* extrapolation of current trends (which is always risky), this predicts that a mediocre AI research intern will be achieved in ~5 doublings, and a good intern in ~7.
That is, the aggressive (though not super-exponential) prediction would be ~3 years until an AI intern can do a 2-week project, and another year or so until it can do a 3-month project, ~autonomously. And of course, the ~3-month project would get done faster in wall-clock time, because computers think faster.
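For what it’s worth, here is that back-of-the-envelope extrapolation spelled out in a few lines of Python. The ~5-hour current horizon, ~7-month doubling time, and project sizes are assumptions from the comment above rather than measured values, and everything presumes the exponential trend simply continues:

```python
import math

# Back-of-the-envelope METR-style extrapolation. Inputs are the assumptions
# stated above, not measured values; the trend is assumed to stay exponential.
current_horizon_hours = 5.0      # roughly what Opus 4.5 can do at 50% reliability
doubling_time_months = 7.0       # assumed doubling time of the time horizon

def years_until(target_hours):
    """Years until the 50% time horizon reaches target_hours, if the trend holds."""
    doublings = math.log2(target_hours / current_horizon_hours)
    return doublings * doubling_time_months / 12

easy_project_hours = 4 * 40      # ~4 work-weeks -> 160 hours (exactly 5 doublings)
hard_project_hours = 4.5 * 160   # ~4.5 months   -> 720 hours (~7.2 doublings)

print(f"easy intern project: ~{years_until(easy_project_hours):.1f} years")  # ~2.9
print(f"hard intern project: ~{years_until(hard_project_hours):.1f} years")  # ~4.2
```

That lands roughly where the comment does: ~3 years for the easy project and ~4 for the hard one, i.e. noticeably later than OpenAI’s September 2026 target unless the trend bends upward.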
I wouldn't be surprised if an AI-augmented intern were exceptionally productive in 2026, though - I feel my productivity has increased enormously from deploying agents, even for non-code tasks.