The Real Lesson of DeepSeek's R1
R1 is just the latest data point indicating that superhuman AI will be easier and cheaper to build than most people think, and won't be monopolized.
Editor’s note: I recently pitched an op-ed like this to a few newspapers, hence the different style from my usual posts. It turns out that the op-ed market is a bit unpredictable, but these topics are time-sensitive so I’m just publishing it now here. I hope that if you find it useful, you’ll pass it along to other folks so that we can somewhat approximate the impact of an op-ed. :) Some claims here merit more evidence/analysis and I’ll circle back on them in more detail later.
The Chinese company DeepSeek’s recent release of R1 – which is now the most capable open source AI model – generated a lot of concern among American technologists, policymakers, and investors. Robust responses, especially tightening of American export controls on GPUs, are necessary given the increasing importance of AI to the global economy.
But I worry that amidst all the discussion of what’s new about R1, there has been too little attention to the ways in which it’s just the latest chapter in the modern history of AI. That story started several years ago, around the time I was working at OpenAI on the staged open-source release of the (now primitive) GPT-2 language model in 2019.
GPT-2, R1, and nearly every other AI model released since then have all been part of a consistent story: AI capabilities that rival and ultimately exceed human intelligence are easier and cheaper to build than almost anyone intuitively grasps, and they get easier and cheaper to build every month. This lesson has deep policy significance. It means that even while competing vigorously with China for relative influence, the United States must come to terms with the fact that it will not have a monopoly on superhuman AI capabilities.
Scale and skill
If recent history in AI has taught us anything, it’s that “all” you need to develop AI systems that meet and ultimately exceed human capabilities are two ingredients: millions to billions of dollars worth of computing resources and data (scale), and dozens to hundreds of highly talented scientists and engineers (skill) applying that scale to the training of artificial neural networks.
In the first scaling paradigm, researchers and engineers train a big neural network to predict what comes next in a sentence: the network attempts to complete billions of sentences, and its parameters are tweaked after each mistake so that it does better the next time. Through this process, AI models learn grammar, knowledge about the world, and rudimentary reasoning skills.
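For readers who want to see what that loop looks like concretely, here is a minimal sketch in PyTorch. Everything about it is illustrative: a toy character-level model, a tiny made-up corpus, and sizes many orders of magnitude smaller than anything the labs use. But the core loop – predict the next token, measure the mistake, tweak the parameters – is the same.

```python
# Toy sketch of next-token-prediction pretraining. All sizes and the corpus
# are illustrative stand-ins for what frontier labs do at vastly larger scale.
import torch
import torch.nn as nn

corpus = "the cat sat on the mat. the dog sat on the log. " * 50
vocab = sorted(set(corpus))
stoi = {ch: i for i, ch in enumerate(vocab)}
data = torch.tensor([stoi[ch] for ch in corpus])

class NextTokenModel(nn.Module):
    def __init__(self, vocab_size, dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)
        self.head = nn.Linear(dim, vocab_size)

    def forward(self, x):
        h, _ = self.rnn(self.embed(x))
        return self.head(h)  # logits for the next token at each position

model = NextTokenModel(len(vocab))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)

for step in range(200):
    i = torch.randint(0, len(data) - 33, (1,)).item()
    x = data[i : i + 32].unsqueeze(0)      # a snippet of text as context
    y = data[i + 1 : i + 33].unsqueeze(0)  # the same text shifted by one token
    logits = model(x)
    loss = nn.functional.cross_entropy(logits.view(-1, len(vocab)), y.view(-1))
    opt.zero_grad()
    loss.backward()  # "tweak the parameters after each mistake"
    opt.step()
    if step % 50 == 0:
        print(step, round(loss.item(), 3))
```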
During my time at OpenAI, I saw that “just” scaling this paradigm up – an enormously skillful endeavor in its own right – consistently yielded improvements in performance and, counterintuitively, improvements in performance per dollar (e.g., a model of a given size trained for longer costs the same to serve but performs better, so it delivers more value in the long run when you run it to serve users).
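A back-of-the-envelope calculation makes the economics clearer. All the numbers below are hypothetical, not actual OpenAI figures; the point is just that serving cost depends on a model’s size rather than on how long it was trained, so extra training compute gets amortized across every query the model ever answers:

```python
# Hypothetical numbers illustrating why a longer training run on a same-size
# model can be the better deal: serving cost is unchanged, so the extra
# training spend is spread across the model's entire lifetime of queries.
train_cost_short = 10e6        # assumed: $10M training run
train_cost_long = 30e6         # assumed: same-size model trained 3x longer
serve_cost_per_query = 0.001   # assumed: identical, since model size is identical
lifetime_queries = 50e9        # assumed query volume over the model's lifetime

for name, train_cost in [("shorter run", train_cost_short),
                         ("longer run", train_cost_long)]:
    per_query = serve_cost_per_query + train_cost / lifetime_queries
    print(f"{name}: ${per_query:.6f} per query")

# At this volume, the longer (more capable) run costs only ~$0.0004 more per
# query than the shorter one, so the extra capability comes nearly for free.
```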
Every time, it took at most a few years, and typically only months, before other companies achieved the same level of performance using a broadly similar, and always relatively simple, recipe. Moreover, latecomers would typically replicate the achievement at much lower cost than OpenAI originally incurred, because they could build on all the tips and tricks that the global scientific community had published in the meantime. Ultimately these savings would flow back to the companies with the most computing power when they did their next training runs, yielding 90%+ cost reductions on the OpenAI API over time.
Over the past year, OpenAI, DeepSeek, and other companies have begun to show that scale can also lead to enormous improvements in the ability of AI systems to think through problems step-by-step in a long “chain of thought.” By trying over and over to solve complex problems in areas like math and coding, and getting rewarded for each success, models like o1 and R1 gradually learn reasoning techniques such as breaking problems down into chunks and checking their own work.
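Here is a toy sketch of the shape of that training loop. A simple softmax policy over three made-up “reasoning strategies” stands in for the language model, and a basic REINFORCE-style update stands in for the far more sophisticated RL methods the labs actually use. The thing to notice is that the strategy that decomposes the problem and checks its work is the one that gets reinforced:

```python
# Toy stand-in for RL on reasoning: strategies that lead to correct answers
# get rewarded and become more likely. Everything here is illustrative.
import math
import random

def attempt(problem, strategy):
    """Try a toy addition problem with a given 'reasoning strategy'."""
    a, b = problem
    if strategy == "guess":
        return random.randint(0, 20) == a + b             # no reasoning: rarely right
    answer = a + b if random.random() < 0.8 else a + b + 1  # occasional slip
    if strategy == "decompose_and_check" and answer - b != a:
        answer = a + b                                    # self-check catches the slip
    return answer == a + b

strategies = ["guess", "decompose", "decompose_and_check"]
prefs = {s: 0.0 for s in strategies}  # the "policy" being trained
avg_reward, lr = 0.0, 0.1

for step in range(5000):
    problem = (random.randint(0, 10), random.randint(0, 10))
    weights = [math.exp(prefs[s]) for s in strategies]
    total = sum(weights)
    probs = {s: w / total for s, w in zip(strategies, weights)}
    chosen = random.choices(strategies, weights=weights)[0]
    reward = 1.0 if attempt(problem, chosen) else 0.0
    # REINFORCE-style update: strategies that beat the average get reinforced
    for s in strategies:
        grad = (1.0 if s == chosen else 0.0) - probs[s]
        prefs[s] += lr * (reward - avg_reward) * grad
    avg_reward += 0.01 * (reward - avg_reward)

print(prefs)  # "decompose_and_check" ends up with the highest preference
```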
Graphs from DeepSeek and OpenAI each show different aspects of scaling up the new paradigm. The graph from DeepSeek shows performance improving over the course of training, as well as when more attempts are made to solve each problem. The graphs from OpenAI show improved capability as a result of applying more computing power during the training phase and during the inference phase, respectively.
The benefits of scale are why well-enforced and up-to-date export controls on GPUs are more important than ever. In fact, to a greater extent than with the last paradigm, reasoning models like o1 and R1 are hungry for computing power at the time of use (not just at the time of training), which is probably why DeepSeek ran into issues serving their app to this new wave of millions of users.
But the fact that skill and scale are all you need also means that there is no secret ingredient that the US can protect forever in order to have a monopoly on superhuman AI. AI capabilities will always be a matter of degree, and what you can accomplish with a given amount of scale is constantly changing.
Given what DeepSeek’s skilled researchers and engineers could achieve with (at most) tens of thousands of high-end GPUs, it’s clear that they will accomplish incredible things in the coming years with the larger number of GPUs they will eventually acquire, a growing fraction of which will be domestically produced. Achieving these capabilities with fewer GPUs than their American counterparts took skill, but it is also a continuation of the long-running trend of ever greater performance and efficiency.
If the US maintains leadership in computing power and skillful research and engineering, it may have the best models and be able to serve them at a larger scale than China can. It might even be able to create dramatically better models than China, if automated AI research and engineering leads to an “intelligence explosion.” But even second-place status in AI will confer very powerful military and economic capabilities on China, given just how much easier building advanced AI has turned out to be than almost anyone expected. China is too far along in its development of semiconductor manufacturing, and too committed to attaining advanced AI capabilities, to be stopped entirely, short of a war that would be ruinous for both sides.
Competing while coexisting
I agree with other commentators that Congress and the White House should ensure that the Bureau of Industry and Security has the funding, technical support (e.g., from DOGE), and political support it needs to implement export controls effectively and to update them rapidly as needed.
AI competition needs to be tempered with a recognition that China isn’t going away as a player in AI, and that coexistence must also be part of America’s strategy. Even if China doesn’t surpass the US in scale, it has more than enough skill and scale to do a lot, and it will be strategic in how it deploys the resources it has. This may look like, for example, putting particular effort into releasing efficient open source models or into military AI systems. Ironically, export controls may leave China better positioned than the US to dominate open source, since Chinese companies are forced to think in terms of efficiency in a way that their “GPU-rich” American competitors are not.
Since China isn’t going away on AI, the United States will have to learn to cooperate with it in areas of common interest, such as measuring and mitigating dangerous capabilities (e.g., models being misused to create biological weapons), sharing lessons learned in AI alignment (to prevent very sophisticated AI systems from evading human control), and negotiating (and ultimately verifiably enforcing) basic rules of the road on military AI to avoid unintended escalation of conflict. Again, something similar happened with the last wave of AI scaling, though with smaller stakes than the next wave will carry: as competitors caught up with and in some cases surpassed OpenAI, there was a clear need to share lessons learned and to develop common norms around safety and security.
Some have said that R1 is a Sputnik moment, and I agree, but this is true in a very specific sense: Sputnik did not demonstrate the Soviet Union’s overall technological superiority, but rather the feasibility of reaching orbit given a serious effort by a fairly technologically advanced country. The US, China, India, and other countries also gradually took advantage of this new domain, and while competing, they still had to coexist in space (e.g., avoiding hitting each other’s satellites with debris). Rather than just being a domain of military competition, satellites served a wide range of national, international, military, and civilian purposes. The same will be true of AI.
There are signs that President Trump recognizes this: in his comments on DeepSeek, he said it was a wake-up call for Americans, but he also applauded DeepSeek on their achievement and said that it was in everyone’s interest to have cheaper AI. He’s right – in the same way that it’s in everyone’s interest to have cheaper satellites, as long as they aren’t crashing into each other.
Acknowledgments: Thanks to Jordan Schneider for helpful comments and for inviting me to the ChinaTalk podcast where I made some related points on DeepSeek.