DeepSeek: Why It Matters and What the Press Got Wrong

Robert Blumofe

Feb 19, 2025


Dr. Robert Blumofe is Executive Vice President and Chief Technology Officer at Akamai. As CTO, he guides Akamai’s technology strategy, works with Akamai’s largest customers, and convenes technology leaders within the company to catalyze innovation. Previously, he led Akamai’s Platform organization and Enterprise Division, where he was responsible for developing and operating the distributed system underlying all Akamai products and services, as well as creating solutions for major enterprises to secure and improve performance. He holds a Ph.D. in Computer Science from Massachusetts Institute of Technology and a Bachelor of Science from Brown University.


DeepSeek recently made waves in the technology world with their new AI model, R1. This model showcases a reasoning capability comparable to OpenAI's o1, but with a notable distinction: DeepSeek claims that their model was trained at a significantly lower cost.

While there has been debate around whether DeepSeek is the real deal or a DeepFake, it's clear that this is a wake-up call -- the path of ever-larger LLMs that rely on ever-growing numbers of GPUs and massive amounts of energy is not the only way forward. In fact, it's become obvious there is limited advantage to that approach, for a few reasons:

First, pure scaling of LLMs at training time has reached the point of diminishing returns or perhaps even near-zero returns. Bigger models trained with more data are not resulting in meaningful improvements. 

Further, enterprises don't need massive, ask-me-anything LLMs for most use cases. Even prior to DeepSeek, there was a noticeable shift toward smaller, more specialized models tailored to specific business needs. As more enterprise AI use cases emerge, the focus moves to inference -- actually running the models to drive value. In many cases, that will happen at the edge of the internet, close to end users. Smaller models that are optimized to run on widely available hardware will create more value long-term than over-sized LLMs.

Finally, the LLM space is entering an era of optimization. The AI models we have seen so far have focused on innovation by scaling at any cost. Efficiency, specialization, and resource optimization are once again taking center stage, a signal that AI's future isn't about brute force alone, but about how strategically and efficiently that power is deployed.

DeepSeek highlights this point very well in their technical papers, which showcase a tour de force of engineering optimization. Their advancements include modifications to the transformer architecture and techniques to optimize resource allocation during training. While these innovations move the field forward, they are incremental steps toward progress rather than a radical reinvention of AI technology.
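One publicly documented example of these efficiency-oriented architecture choices is DeepSeek's use of a mixture-of-experts (MoE) design, in which a router activates only a small fraction of the model's parameters for each token. The sketch below is a minimal, illustrative version of top-k expert routing; all names, dimensions, and details here are hypothetical and greatly simplified, not DeepSeek's actual implementation:

```python
import numpy as np

def topk_moe(x, gate_w, expert_ws, k=2):
    """Route a token vector x to its top-k experts and mix their outputs.

    x:         (d,) token representation
    gate_w:    (d, n_experts) router weights
    expert_ws: list of n_experts (d, d) per-expert weight matrices

    Only k of the n_experts matrices are multiplied per token, so
    per-token compute scales with k, not with the total expert count.
    """
    logits = x @ gate_w                     # one router score per expert
    top = np.argsort(logits)[-k:]           # indices of the k best-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                # softmax over just the selected experts
    return sum(w * (x @ expert_ws[i]) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
d, n_experts = 8, 16
x = rng.standard_normal(d)
gate_w = rng.standard_normal((d, n_experts))
expert_ws = [rng.standard_normal((d, d)) for _ in range(n_experts)]

y = topk_moe(x, gate_w, expert_ws, k=2)
print(y.shape)  # the mixed output keeps the token's dimensionality: (8,)
```

The design point this illustrates: total parameter count (all experts combined) can grow far faster than the compute spent per token, which is one way to get more capability without a proportional increase in training or inference cost.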

And while the media is making a big deal about their advancements -- which are indeed noteworthy -- they are generally missing a key point: if DeepSeek hadn’t done this, someone else would have. And they are likely only the first in what will be a new wave of AI that leverages significant efficiency gains in both model training costs and size. 

It's important that we put DeepSeek's accomplishments in context. The company's advancements are the latest step in a steady march that has been advancing the state of the art in LLM architecture and training for years. This is not a disruptive breakthrough. While the news was a wake-up call for many, it should have been expected by those paying close attention to industry trends. The reality is that in the two years since OpenAI trained GPT-4, the state of the art in training efficiency has advanced considerably. And it's not just about hardware (GPUs); it's about algorithms and software. So it should be no surprise that a company -- even a company like DeepSeek that does not have access to the latest and greatest GPUs -- can now train models that are as good as GPT-4 at a much lower cost.

DeepSeek deserves credit for taking this step and for disclosing it so thoroughly, but it’s just another expected milestone in the technical evolution of AI that will be followed by many more. 
