
DeepSeek: Why It Matters and What the Press Got Wrong

Robert Blumofe

Feb 19, 2025


Written by

Robert Blumofe

Dr. Robert Blumofe is Executive Vice President and Chief Technology Officer at Akamai. As CTO, he guides Akamai’s technology strategy, works with Akamai’s largest customers, and convenes technology leaders within the company to catalyze innovation. Previously, he led Akamai’s Platform organization and Enterprise Division, where he was responsible for developing and operating the distributed system underlying all Akamai products and services, as well as creating solutions for major enterprises to secure and improve performance. He holds a Ph.D. in Computer Science from Massachusetts Institute of Technology and a Bachelor of Science from Brown University.


DeepSeek recently made waves in the technology world with their new AI model, R1. This model showcases a reasoning capability comparable to OpenAI's o1, but with a notable distinction: DeepSeek claims that their model was trained at a significantly lower cost.

While there has been debate around whether DeepSeek is the real deal or a DeepFake, it’s clear that this is a wake-up call: the path of ever-larger LLMs that rely on ever-increasing numbers of GPUs and massive amounts of energy is not the only path forward. In fact, it’s become obvious there is limited advantage to that approach, for a few reasons:

First, pure scaling of LLMs at training time has reached the point of diminishing returns or perhaps even near-zero returns. Bigger models trained with more data are not resulting in meaningful improvements. 

Further, enterprises don’t need massive, ask-me-anything LLMs for most use cases. Even prior to DeepSeek, there was a noticeable shift towards smaller, more specialized models tailored to specific business needs. As more enterprise AI use cases emerge, the focus shifts to inference: actually running the models to drive value. In many cases, that will happen at the edge of the internet, close to end users. Smaller models that are optimized to run on widely available hardware will create more value long-term than oversized LLMs.

Finally, the LLM space is entering an era of optimization. The AI models we have seen so far have focused on innovation by scaling at any cost. Efficiency, specialization, and resource optimization are once again taking center stage, a signal that AI’s future isn’t about brute force alone, but about how strategically and efficiently that power is deployed.

DeepSeek highlights this point very well in their technical papers, which showcase a tour de force of engineering optimization. Their advancements include modifications to the transformer architecture and techniques to optimize resource allocation during training. While these innovations move the field forward, they are incremental steps rather than a radical revolution in AI technology.

And while the media is making a big deal about DeepSeek’s advancements, which are indeed noteworthy, the coverage is generally missing a key point: if DeepSeek hadn’t done this, someone else would have. DeepSeek is likely only the first in what will be a new wave of AI that leverages significant efficiency gains in both model training costs and size.

It’s important that we put DeepSeek’s accomplishments in context. The company’s advancements are the latest step in a steady march that has been advancing the state of the art in LLM architecture and training for years. This is not a disruptive breakthrough. While the news was a wake-up call for many, it should have been expected by those paying close attention to industry trends. The reality is that in the two years since OpenAI trained GPT-4, the state of the art in training efficiency has advanced considerably. And it’s not just about hardware (GPUs); it’s about algorithms and software. So it should be no surprise that a company, even one like DeepSeek that does not have access to the latest and greatest GPUs, can now train models that are as good as GPT-4 at a much lower cost.

DeepSeek deserves credit for taking this step and for disclosing it so thoroughly, but it’s just another expected milestone in the technical evolution of AI that will be followed by many more. 

