The Jakarta Post


Why DeepSeek's AI leap puts China in front for now

Commentators are right to say DeepSeek's new AI chatbot is a game changer, but do not take all the hype about China now dominating the field too seriously.

Jonathan K. Kummerfeld (The Jakarta Post)
360info/Sydney, Australia
Wed, February 5, 2025

This photo illustration shows the DeepSeek app on a mobile phone in Hong Kong on Jan. 28, 2025. (AFP/Mladen Antonov)

The knee-jerk reaction to the release of Chinese company DeepSeek's artificial intelligence chatbot mistakenly assumes it gives China an enduring lead in AI development and misses key ways it could drive demand for AI hardware.

The DeepSeek model was unveiled at the end of January, offering an AI chatbot competitive with o1, the leading model from US company OpenAI, which drives ChatGPT today.

DeepSeek's model offered major advances in the way it uses hardware, including using far fewer and less powerful chips than other models, and in its learning efficiency, making it much cheaper to create.

The announcement dominated the international media cycle and commentators frequently suggested that the arrival of DeepSeek would dramatically cut demand for AI chips.

The DeepSeek announcement also triggered a plunge in US tech stocks that wiped nearly US$600 billion off the value of leading chipmaker Nvidia.

This dramatic reaction misses four ways DeepSeek's innovation could actually expand demand for AI hardware:


- By cutting the resources needed to train a model, more companies will be able to train models for their own needs and avoid paying a premium for access to the big tech models.

- The big tech companies could combine the more efficient training with larger resources to further improve performance.

- Researchers will be able to expand the number of experiments they run without needing more resources.

- OpenAI and other leading model providers could expand their range of models, switching from one generic model – essentially a jack-of-all-trades like we have now – to a variety of more specialized models, for example one optimized for scientists versus another made for writers.

Researchers around the world have been exploring ways to improve the performance of AI models.

Innovations in the core ideas are widely published, allowing researchers to build on each other's work.

DeepSeek has brought together and extended a range of ideas, with the key advances in hardware and the way learning works.

DeepSeek uses the hardware more efficiently. When training these large models, so many computers are involved that communication between them can become a bottleneck. Computers sit idle, wasting time while waiting for communication. DeepSeek developed new ways to do calculations and communication at the same time, avoiding downtime.
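DeepSeek's real engineering lives deep in GPU code, but the principle can be illustrated with a toy sketch. The Python below (all function names and timings are invented for illustration, not DeepSeek's actual code) simulates two schedules: one that computes a chunk and then waits for it to be sent, and one that sends the previous chunk's result while the next chunk is being computed.

```python
# Toy illustration of overlapping computation with communication so the
# machine never sits idle waiting for data to be sent.
import time
from concurrent.futures import ThreadPoolExecutor

def compute(chunk):
    time.sleep(0.05)  # stand-in for doing the math on one data chunk
    return chunk * 2

def communicate(result):
    time.sleep(0.05)  # stand-in for sending a result to other machines
    return result

def sequential(chunks):
    # Compute a chunk, then wait for it to be sent, then start the next.
    return [communicate(compute(c)) for c in chunks]

def overlapped(chunks):
    # While chunk N's result is being sent, chunk N+1 is already computing.
    out = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = None
        for c in chunks:
            r = compute(c)
            if pending is not None:
                out.append(pending.result())
            pending = pool.submit(communicate, r)
        out.append(pending.result())
    return out

start = time.perf_counter()
seq = sequential(range(4))
t_seq = time.perf_counter() - start

start = time.perf_counter()
ovl = overlapped(range(4))
t_ovl = time.perf_counter() - start
```

Both schedules produce identical results, but the overlapped one finishes sooner because communication time is hidden behind computation, which is the kind of idle time DeepSeek's techniques eliminate.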

It has also brought innovation to how learning works. All large language models today have three phases of learning.

First, the language model learns from vast amounts of text, attempting to predict the next word and being updated when it makes a mistake. It then learns from a much smaller set of specific examples that teaches it to communicate with users conversationally. Finally, the language model learns by generating output, being judged, and adjusting in response.

In this last phase, there is no single correct answer at each step of learning. Instead, the model learns that one output is better or worse than another.

DeepSeek's method compares a large set of outputs in the last phase of learning, which is effective enough to allow the second and third stages to be much shorter and achieve the same results.
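DeepSeek's technical reports describe the full method; purely as an illustration of the core idea of learning from group comparisons rather than single correct answers, here is a hypothetical sketch. The reward values and group size are invented for the example.

```python
# Illustrative sketch: score each output relative to its group, so outputs
# that beat the group average get a positive learning signal and outputs
# below average get a negative one.
from statistics import mean, pstdev

def group_advantages(rewards):
    """Convert raw reward scores into group-relative learning signals."""
    mu = mean(rewards)
    sigma = pstdev(rewards) or 1.0  # guard against a uniform group
    return [(r - mu) / sigma for r in rewards]

# Four candidate answers to the same prompt, judged by some reward
# (for example, a correctness check) where higher is better.
rewards = [0.2, 0.9, 0.5, 0.4]
advs = group_advantages(rewards)

# The second output scored best, so it gets the strongest positive update.
best = advs.index(max(advs))
```

Because the signals are relative, they sum to zero across the group: the model is pushed toward its better outputs and away from its worse ones without ever needing a single "correct" answer to compare against.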

Combined, these improvements dramatically improve efficiency.

One option is to train and run any existing AI model using DeepSeek's efficiency gains to reduce the costs and environmental impacts of the model while still being able to achieve the same results.

We could also use DeepSeek innovations to train better models. That could mean scaling these techniques up to more hardware and longer training, or it could mean making a variety of models, each suited for a specific task or user type.

There is still a lot we do not know.

DeepSeek's work is more open than OpenAI's because it has released its model weights, yet it is not truly open source like the non-profit Allen Institute for AI's OLMo models, which are used in its Playground chatbot.

Critically, we know very little about the data used in training. Microsoft and OpenAI are investigating claims that some of their data may have been used to create DeepSeek's model. We also do not know who has access to the data that users provide through DeepSeek's website and app.

There are also elements of censorship in the DeepSeek model. For example, it will refuse to discuss free speech in China. The good news is that DeepSeek has published descriptions of its methods so researchers and developers can use the ideas to create new models, with no risk of DeepSeek's biases transferring.

The DeepSeek development is another significant step along AI's overall trajectory, but it is not a fundamental step-change like the switch to machine learning in the 1990s or the rise of neural networks in the 2010s.

It is unlikely that this advance will give DeepSeek an enduring lead in AI development.

DeepSeek's success shows that AI innovation can happen anywhere with a team that is technically sharp and well-funded. Researchers around the world will continue to compete, with the lead moving back and forth between companies.

For consumers, DeepSeek could also be a step toward greater control of your own data and more personalized models.

Recently, Nvidia announced DIGITS, a desktop computer with enough computing power to run large language models.

If the computing power on your desk grows and the scale of models shrinks, users might be able to run a high-performing large language model themselves, eliminating the need for data to even leave the home or office.

And that is likely to lead to more use of AI, not less.

---

The writer is a senior lecturer in the School of Computer Science at The University of Sydney. The article is republished under a Creative Commons license. The views expressed are personal.
