China’s DeepSeek has some big AI claims; not all experts are convinced

jtimes 1 week ago



Chinese artificial intelligence firm DeepSeek rocked markets this week with claims its new AI model outperforms OpenAI’s and cost a fraction of the price to build.

The assertions, specifically that DeepSeek’s large language model cost just $5.6 million to train, have sparked concerns over the eye-watering sums that tech giants are currently spending on the computing infrastructure required to train and run advanced AI workloads.

But not everyone is convinced by DeepSeek’s claims.

CNBC asked industry experts for their views on DeepSeek, and how it actually compares to OpenAI, the creator of the viral chatbot ChatGPT, which sparked the AI revolution.

What is DeepSeek?

Last week, DeepSeek released R1, its new reasoning model that rivals OpenAI’s o1. A reasoning model is a large language model that breaks prompts down into smaller pieces and considers multiple approaches before generating a response. It is designed to process complex problems in a similar way to humans.

DeepSeek was founded in 2023 by Liang Wenfeng, co-founder of AI-focused quantitative hedge fund High-Flyer, to focus on large language models and reaching artificial general intelligence, or AGI.

AGI as a concept loosely refers to the idea of an AI that equals or surpasses human intellect on a wide range of tasks.

Much of the technology behind R1 isn’t new. What is notable, however, is that DeepSeek is the first to deploy it in a high-performing AI model with — according to the company — considerable reductions in power requirements.

“The takeaway is that there are many possibilities to develop this industry. The high-end chip/capital intensive way is one technological approach,” said Xiaomeng Lu, director of Eurasia Group’s geo-technology practice.

“But DeepSeek proves we are still in the nascent stage of AI development and the path established by OpenAI may not be the only route to highly capable AI.”

How is it different from OpenAI?

DeepSeek has two main systems that have garnered buzz from the AI community: V3, the large language model that underpins its products, and R1, its reasoning model.

Both models are open-source, meaning their underlying code is free and publicly available for other developers to customize and redistribute.

DeepSeek’s models are much smaller than many other large language models. V3 has a total of 671 billion parameters, or variables that the model learns during training. And while OpenAI doesn’t disclose parameter counts, experts estimate its latest model to have at least a trillion.

In terms of performance, DeepSeek says its R1 model achieves performance comparable to OpenAI’s o1 on reasoning tasks, citing benchmarks including AIME 2024, Codeforces, GPQA Diamond, MATH-500, MMLU and SWE-bench Verified.

Comparing DeepSeek, OpenAI on price

DeepSeek and OpenAI both disclose pricing for their models’ computations on their websites.

DeepSeek says R1 costs 55 cents per 1 million tokens of inputs — “tokens” referring to each individual unit of text processed by the model — and $2.19 per 1 million tokens of output.

In comparison, OpenAI’s pricing page for o1 shows the firm charges $15 per 1 million input tokens and $60 per 1 million output tokens. For GPT-4o mini, OpenAI’s smaller, low-cost language model, the firm charges 15 cents per 1 million input tokens.
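To make the gap concrete, the quoted list prices can be plugged into a short calculation. The Python sketch below is illustrative only: the prices are the ones cited above, while the workload size and the helper name `workload_cost` are assumptions chosen for the example, not figures from DeepSeek or OpenAI.

```python
# Prices in U.S. dollars per 1 million tokens, as quoted in this article.
PRICES = {
    "DeepSeek R1": {"input": 0.55, "output": 2.19},
    "OpenAI o1": {"input": 15.00, "output": 60.00},
}


def workload_cost(model, input_tokens, output_tokens):
    """Return the cost in dollars of a workload at the quoted list prices."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload (an assumption for illustration, not from the article):
# 2 million input tokens and 500,000 output tokens.
workload = {"input_tokens": 2_000_000, "output_tokens": 500_000}

for model in PRICES:
    print(f"{model}: ${workload_cost(model, **workload):,.2f}")

ratio = workload_cost("OpenAI o1", **workload) / workload_cost("DeepSeek R1", **workload)
print(f"o1 costs roughly {ratio:.0f} times as much for this workload")
```

At these list prices, the ratio stays near 27 regardless of the mix of input and output tokens, because both the input and the output rates differ by about the same factor.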

Skepticism over chips

DeepSeek’s reveal of R1 has already led to heated public debate over the veracity of its claims, not least because its models were built despite U.S. export controls restricting the sale of advanced AI chips to China.

DeepSeek claims it had its breakthrough using mature Nvidia chips, including H800 and A100 chips, which are less advanced than the chipmaker’s cutting-edge H100s. The H100s can’t be exported to China.

However, in comments to CNBC last week, Scale AI CEO Alexandr Wang said he believed DeepSeek used the banned chips, a claim that DeepSeek denies.

Nvidia has since come out and said that the GPUs that DeepSeek used were fully export-compliant.

The real deal or not?

Industry experts seem to broadly agree that what DeepSeek has achieved is impressive, although some have urged skepticism over some of the Chinese company’s claims.

“DeepSeek is legitimately impressive, but the level of hysteria is an indictment of so many,” U.S. entrepreneur Palmer Luckey, who founded Oculus and Anduril, wrote on X.

“The $5M number is bogus. It is pushed by a Chinese hedge fund to slow investment in American AI startups, service their own shorts against American titans like Nvidia, and hide sanction evasion.”

Seena Rejal, chief commercial officer of NetMind, a London-headquartered startup that offers access to DeepSeek’s AI models via a distributed GPU network, said he saw no reason not to believe DeepSeek.

“Even if it’s off by a certain factor, it still is coming in as greatly efficient,” Rejal told CNBC in a phone interview earlier this week. “The logic of what they’ve explained is very sensible.”

However, some have claimed DeepSeek’s technology might not have been built from scratch.

“DeepSeek makes the same mistakes O1 makes, a strong indication the technology was ripped off,” billionaire investor Vinod Khosla said on X, without giving more details.

It’s a claim that OpenAI itself has alluded to, telling CNBC in a statement Wednesday that it is reviewing reports DeepSeek may have “inappropriately” used output data from OpenAI’s models to develop its own AI model, a method referred to as “distillation.”

“We take aggressive, proactive countermeasures to protect our technology and will continue working closely with the U.S. government to protect the most capable models being built here,” an OpenAI spokesperson told CNBC.

Commoditization of AI

However the scrutiny surrounding DeepSeek shakes out, AI scientists broadly agree it marks a positive step for the industry.

Yann LeCun, chief AI scientist at Meta, said that DeepSeek’s success represented a victory for open-source AI models, not necessarily a win for China over the U.S. Meta is behind a popular open-source AI model called Llama.

“To people who see the performance of DeepSeek and think: ‘China is surpassing the US in AI.’ You are reading this wrong. The correct reading is: ‘Open source models are surpassing proprietary ones’,” he said in a post on LinkedIn.

“DeepSeek has profited from open research and open source (e.g. PyTorch and Llama from Meta). They came up with new ideas and built them on top of other people’s work. Because their work is published and open source, everyone can profit from it. That is the power of open research and open source.”

Meanwhile, Matt Calkins, CEO of U.S. software firm Appian, told CNBC that DeepSeek’s success simply shows that AI models are going to become more of a commodity in the future.

“In my opinion, we’re going to see a commoditization of AI. Many companies will achieve competitive AI, and a lack of differentiation will be bad for high-spending first-movers,” Calkins said via email.