DeepSeek: What you need to know about the AI that dethroned ChatGPT

jtimes 2 weeks ago

A year-old startup out of China is taking the AI industry by storm after releasing a chatbot that rivals the performance of ChatGPT while using a fraction of the power, cooling, and training expense that OpenAI, Google, and Anthropic’s systems demand. Here’s everything you need to know about DeepSeek’s V3 and R1 models and why the company could fundamentally upend America’s AI ambitions.

What is DeepSeek?

DeepSeek (technically, “Hangzhou DeepSeek Artificial Intelligence Basic Technology Research Co., Ltd.”) is a Chinese AI startup that was originally founded as an AI lab for its parent company, High-Flyer, in April 2023. Later in 2023, DeepSeek was spun off into its own company (with High-Flyer remaining on as an investor), and in May 2024 it released its DeepSeek-V2 model. V2 offered performance on par with models from other leading Chinese AI firms, such as ByteDance, Tencent, and Baidu, but at a much lower operating cost.

The company followed up with the release of V3 in December 2024. V3 is a 671 billion-parameter model that reportedly took less than two months to train. What’s more, a recent analysis from Jefferies put DeepSeek’s “training cost of only US$5.6m (assuming $2/H800 hour rental cost). That is less than 10% of the cost of Meta’s Llama.” That’s a tiny fraction of the hundreds of millions to billions of dollars that US firms like Google, Microsoft, xAI, and OpenAI have spent training their models.
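The arithmetic behind the Jefferies figure is easy to check. The sketch below is illustrative only: it takes the US$5.6m cost and the $2 per H800-hour rental rate from the quote above and backs out the GPU-hours they imply. The 60-day training window is an assumption standing in for "less than two months," and the function names are hypothetical helpers, not anything published by DeepSeek or Jefferies.

```python
def implied_gpu_hours(training_cost_usd, rental_usd_per_gpu_hour):
    """GPU-hours implied by a total cost and an hourly rental rate."""
    return training_cost_usd / rental_usd_per_gpu_hour


def implied_gpu_count(gpu_hours, training_days):
    """Average number of GPUs running around the clock for the whole period."""
    return gpu_hours / (training_days * 24)


def implied_cost_floor(cost_usd, fraction):
    """If cost_usd is less than `fraction` of another cost, that other cost exceeds this."""
    return cost_usd / fraction


# Figures quoted in the article.
TRAINING_COST_USD = 5.6e6
RENTAL_USD_PER_H800_HOUR = 2.0

# Assumption: "less than 2 months" is approximated as 60 days.
ASSUMED_TRAINING_DAYS = 60

gpu_hours = implied_gpu_hours(TRAINING_COST_USD, RENTAL_USD_PER_H800_HOUR)
gpu_count = implied_gpu_count(gpu_hours, ASSUMED_TRAINING_DAYS)
llama_floor = implied_cost_floor(TRAINING_COST_USD, 0.10)

print(f"Implied H800-hours: {gpu_hours:,.0f}")
print(f"Implied GPUs over {ASSUMED_TRAINING_DAYS} days: {gpu_count:,.0f}")
print(f"Implied Llama training cost: more than ${llama_floor:,.0f}")
```

Under those assumptions, the quoted cost works out to about 2.8 million H800-hours, or a little under 2,000 GPUs running continuously for 60 days. A shorter training run would imply a proportionally larger cluster.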

🚀 Introducing DeepSeek-V3!

Biggest leap forward yet:
⚡ 60 tokens/second (3x faster than V2!)
💪 Enhanced capabilities
🛠 API compatibility intact
🌍 Fully open-source models & papers

🐋 1/n pic.twitter.com/p1dV9gJ2Sd

— DeepSeek (@deepseek_ai) December 26, 2024

Benchmark tests put V3’s performance on par with GPT-4o and Claude 3.5 Sonnet. A December 2024 op-ed in The Hill characterized DeepSeek’s success as America’s “Sputnik Moment.”

DeepSeek released its R1-Lite-Preview model in November 2024, claiming that the new model could outperform OpenAI’s o1 family of reasoning models (and do so at a fraction of the price). The company estimates that the R1 model is between 20 and 50 times less expensive to run, depending on the task, than OpenAI’s o1. DeepSeek subsequently released DeepSeek-R1 and DeepSeek-R1-Zero in January 2025. The R1 model, unlike its o1 rival, is open source, which means that any developer can use it.

As such, V3 and R1 have exploded in popularity since their release, with DeepSeek’s V3-powered AI Assistant displacing ChatGPT at the top of the app stores. Venture capitalist Marc Andreessen, in a recent social media post, called DeepSeek’s chatbot “one of the most amazing and impressive breakthroughs I’ve ever seen” and a “profound gift to the world.”

What can DeepSeek do?

Because they are built on open-source large language models, DeepSeek’s chatbots can do essentially everything that ChatGPT, Gemini, and Claude can. That includes text, audio, image, and video generation. What’s more, DeepSeek’s newly released family of multimodal models, dubbed Janus Pro, reportedly outperforms DALL-E 3, as well as PixArt-alpha, Emu3-Gen, and Stable Diffusion XL, on a pair of industry benchmarks. DeepSeek-R1, which rivals o1, is specifically designed to perform complex reasoning tasks, generating step-by-step solutions and establishing “logical chains of thought” in which it explains its reasoning as it works through a problem.

oh boy #deepseek

— Alexios Mantzarlis (@mantzarlis.com) 2025-01-27T16:50:40.640Z

What DeepSeek’s products can’t do is talk about Tiananmen Square. Or the Yellow Umbrella protests. Or President Xi Jinping’s likeness to Winnie the Pooh. Basically, if it’s a subject considered verboten by the Chinese Communist Party, DeepSeek’s chatbots will not address it or engage with it in any meaningful way.

Who can use DeepSeek?

Andrew Tarantola / DeepSeek / Digital Trends

Because DeepSeek’s models are open source, any developer can use them for free. OpenAI, by contrast, charges $200 per month for the Pro subscription that offers unlimited access to o1. DeepSeek’s models are available on the web, through the company’s API, and via mobile apps. You will need to sign up for a free account at the DeepSeek website in order to use them; however, the company has temporarily paused new sign-ups in response to “large-scale malicious attacks on DeepSeek’s services.” Existing users can sign in and use the platform as normal, but there’s no word yet on when new users will be able to try DeepSeek for themselves.

Why is DeepSeek suddenly such a big deal?

Since the release of ChatGPT in November 2022, American AI companies have been laser-focused on building bigger, more powerful, more expansive, and more power- and resource-intensive large language models. Rather than seek to build more cost-effective and energy-efficient LLMs, companies like OpenAI, Microsoft, Anthropic, and Google instead saw fit to brute force the technology’s advancement by, in the American tradition, throwing absurd amounts of money and resources at the problem. In 2024 alone, xAI CEO Elon Musk was expected to personally spend upwards of $10 billion on AI initiatives. OpenAI and its partners just announced a $500 billion Project Stargate initiative that would drastically accelerate the construction of green energy utilities and AI data centers across the US. Google plans to prioritize scaling the Gemini platform throughout 2025, according to CEO Sundar Pichai, and is expected to spend billions this year in pursuit of that goal. Meta announced in mid-January that it would spend as much as $65 billion this year on AI development.

DeepSeek just showed the world that none of that is actually necessary: the “AI Boom,” which has been helping spur the American economy in recent months and which has made GPU companies like Nvidia far wealthier than they were in October 2023, may be nothing more than a sham. DeepSeek’s success also calls into question just how much of a lead the US actually has in AI, despite the US repeatedly banning shipments of leading-edge GPUs to China over the past year.

“The bottom line is the US outperformance has been driven by tech and the lead that US companies have in AI,” Keith Lerner, an analyst at Truist, told CNN. “The DeepSeek model rollout is leading investors to question the lead that US companies have and how much is being spent and whether that spending will lead to profits (or overspending).”

In short, DeepSeek just beat the American AI industry at its own game, showing that the current mantra of “growth at all costs” is no longer valid. “DeepSeek clearly doesn’t have access to as much compute as U.S. hyperscalers and somehow managed to develop a model that appears highly competitive,” Srini Pajjuri, semiconductor analyst at Raymond James, told CNBC. If a Chinese startup can build an AI model that works just as well as OpenAI’s latest and greatest, and do so in under two months and for less than $6 million, then what use is Sam Altman anymore?

“Time will tell if the DeepSeek threat is real — the race is on as to what technology works and how the big Western players will respond and evolve,” Michael Block, market strategist at Third Seven Capital, told CNN. “Markets had gotten too complacent on the beginning of the Trump 2.0 era and may have been looking for an excuse to pull back — and they got a great one here.”