Chinese AI Lab DeepSeek: What’s the big deal and the origin story?

The Origin Story

DeepSeek has completely exploded onto the global tech scene recently. This Chinese AI lab is backed by High-Flyer Capital Management, a big-shot quantitative hedge fund in China that uses AI to make smart trading decisions. The hedge fund’s co-founder, Liang Wenfeng, an AI enthusiast, got it started.

High-Flyer officially launched in 2019, but it wasn’t until 2023 that they spun off DeepSeek as its own lab, totally focused on AI research outside of the finance world. Right from the jump, DeepSeek built its own data centers to train its models.

You can learn more about their work on the official DeepSeek website. It’s interesting to note that, like other Chinese AI companies, DeepSeek has faced challenges due to U.S. export bans on hardware. For instance, they were forced to use the less powerful Nvidia H800 chips instead of the top-tier H100s available to U.S. competitors.

Also, DeepSeek is known for having a young technical team and aggressively recruiting top doctorate AI researchers from Chinese universities, but they also hire people without computer science backgrounds to make sure their AI understands a wide variety of subjects.

DeepSeek’s Powerful Models

DeepSeek first showed off its models—like DeepSeek Coder and DeepSeek LLM—in November 2023. But the AI world really started paying attention last spring when they released the DeepSeek-V2 family of models. This general-purpose system, which can analyze text and images, didn’t just perform super well in AI tests; it was also much cheaper to run than similar models at the time. This basically forced its competition, like ByteDance and Alibaba, to slash their own model prices.

DeepSeek followed up with DeepSeek V3 in December 2024, which they claim outperforms both open models like Meta’s Llama and closed ones like OpenAI’s GPT-4o. You can check out their model weights and documentation on platforms like Hugging Face, for example, the DeepSeek-V2 model page.

Even more impressive is their R1 “reasoning” model, released in January, which they claim performs as well as OpenAI’s o1. Reasoning models are great because they fact-check themselves, making them more reliable in complex subjects like science and math, even if they take a little longer (seconds to minutes) to get an answer. However, there’s a big catch: since they’re Chinese-developed, their models are scrutinized by China’s internet regulator to ensure they “embody core socialist values.” This means their chatbot, for example, won’t answer certain sensitive political questions.

Despite this, DeepSeek saw over 16.5 million visits in March, though this still pales in comparison to ChatGPT’s over 500 million weekly active users. In May, an updated R1 version was released, and in September, they unveiled an experimental model, V3.2-exp, designed for much lower costs in long-context use.

A Disruptive Approach & Global Response

DeepSeek’s business strategy is a bit mysterious. They price their products and services way below market value – and even give some away for free – claiming this is thanks to efficiency breakthroughs, though some experts question the figures.

Developers can access their models via the DeepSeek API Platform. They’re also reportedly not taking investor money, despite a lot of interest. What’s clear is that developers love their models, which are available under permissive licenses for commercial use.

The CEO of Hugging Face mentioned that developers have created over 500 “derivative” models of R1 with a combined 2.5 million downloads. This success has been described as everything from “upending AI” to “over-hyped.”

DeepSeek’s rise even contributed to a temporary 18% drop in Nvidia’s stock price in January and prompted a public response from OpenAI CEO Sam Altman.

The U.S. government is growing wary of what it sees as harmful foreign influence, leading the U.S. Commerce department to tell staffers that DeepSeek will be banned on government devices. South Korea and New York state have also banned it.

In March, OpenAI called DeepSeek “state-subsidized” and “state-controlled”, recommending the U.S. government ban their models. Microsoft, on the other hand, announced that DeepSeek is available on its Azure AI Foundry service. As for the future, better models are a given, but a global political pushback is a major question mark.

FAQ: DeepSeek at a Glance

Who owns DeepSeek? DeepSeek is an AI lab spun off from High-Flyer Capital Management, a Chinese quantitative hedge fund.
What makes their AI models special? Their models, especially the DeepSeek-V2 and DeepSeek V3 families, offer high performance at a much lower cost compared to competitors. The R1 “reasoning” model is also highly reliable for complex tasks like math and science.
What hardware constraints does DeepSeek face? Due to U.S. export bans, DeepSeek has been forced to use less powerful hardware like the Nvidia H800 chips instead of the top-tier H100s.
Why is DeepSeek controversial? Since it is Chinese-developed, the AI is subject to government review to ensure it aligns with “core socialist values,” leading it to avoid answering certain political questions. This has led to the company being banned on government devices in the U.S., South Korea, and New York state.
Is DeepSeek open source? No, but its models are available under permissive licenses that allow developers to use them for commercial purposes.
What is their business model? It’s unclear. They price their services extremely low or give them away for free, citing efficiency, and are reportedly not taking VC funding.
What are some key usage statistics? DeepSeek had over 16.5 million visits in March. Developers have created over 500 derivative models of R1 with 2.5 million combined downloads on Hugging Face.

The Origin Story

DeepSeek’s Powerful Models

A Disruptive Approach & Global Response

FAQ: DeepSeek at a Glance

Databricks’ secret sauce: Making Frontier AI 90x cheaper overnight

Inside the Mega-Yachts: A glimpse into the $100 million dollar floating palaces