A Chinese AI laboratory, DeepSeek, suddenly entered the global spotlight this week, with its chatbot application topping the Apple App Store and Google Play download charts. The company's AI models, trained using computationally efficient techniques, have led Wall Street analysts and the tech industry to question whether the United States can maintain its lead in AI and whether demand for AI chips is sustainable.
Behind DeepSeek is High-Flyer Capital Management, a Chinese quantitative hedge fund that uses AI to inform its trading decisions. It was founded in 2015 by Liang Wenfeng, an AI enthusiast. Liang reportedly began trading while studying at Zhejiang University, and in 2019 he transformed High-Flyer into a hedge fund focused on developing and deploying AI algorithms.
In 2023, High-Flyer launched the DeepSeek project as a research laboratory for AI tools, separate from its financial business. With High-Flyer as an investor, the laboratory was later spun off into an independent company, DeepSeek.
From its inception, DeepSeek has built its own data center clusters for model training. However, like other Chinese AI companies, DeepSeek has been affected by U.S. hardware export restrictions. To train its latest models, the company had to use NVIDIA H800 chips, a less powerful version of the H100 chips available to U.S. companies.
It is understood that DeepSeek's technical team is relatively young, and the company actively recruits AI PhD researchers from top Chinese universities. According to The New York Times, DeepSeek also employs people without a computer science background to help the technical team better understand a wide range of academic fields.
DeepSeek released its first set of models, DeepSeek Coder, DeepSeek LLM, and DeepSeek Chat, in November 2023. It was not until the release of the new DeepSeek-V2 series models in the spring of last year that the AI industry began to take the company seriously.
DeepSeek-V2 is a general-purpose text and image analysis system that performs well in multiple AI benchmark tests and runs at a much lower cost than similar models at the time. This forced domestic competitors such as ByteDance and Alibaba to cut the prices of some models or even make certain models completely free.
The release of DeepSeek-V3 in December 2024 further raised the company's profile. According to DeepSeek's internal benchmark tests, DeepSeek-V3 outperforms both open-source models like Meta's Llama and closed models that are accessible only via API, like OpenAI's GPT-4o.
Equally impressive is DeepSeek's R1 reasoning model. The model was released in January of this year, and DeepSeek claims it performs comparably to OpenAI's o1 model on key benchmark tests.
As a reasoning model, R1 effectively checks its own work, which helps it avoid some common errors. Reasoning models usually take several seconds to minutes to arrive at a solution, but in exchange they tend to be more reliable in areas such as physics, science, and mathematics.
However, DeepSeek's models also have limitations. Because they were developed in China, the models must undergo baseline testing by Chinese internet regulatory authorities to ensure their responses "reflect socialist core values." In DeepSeek's chat application, R1 will not answer sensitive questions about Tiananmen Square or Taiwan's autonomy.
DeepSeek's traffic exceeded 16.5 million visits in March. Similarweb editor David Carr told TechCrunch: "DeepSeek ranked second in March, although daily visits dropped by 25% compared to February." But this is still far below ChatGPT, which had over 500 million weekly active users in March.
In May, DeepSeek released an updated version of the R1 reasoning model on the developer platform Hugging Face. In September, the company launched an experimental model called V3.2-exp, aimed at significantly reducing inference costs in long-context operations.
If DeepSeek has a business model, it is currently unclear what it is. The company prices its products and services far below market rates and provides some of them free of charge. Despite interest from venture capital firms, the company has not accepted outside investment.
DeepSeek claims that its efficiency breakthroughs allow it to maintain extreme cost competitiveness, but some experts remain skeptical about the data provided by the company.
Regardless, developers have widely adopted DeepSeek's models. The models are not open source in the traditional sense, but they are released under a permissive license that allows commercial use. Clem Delangue, CEO of Hugging Face, said that developers on the platform have created over 500 R1 derivative models with a total of 2.5 million downloads.
DeepSeek's success against larger and more mature competitors has been described both as "disrupting the AI industry" and as "overhyped." The company's success was at least partly responsible for an 18% drop in NVIDIA's stock price in January, and it prompted a public response from OpenAI's CEO, Sam Altman. In March, according to Reuters, the U.S. Department of Commerce notified its employees that DeepSeek would be banned on government devices.
Microsoft announced that it would offer DeepSeek on its Azure AI Foundry service. Asked during the first-quarter earnings call about DeepSeek's impact on Meta's AI spending, CEO Mark Zuckerberg said that AI infrastructure spending will continue to be a "strategic advantage" for Meta. In March, OpenAI claimed that DeepSeek was "state-subsidized" and "state-controlled," and advised the U.S. government to consider banning DeepSeek models.
During NVIDIA's fourth-quarter earnings call, CEO Jensen Huang emphasized DeepSeek's "excellent innovation," saying that DeepSeek and other reasoning models benefit NVIDIA because they require more computing resources.
Meanwhile, some companies and countries, including South Korea, are banning DeepSeek. New York State has also banned the use of DeepSeek on government devices. In May, Microsoft's vice chairman and president, Brad Smith, stated at a Senate hearing that Microsoft employees are not allowed to use DeepSeek due to concerns about data security and propaganda.
DeepSeek's future direction remains unclear. Improved models are all but certain, but the U.S. government seems increasingly wary of what it perceives as harmful foreign influence. In March, The Wall Street Journal reported that the U.S. might ban DeepSeek on government devices.