DeepSeek: The Rise of a Chinese AI Powerhouse

DeepSeek has risen rapidly from relative obscurity to become a focal point of the global AI conversation, sparking intense debate and speculation across the technology and financial sectors. The Chinese AI lab has disrupted the established order, prompting analysts to question whether U.S. dominance in the AI race is sustainable and whether current demand for AI chips will last. But what are the key factors that have propelled DeepSeek to its current prominence?

The Genesis of DeepSeek: From Hedge Fund to AI Lab

DeepSeek’s origins are deeply intertwined with the world of quantitative finance. It is backed by High-Flyer Capital Management, a Chinese quantitative hedge fund known for using AI to inform its trading decisions. This unusual origin story sets DeepSeek apart in an AI landscape dominated by technology companies. The hedge fund’s reliance on AI for trading strategies created a natural breeding ground for the development of more advanced AI capabilities.

Liang Wenfeng, an AI enthusiast who began trading while a student at Zhejiang University, co-founded High-Flyer in 2015. In 2019, he launched High-Flyer Capital Management as a hedge fund focused on developing and deploying AI algorithms for financial applications. This focus on AI from the outset reflects a clear commitment to applying artificial intelligence in finance. The early adoption of AI helped High-Flyer gain a competitive edge and accumulate the resources needed to invest in more ambitious AI projects.

In 2023, High-Flyer incubated DeepSeek as a dedicated AI research lab, operating independently from its core financial business. Subsequently, with High-Flyer as a key investor, the lab was spun off into a separate entity, retaining the name DeepSeek. This strategic separation allowed DeepSeek to focus entirely on AI research and development, free from the constraints and priorities of the financial industry. The spin-off also provided DeepSeek with the flexibility to attract talent and pursue opportunities that might not have been available within a hedge fund structure.

From its inception, DeepSeek prioritized the establishment of its own data center clusters to facilitate model training. This early investment in infrastructure underscores the importance of computational power in AI development. Building its own data centers allowed DeepSeek to maintain control over its resources and optimize its infrastructure for the specific needs of its AI models. This approach may have given DeepSeek an advantage over competitors that rely on external cloud providers.

However, similar to other AI companies operating in China, DeepSeek has encountered challenges due to U.S. export restrictions on advanced hardware. This geopolitical reality has forced DeepSeek to adapt and find creative solutions to access the necessary computing power. The restrictions highlight the complex interplay between technology, trade, and national security in the AI era.

Consequently, to train its more recent models, the company had to resort to using Nvidia H800 chips, a less powerful variant of the H100 chips that are readily available to U.S. companies. This limitation has not prevented DeepSeek from making significant progress, but it underscores the potential impact of export controls on the pace of AI development in China. The ability to innovate and achieve impressive results despite these limitations demonstrates the resilience and ingenuity of the DeepSeek team.

DeepSeek’s technical team is known for its youthfulness and dynamism. The company actively recruits doctoral AI researchers from leading Chinese universities. This focus on attracting top talent is crucial for driving innovation and maintaining a competitive edge in the rapidly evolving AI landscape. The influx of young, ambitious researchers brings fresh perspectives and cutting-edge expertise to the company.

Furthermore, DeepSeek employs individuals from diverse backgrounds, even those without computer science expertise, to ensure that its technology can effectively understand and cater to a broad range of subjects, as reported by The New York Times. This emphasis on diversity and interdisciplinary collaboration is essential for developing AI models that are truly useful and relevant to a wide range of users. By incorporating perspectives from different fields, DeepSeek can avoid biases and create AI systems that are more inclusive and adaptable.

DeepSeek’s AI Models: Challenging the Status Quo

DeepSeek unveiled its initial suite of models – DeepSeek Coder, DeepSeek LLM, and DeepSeek Chat – in November 2023. These initial models laid the foundation for DeepSeek’s future success and demonstrated the company’s commitment to developing a comprehensive range of AI capabilities. While these models were not as groundbreaking as DeepSeek’s later releases, they served as valuable learning experiences and helped the company refine its development process.

However, it was the release of its next-generation DeepSeek-V2 family of models in spring 2024 that truly captured the attention of the AI industry. DeepSeek-V2 represented a significant leap forward in performance and capabilities, showcasing the company’s rapid progress in AI research and development. The release marked a turning point for the company, establishing it as a serious contender in the global AI arena.

DeepSeek-V2, a versatile system capable of analyzing both text and images, demonstrated impressive performance across various AI benchmarks. This multimodal capability is becoming increasingly important in AI, as it allows models to understand and interact with the world in a more comprehensive way. The ability to process both text and images opens up new possibilities for AI applications in areas such as image recognition, video analysis, and natural language understanding.

Notably, it achieved this performance at a significantly lower cost compared to competing models available at the time. This cost-effectiveness is a key differentiator for DeepSeek, making its models more accessible to a wider range of users and organizations. The lower cost is likely due to a combination of factors, including efficient model architecture, optimized training techniques, and access to affordable computing resources.

This prompted DeepSeek’s domestic rivals, including ByteDance and Alibaba, to reduce the prices of some of their models and offer others completely free. This competitive response highlights the disruptive impact of DeepSeek on the Chinese AI market. The pressure to lower prices and offer free services demonstrates the intense competition in the AI space and the importance of staying ahead of the curve.

DeepSeek V3, released in December 2024, raised the company’s profile further. According to DeepSeek’s own benchmark testing, DeepSeek V3 outperforms both downloadable, openly available models like Meta’s Llama and “closed” models accessible only through APIs, such as OpenAI’s GPT-4o. If those results hold up, outperforming both kinds of model is a testament to the quality of DeepSeek’s research and development efforts.

Equally noteworthy is DeepSeek’s R1 “reasoning” model. DeepSeek asserts that R1, launched in January 2025, achieves performance comparable to OpenAI’s o1 model on key benchmarks. The development of a strong reasoning model is a significant step forward in AI, as it allows models to solve more complex problems and make more informed decisions.

As a reasoning model, R1 incorporates self-checking mechanisms, mitigating some of the common pitfalls associated with standard models. This self-checking capability helps to improve the accuracy and reliability of the model’s outputs. By identifying and correcting errors, R1 can produce more consistent and trustworthy results.

Reasoning models typically take longer to arrive at solutions, anywhere from seconds to minutes, but they tend to be more reliable in domains such as physics, science, and mathematics. This trade-off between speed and accuracy is often necessary in complex reasoning tasks. The increased reliability makes reasoning models particularly valuable in applications where accuracy is paramount.

However, DeepSeek’s models, including R1 and DeepSeek V3, are subject to oversight by China’s internet regulator, which ensures that their responses align with “core socialist values.” This regulatory oversight is a unique aspect of the Chinese AI landscape and has implications for the content and functionality of AI models. The need to comply with government regulations can potentially limit the range of topics that AI models can address and the types of opinions they can express.

For instance, in DeepSeek’s chatbot app, R1 will not address questions pertaining to Tiananmen Square or Taiwan’s autonomy. This censorship highlights the challenges of developing AI in a highly regulated environment. The limitations on content can potentially impact the usefulness and appeal of AI models to users who are interested in exploring sensitive or controversial topics.

In March, DeepSeek’s website traffic exceeded 16.5 million visits. This impressive level of traffic indicates a growing interest in DeepSeek’s products and services. The website serves as a key platform for showcasing the company’s capabilities and attracting new users.

Despite a 25% decrease in traffic compared to February, DeepSeek ranked second in terms of daily visits, according to David Carr, editor at Similarweb. This ranking demonstrates DeepSeek’s strong position in the AI market, even in comparison to other major players. The decrease in traffic may be due to seasonal factors or increased competition from other AI providers.
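As a rough back-of-the-envelope check, the two traffic figures can be combined to estimate February’s traffic. This is only a sketch: it assumes that the 16.5 million March visits and the 25% decline refer to the same Similarweb metric, which the reporting does not state explicitly, and the function name is purely illustrative.

```python
# Assumption: the March figure and the 25% decline describe the same
# monthly-visit metric. The source does not confirm this.
march_visits = 16.5e6   # "exceeded 16.5 million visits", so a lower bound
decline = 0.25          # 25% decrease compared to February


def implied_previous(current, fractional_drop):
    """Return the earlier value that, after the given fractional drop,
    yields the current value."""
    return current / (1.0 - fractional_drop)


february_estimate = implied_previous(march_visits, decline)
print(f"{february_estimate / 1e6:.1f} million")  # prints 22.0 million
```

Under those assumptions, February traffic would have been roughly 22 million visits.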

However, this figure still pales in comparison to ChatGPT, which surpassed 500 million weekly active users in March. This comparison highlights the significant gap in market share between DeepSeek and the leading AI chatbot. While DeepSeek has made impressive progress, it still has a long way to go to catch up with the established giants of the AI industry.

A Disruptive Approach to the AI Landscape

DeepSeek’s business model remains somewhat enigmatic. The company prices its products and services significantly below market value, and even offers some for free. This aggressive pricing strategy is designed to attract users and gain market share quickly. The willingness to offer free services demonstrates a commitment to democratizing access to AI technology.

Furthermore, it has resisted external funding despite substantial interest from venture capital firms. This unusual decision suggests that DeepSeek is prioritizing long-term growth and independence over short-term financial gains. The ability to resist external funding also allows DeepSeek to maintain control over its strategic direction and avoid the pressures of meeting investor expectations.

DeepSeek attributes its extreme cost competitiveness to breakthroughs in efficiency. This explanation suggests that DeepSeek has made significant advancements in optimizing its AI models and infrastructure. The focus on efficiency is likely driven by the need to compete in a market where resources are constrained.

However, some experts have questioned the accuracy of the figures provided by the company. These concerns highlight the challenges of verifying the claims made by AI companies and the need for greater transparency in the industry. The skepticism surrounding DeepSeek’s cost competitiveness may be due to a lack of detailed information about its operations and financial performance.

Regardless, developers have embraced DeepSeek’s models, which, while not open source in the traditional sense, are available under permissive licenses that allow for commercial use. This accessibility has contributed to the rapid adoption of DeepSeek’s models by developers around the world. The permissive licenses make it easier for developers to integrate DeepSeek’s technology into their own projects and create new applications.

According to Clem Delangue, CEO of Hugging Face, developers on the platform have created over 500 derivative models of R1, accumulating a combined total of 2.5 million downloads. This widespread adoption is a testament to the popularity and usefulness of DeepSeek’s models. The large number of derivative models demonstrates the creative potential of AI technology and the willingness of developers to build upon existing foundations.

DeepSeek’s success against larger, more established competitors has been described as both “upending AI” and “over-hyped.” These contrasting views reflect the ongoing debate about the true impact and potential of DeepSeek. While some believe that DeepSeek is a game-changer that will revolutionize the AI industry, others are more skeptical and argue that its achievements have been exaggerated.

The company’s achievements were partly responsible for an 18% drop in Nvidia’s stock price in January, and prompted a public response from OpenAI CEO Sam Altman. These reactions highlight the significant attention that DeepSeek has garnered within the AI community. The impact on Nvidia’s stock price demonstrates the potential for DeepSeek to disrupt the established order in the AI hardware market.

In March, Reuters reported that U.S. Commerce Department bureaus had banned DeepSeek on government devices. This ban reflects concerns about data security and potential national security risks. The decision underscores the growing awareness of the potential threats associated with AI technology.

Microsoft has integrated DeepSeek into its Azure AI Foundry service, a platform that consolidates AI services for enterprises. This partnership with Microsoft is a major validation of DeepSeek’s technology and provides the company with access to a vast network of customers. The integration into Azure AI Foundry allows enterprises to easily access and utilize DeepSeek’s models for a variety of applications.

During Meta’s first-quarter earnings call, CEO Mark Zuckerberg stated that investments in AI infrastructure would continue to be a “strategic advantage” for the company, when asked about DeepSeek’s potential impact on Meta’s AI spending. This statement suggests that Meta is taking DeepSeek seriously as a competitor and is prepared to invest heavily in AI to maintain its competitive edge. The recognition from a major player like Meta further solidifies DeepSeek’s position in the AI landscape.

In March, OpenAI labeled DeepSeek as “state-subsidized” and “state-controlled,” recommending that the U.S. government consider banning its models. This accusation highlights the geopolitical tensions surrounding AI technology. The call for a ban reflects growing concern about the risks posed by AI models that may be subject to foreign state control.

During Nvidia’s fourth-quarter earnings call, CEO Jensen Huang highlighted DeepSeek’s “excellent innovation,” noting that its reasoning models require significantly more computing power, benefiting Nvidia. This acknowledgment from Nvidia’s CEO suggests that DeepSeek’s technology could drive demand for advanced AI hardware rather than reduce it. If reasoning models do need more computing power to run, that points to a growing market for Nvidia’s products.

Conversely, some companies and governments, including South Korea and New York state, have restricted or banned the use of DeepSeek, particularly on government devices. These bans reflect concerns about data security, privacy, and potential national security risks. The growing number of bans highlights the challenges that DeepSeek faces in gaining acceptance and trust in the global market.

In May, Microsoft Vice Chairman and President Brad Smith testified before the Senate that Microsoft employees are prohibited from using DeepSeek due to concerns about data security and potential propaganda. This testimony further underscores the concerns about the potential risks associated with DeepSeek’s technology. The decision by Microsoft to prohibit its employees from using DeepSeek reflects a cautious approach to managing data security and national security risks.

The Uncertain Future of DeepSeek

The future trajectory of DeepSeek remains uncertain. While further model improvements are anticipated, the U.S. government appears increasingly wary of perceived harmful foreign influence. The geopolitical tensions surrounding AI technology are likely to continue to shape the future of DeepSeek.

In March, The Wall Street Journal reported that the U.S. was likely to ban DeepSeek on government devices. Such a ban would further restrict DeepSeek’s access to the U.S. market and limit its ability to compete with domestic AI providers. The increasing scrutiny from the U.S. government poses a significant challenge to DeepSeek’s long-term growth prospects.

DeepSeek’s rapid ascent has undeniably shaken the foundations of the AI industry, prompting a reassessment of competitive dynamics and the potential for disruptive innovation. The company’s success has forced established players to rethink their strategies and adapt to the changing landscape. DeepSeek has demonstrated that new players can emerge quickly and challenge the dominance of established companies in the AI market.

Whether it can sustain its current momentum in the face of increasing scrutiny and regulatory challenges remains to be seen. The coming years will be pivotal in determining DeepSeek’s long-term impact on the global AI landscape. The company’s ability to navigate the complex interplay of technological advancement, geopolitical considerations, and ethical concerns will ultimately define its legacy.

The AI world will be watching closely. DeepSeek’s journey is a fascinating case study in the dynamics of the AI industry. The company’s success, challenges, and future prospects will be closely monitored by researchers, investors, and policymakers around the world.

The DeepSeek story is a reminder of how quickly the established order in artificial intelligence can be challenged. The company’s success, driven by innovative technology and a willingness to disrupt traditional business models, has forced the industry to take notice. As DeepSeek continues to develop and expand its reach, it is likely to play a significant role in shaping the future of AI.

updated at 2025-05-10
