DeepSeek Challenges ChatGPT and Google

The artificial intelligence landscape is witnessing intense competition, with Chinese AI startup DeepSeek rapidly emerging as a significant player. The company’s recent DeepSeek-R1-0528 update demonstrates its impressive capabilities, posing a serious challenge to competitors like OpenAI’s GPT-4o and Google’s Gemini.

Significant Performance Improvements

DeepSeek-R1-0528 has achieved notable performance improvements in complex reasoning, coding, and logic, areas that challenge even the most advanced models. The release gives fresh impetus to the field.

What sets DeepSeek apart is not only its technological advancements but also its open-source model and emphasis on lightweight training. These factors have combined to give DeepSeek an edge in speed and efficiency.

Leap in Benchmarking

In recent benchmarks, DeepSeek-R1-0528 achieved an accuracy of 87.5% in the AIME 2025 test, a significant improvement over the previous model’s 70%. In addition, its performance in the LiveCodeBench coding benchmark improved from 63.5% to 73.3%. Even more impressively, DeepSeek’s performance more than doubled in the notoriously difficult “Humanity’s Last Exam,” jumping from 8.5% to 17.7%.

These benchmark results strongly suggest that DeepSeek’s models can rival or even surpass Western competitors in specific areas.

An Open-Source Model That Is Easy to Build On

Unlike OpenAI and Google, DeepSeek has chosen an open path. R1-0528 is released under the MIT license, giving developers the freedom to use, modify, and deploy the model. This open stance has undoubtedly won DeepSeek wider support.

The update also adds support for JSON output and function calling, making it straightforward to build applications and tools that interact directly with the model.
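To make this concrete, the sketch below builds request payloads for JSON-mode output and function calling, assuming an OpenAI-compatible chat-completions format. The model identifier and the `get_weather` tool are illustrative placeholders, not taken from official DeepSeek documentation.

```python
import json

# Sketch: a chat-completions payload that requests strict JSON output.
# Field names assume an OpenAI-compatible API; the model name is hypothetical.
def build_json_mode_request(prompt: str) -> dict:
    return {
        "model": "deepseek-reasoner",  # placeholder model identifier
        "messages": [{"role": "user", "content": prompt}],
        "response_format": {"type": "json_object"},  # ask for JSON-only output
    }

# Sketch: a function-calling payload declaring a tool the model may invoke.
def build_function_call_request(prompt: str) -> dict:
    return {
        "model": "deepseek-reasoner",
        "messages": [{"role": "user", "content": prompt}],
        "tools": [{
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool for illustration
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }],
    }

payload = build_json_mode_request("List three prime numbers as JSON.")
print(json.dumps(payload, indent=2))
```

With JSON mode, the model's reply can be parsed directly with `json.loads`; with the tool declared, the model can respond with a structured call to `get_weather` instead of free text.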

This open model not only attracts researchers and developers but also makes DeepSeek an ideal choice for startups and businesses seeking alternatives to closed platforms.

Smarter, Not Harder, Training

One of the most impressive aspects of DeepSeek’s rise is the efficient way in which it builds its models. According to the company, the earlier version was trained in just 55 days on approximately 2,000 GPUs at a cost of $5.58 million, just a fraction of the training costs of comparable U.S. models.

This focus on resource-efficient training is a key differentiator, especially as the cost and carbon footprint of large language models continue to attract attention.

What It Means for the Future of AI

DeepSeek’s latest release is a sign of the dynamic changes in the AI world. With strong reasoning capabilities, transparent licensing, and faster development cycles, DeepSeek is positioning itself as a strong competitor to industry giants.

As the global AI landscape becomes more multipolar, models like R1-0528 may play an important role in shaping what AI can do, and who builds, controls, and benefits from it.

Deep Dive into DeepSeek R1-0528: Technical Details and Innovation

The success of DeepSeek R1-0528 is no accident. Behind it lies the DeepSeek team’s sustained technical innovation and meticulous attention to detail. To understand the threat it poses to ChatGPT and Google, it helps to examine its technical details and innovations.

Architecture Optimization and Improvements

DeepSeek R1-0528 has undergone substantial architectural optimization, yielding clear gains in performance and efficiency. The model adopts a variant of the Transformer architecture with customized adjustments for specific tasks.

Innovation in Attention Mechanisms: DeepSeek R1-0528 uses a more efficient attention mechanism that reduces computational complexity and increases reasoning speed, while also capturing long-range dependencies better, improving the model’s ability to process complex text.
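The exact attention design in R1-0528 is not public; for reference, the snippet below implements the baseline scaled dot-product attention that efficient variants aim to approximate at lower cost. It is a minimal NumPy sketch, not DeepSeek's actual mechanism.

```python
import numpy as np

# Baseline scaled dot-product attention: each query attends to all keys,
# producing a weighted sum of values. Efficient attention variants reduce
# the quadratic cost of the (n_q x n_k) score matrix computed here.
def scaled_dot_product_attention(q, k, v):
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)               # (n_q, n_k) similarities
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ v                             # weighted sum of values

rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))   # 4 query positions, dimension 8
k = rng.normal(size=(6, 8))   # 6 key positions
v = rng.normal(size=(6, 8))   # 6 value vectors
out = scaled_dot_product_attention(q, k, v)
print(out.shape)  # (4, 8)
```

The score matrix grows with the product of sequence lengths, which is exactly the cost that long-context optimizations target.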

Streamlining Model Size: Although DeepSeek R1-0528 outperforms many larger models, it is comparatively compact. This is the result of the team’s work on model compression and knowledge distillation, which cut the model’s storage and compute costs without sacrificing performance.
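Knowledge distillation, in its standard form, trains a small student model to match a large teacher's temperature-softened output distribution. The sketch below shows that core loss; DeepSeek's actual compression pipeline is not public, so treat this as a generic illustration.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Standard distillation loss: KL divergence between the teacher's and the
# student's temperature-softened distributions. Softening with T > 1 exposes
# the teacher's "dark knowledge" about relative class similarities.
def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    p = softmax(teacher_logits / temperature)
    q = softmax(student_logits / temperature)
    return float(np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12))))

teacher = np.array([[4.0, 1.0, 0.5]])
aligned = np.array([[4.0, 1.0, 0.5]])   # student matches the teacher exactly
diverged = np.array([[0.5, 4.0, 1.0]])  # student disagrees with the teacher
print(distillation_loss(aligned, teacher))   # 0.0
print(distillation_loss(diverged, teacher))  # strictly positive
```

Minimizing this loss on unlabeled data lets a compact student inherit much of the teacher's behavior at a fraction of the inference cost.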

Dataset Construction and Processing

High-quality data is the cornerstone of training strong artificial intelligence models. DeepSeek has invested heavily in dataset construction and processing to ensure the model learns useful knowledge from rich, diverse data.

Multilingual Dataset: To improve versatility and cross-language capability, DeepSeek R1-0528 is trained on a multilingual dataset spanning many languages and domains, enabling the model to better understand and generate text across languages.

Data Cleaning and Augmentation: The DeepSeek team rigorously cleaned and filtered the raw data to remove noise and errors, and applied data augmentation techniques to expand the dataset and improve the model’s generalization.
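A toy version of that clean-then-augment pipeline is sketched below. The filters and the synonym table are invented for the example; production pipelines use far more elaborate heuristics and learned quality classifiers.

```python
import re

# Toy cleaning pass: normalize whitespace, drop near-empty fragments, and
# drop lines that are mostly non-alphabetic (markup or encoding noise).
def clean(docs):
    out = []
    for d in docs:
        d = re.sub(r"\s+", " ", d).strip()
        if len(d) < 10:
            continue
        if sum(c.isalpha() for c in d) / len(d) < 0.5:
            continue
        out.append(d)
    return out

# Toy augmentation pass: synonym replacement, each rewrite becoming a new
# training sample (the synonym table here is purely illustrative).
def augment(docs, synonyms):
    extra = []
    for d in docs:
        for word, sub in synonyms.items():
            if word in d:
                extra.append(d.replace(word, sub))
    return docs + extra

raw = ["  A quick   brown fox jumps over the lazy dog.  ",
       "@@##$$%%^^&&**(())",   # noise: rejected by the alphabetic filter
       "ok"]                   # too short: rejected by the length filter
cleaned = clean(raw)
data = augment(cleaned, {"quick": "fast"})
print(len(cleaned), len(data))  # 1 2
```

The same two-stage shape (filter aggressively, then expand what survives) scales up to web-corpus pipelines.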

Optimization and Adjustment of Training Strategies

Training strategy is crucial to model performance. DeepSeek experimented extensively with training strategies before settling on a scheme suited to DeepSeek R1-0528.

Distributed Training: To accelerate training, DeepSeek R1-0528 uses distributed training, parallelizing the workload across many GPUs and greatly shortening training time.
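The core idea of data-parallel training can be shown in miniature: each simulated "GPU" computes the gradient on its shard of the batch, and the shard gradients are averaged before the weight update. This pure-NumPy sketch stands in for what frameworks such as PyTorch DDP do across real devices.

```python
import numpy as np

# Gradient of mean squared error 0.5 * mean((Xw - y)^2) w.r.t. w,
# for a simple linear model standing in for a neural network.
def mse_grad(w, X, y):
    return X.T @ (X @ w - y) / len(y)

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))   # a batch of 8 samples, 3 features
y = rng.normal(size=8)
w = np.zeros(3)

# Split the batch across 4 simulated workers and average their gradients,
# mimicking the all-reduce step of data-parallel training.
shards = np.array_split(np.arange(8), 4)
avg_grad = np.mean([mse_grad(w, X[i], y[i]) for i in shards], axis=0)

# With equal shard sizes, the averaged gradient matches the gradient a
# single device would compute on the full batch.
full_grad = mse_grad(w, X, y)
print(np.allclose(avg_grad, full_grad))  # True
```

Because the averaged result equals the full-batch gradient, adding workers speeds up each step without changing the optimization trajectory.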

Learning Rate Adjustment: The learning rate is one of the key parameters affecting training quality. The DeepSeek team dynamically adjusted it over the course of training to obtain better results.
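The article does not say which schedule DeepSeek used; a common form of dynamic adjustment, shown below, is linear warmup followed by cosine decay. The specific hyperparameter values are illustrative.

```python
import math

# Linear warmup to a peak learning rate, then cosine decay toward a floor.
# All hyperparameters here are illustrative defaults, not DeepSeek's.
def lr_at(step, total_steps, peak_lr=3e-4, warmup_steps=100, min_lr=3e-5):
    if step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps  # linear warmup
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    cosine = 0.5 * (1 + math.cos(math.pi * progress))  # decays 1 -> 0
    return min_lr + (peak_lr - min_lr) * cosine

total = 1000
print(lr_at(0, total))          # small initial rate during warmup
print(lr_at(99, total))         # peak rate at the end of warmup
print(lr_at(total - 1, total))  # near min_lr at the end of training
```

Warmup avoids unstable early updates while the optimizer statistics settle; the cosine tail lets the model converge to a sharper minimum.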

DeepSeek’s Open Source Strategy: An Engine for Accelerating AI Development

DeepSeek’s decision to open-source its models is not merely a bid for the attention of developers and researchers; it is a strategic choice. Open source can accelerate the development of artificial intelligence while bringing DeepSeek many benefits.

Promoting Technological Innovation

Open source draws developers and researchers from around the world into improving and optimizing the models. This collective effort accelerates technological innovation and advances the field.

Building an Ecosystem

Through open source, DeepSeek can build a broad ecosystem, attracting developers and companies to build applications and services on its models. This expands DeepSeek’s influence and creates business opportunities.

Reducing Development Costs

Open source lowers development costs and avoids duplicated effort. Developers can use DeepSeek’s models directly rather than building from scratch, saving substantial time and resources.

DeepSeek’s Challenges and Opportunities

Although DeepSeek has achieved a great deal, its path in artificial intelligence is not without obstacles. The company faces significant challenges alongside enormous opportunities.

Challenges

Financial Pressure: Researching and training AI models demands heavy capital investment, and as a startup, DeepSeek faces considerable financial pressure.

Talent Competition: Competition for AI talent is fierce; DeepSeek must attract and retain top people to maintain its technological lead.

Technical Risks: AI technology evolves rapidly, and DeepSeek must keep innovating to manage emerging technical risks.

Opportunities

Market Demand: As AI technology spreads, demand for AI models keeps growing, presenting DeepSeek with a large market opportunity.

Policy Support: Governments worldwide treat AI development as a priority and have introduced a range of supportive policies from which DeepSeek can benefit.

Technical Advantages: DeepSeek holds real technical advantages, especially in open source and efficient training, laying a solid foundation for its future development.

Comparison of DeepSeek R1-0528 with Other Large Language Models

The table below provides a more detailed comparison of DeepSeek R1-0528 with OpenAI’s GPT-4o and Google’s Gemini 1.5 Pro on various benchmarks, as well as some key technical specifications.

| Feature/Benchmark | DeepSeek R1-0528 | OpenAI GPT-4o | Google Gemini 1.5 Pro |
| --- | --- | --- | --- |
| Benchmarks | | | |
| AIME 2025 | 87.5% | Unknown | Unknown |
| LiveCodeBench | 73.3% | Unknown | Unknown |
| Humanity’s Last Exam | 17.7% | Unknown | Unknown |
| MMLU | High | High | High |
| Technical Specifications | | | |
| Open Source License | MIT | Closed source | Closed source |
| JSON Output/Function Calling Support | Yes | Yes | Yes |
| Training Time | 55 days | Unknown | Unknown |
| Training Cost | $5.58 million | Unknown | Unknown |
| GPU Count | Approx. 2,000 | Unknown | Unknown |
| Advantages | Open source, efficient training | Leading multimodal capabilities | Strong integration and ecosystem |
| Disadvantages | Relatively new entrant | Closed source, high cost | Closed source, potential price pressure |

Influence of DeepSeek on the Future of AI Field

DeepSeek’s rise will have a profound impact on the future of the AI field. Below are some key predictions:

  • Popularization of Open Source AI Models: DeepSeek’s success may prompt more companies to choose the open source route, accelerating technological innovation and decentralization.
  • Formation of a Multipolar AI Landscape: DeepSeek’s emergence challenges the dominance of U.S. companies in AI and pushes global AI power toward greater balance.
  • More Efficient Training Methods: DeepSeek’s focus on resource efficiency may drive the AI industry to develop more efficient and environmentally friendly training methods.
  • Democratization of AI Technology: Through open source and lower costs, DeepSeek is making AI technology more accessible to developers and businesses, thereby promoting innovation and applications.

Code Examples for DeepSeek R1-0528

Below are some code examples for using DeepSeek R1-0528, showcasing its application in different scenarios.

Python Code Example: Using DeepSeek R1-0528 for text generation
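A minimal sketch of such a text-generation call is shown below, assuming an OpenAI-compatible chat-completions API. The endpoint URL, model name, and API key are placeholders; consult DeepSeek’s official documentation for the real values before running `generate`.

```python
import json
import urllib.request

API_URL = "https://api.deepseek.com/chat/completions"  # assumed endpoint
API_KEY = "YOUR_API_KEY"  # placeholder; set your real key before calling

# Build a chat-completions payload; the model identifier is a placeholder
# standing in for R1-0528, not confirmed against official documentation.
def build_request(prompt: str, temperature: float = 0.7) -> dict:
    return {
        "model": "deepseek-reasoner",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
        "max_tokens": 512,
    }

# Send the request and extract the generated text from the response.
def generate(prompt: str) -> str:
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_request(prompt)).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

payload = build_request("Write a haiku about open-source AI.")
print(payload["model"], payload["max_tokens"])
# generate("Write a haiku about open-source AI.") performs the actual
# network call once API_KEY is set to a valid key.
```

Lower `temperature` values make the output more deterministic; higher values make it more varied, which matters more for creative prompts than for code or reasoning tasks.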