DeepSeek AI: Fewer Chips, More Sustainability?

The carbon footprint of AI tools has been a growing concern, and DeepSeek AI’s claim that its models are more efficient than competing ones has made waves in the industry. A recent study by Greenly, a French sustainable software company, sought to validate DeepSeek’s claims.

Greenly’s research suggests DeepSeek’s models require less training time and fewer Nvidia chips. In a comparison of training runs, DeepSeek’s V3 model consumed 2.78 million graphics processing unit (GPU) hours, while Meta’s Llama 3.1 model consumed 30.8 million GPU hours, roughly eleven times as many. Because training is often the most carbon-intensive phase of an AI model’s life cycle, DeepSeek’s shorter training runs contribute to its efficiency. Furthermore, DeepSeek trained on about 2,000 Nvidia chips, whereas Meta’s model used over 16,000 and ChatGPT used over 25,000; DeepSeek’s chips also have a lower "energy intensity" than those used by ChatGPT.

Greenly’s research stated: "Due to US-imposed sanctions limiting DeepSeek’s access to Nvidia’s most advanced AI chips, the company has had to develop these innovative techniques. This restriction forces DeepSeek to design models that maximize efficiency, rather than relying on massive computing power."

DeepSeek’s Technological Innovation: Mixture-of-Experts

A central element of DeepSeek’s architecture is its mixture-of-experts design, which enables the tool to delegate user tasks to sub-models, "activating only the computing power needed for a given request." The approach is analogous to a large team in which each member is an expert in a specific area: when a new problem arises, the team leader assigns it to the most qualified expert rather than involving the entire team.

In DeepSeek’s mixture-of-experts model, a large AI model is broken down into smaller, more specialized sub-models, each trained to excel at a particular type of task. For example, one expert might come to specialize in code and mathematical reasoning, while another handles everyday conversational text.

When a user makes a request to DeepSeek AI, the system analyzes the request and determines which sub-model is best suited to handle it. The system then routes the request to the appropriate sub-model, which processes the request and returns the results.
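The sketch below illustrates this routing idea in minimal Python. It is not DeepSeek’s implementation: the number of experts, the top-k value, and the gating function are placeholders chosen for illustration, and in a real mixture-of-experts model both the gate and the experts are learned neural networks.

```python
# Minimal, framework-free sketch of mixture-of-experts routing (illustrative
# only, not DeepSeek's actual implementation).

import random

NUM_EXPERTS = 8   # hypothetical number of specialized sub-models
TOP_K = 2         # how many experts are activated per request


def expert(expert_id: int, request: str) -> str:
    """Stand-in for a specialized sub-model (a neural network in a real system)."""
    return f"expert {expert_id} processed: {request!r}"


def gate(request: str) -> list[float]:
    """Stand-in for a learned gating network that scores each expert for this request."""
    rng = random.Random(request)  # deterministic per request, for the demo only
    return [rng.random() for _ in range(NUM_EXPERTS)]


def route(request: str) -> list[str]:
    """Activate only the TOP_K highest-scoring experts; the rest stay idle."""
    scores = gate(request)
    chosen = sorted(range(NUM_EXPERTS), key=lambda i: scores[i], reverse=True)[:TOP_K]
    return [expert(i, request) for i in chosen]


if __name__ == "__main__":
    for result in route("Translate this sentence into French"):
        print(result)
```

Because only two of the eight stand-in experts run per request, most of the model’s capacity stays idle for any single query, which is the source of the efficiency claim.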

This approach offers several advantages:

  • Increased Efficiency: By activating only the computing power needed for a given request, mixture-of-experts models can significantly enhance efficiency, saving computational resources compared to dense models, in which the entire network runs for every request.
  • Improved Accuracy: By delegating tasks to the sub-model best equipped to handle them, mixture-of-experts models can increase accuracy. Each sub-model is specially trained to excel in its specific domain, making it more likely to produce accurate results.
  • Enhanced Scalability: Mixture-of-experts models are more easily scalable, as new sub-models can be added as needed to handle new tasks. This allows the system to adapt to changing requirements.

DeepSeek’s Relationship with Data Centers: A Key Factor for Sustainability

Greenly’s research noted that DeepSeek’s relationship with data centers, or its potential to operate without them, also contributes to its sustainability. Because DeepSeek is an open-weight model, meaning its weights are publicly available, Greenly pointed out that it can run on local physical devices rather than exclusively in the cloud or through data centers. By decreasing demand for data centers, DeepSeek can in turn help curb the facilities’ energy consumption, which is expected to double within five years.

Data centers are large buildings containing a multitude of computer servers and other equipment. These servers are used for storing, processing, and distributing data. Data centers require immense amounts of energy to operate because the servers generate significant heat, which needs to be dissipated through cooling systems.

By reducing the need for data centers, DeepSeek can help lower global energy consumption and carbon emissions. This is crucial for addressing climate change.
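As a concrete illustration of running an open-weight model on local hardware instead of calling a hosted data-center API, the hedged sketch below uses the Hugging Face transformers library. The model identifier is a placeholder assumption, not a recommendation; substitute whichever open-weight checkpoint you actually use, and note that larger checkpoints need substantially more local memory and ideally a GPU.

```python
# Hedged sketch: local inference with an open-weight model via Hugging Face
# transformers (requires the transformers and accelerate packages).

from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "deepseek-ai/deepseek-llm-7b-chat"  # placeholder open-weight checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

prompt = "Summarize the Jevons paradox in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```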

Jevons Paradox: The Potential Risk of Efficiency Gains

Nonetheless, Greenly’s research also cautioned that "these gains could easily be short-lived," attributing this to the Jevons paradox, or the phenomenon that the more efficient something is, the more it will be used, leading to higher overall emissions.

The Jevons paradox was proposed by British economist William Stanley Jevons in the 19th century. Jevons observed that as the efficiency of coal-burning improved, coal consumption did not decrease; instead, it increased. He argued that this was because increased efficiency lowered the price of coal, thereby stimulating more demand.

In the context of AI, the Jevons paradox implies that even if AI models like DeepSeek become more efficient, overall carbon emissions may still rise because adoption and use of AI keep expanding. For example, if AI becomes cheaper to run, companies may automate far more tasks with it, leading to a steep increase in total AI usage. That growth could cancel out the efficiency gains, or even push total carbon emissions higher.
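The toy calculation below makes the risk concrete. All numbers are hypothetical assumptions, not measurements: a fourfold per-request efficiency gain is outweighed by an eightfold rise in usage, so total energy consumption still doubles.

```python
# Illustrative arithmetic for the Jevons paradox; every figure is assumed.

baseline_requests = 1_000_000        # requests per day before the efficiency gain
baseline_wh_per_request = 4.0        # assumed watt-hours per request

efficient_wh_per_request = 1.0       # per-request energy drops fourfold
new_requests = 8_000_000             # cheaper, better AI drives eightfold more usage

before = baseline_requests * baseline_wh_per_request / 1e6   # megawatt-hours per day
after = new_requests * efficient_wh_per_request / 1e6

print(f"before: {before:.1f} MWh/day")   # 4.0 MWh/day
print(f"after:  {after:.1f} MWh/day")    # 8.0 MWh/day despite the efficiency gain
```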

Responsible AI Deployment: Ensuring Sustainability Is Key

To avoid the Jevons paradox, Greenly’s research emphasized the importance of "responsible deployment." This means that businesses and individuals should take steps to reduce their carbon footprint when using AI. Here are some measures that can be taken:

  • Use efficient AI models: Choosing efficient AI models like DeepSeek can reduce energy consumption and carbon emissions.
  • Optimize AI model usage: Ensure that AI models run only when necessary and avoid redundant calls (see the caching sketch after this list).
  • Use renewable energy: Powering data centers and physical devices with renewable energy can reduce carbon emissions.
  • Support sustainable AI development: Support companies and organizations committed to developing and deploying sustainable AI technologies.
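One simple way to act on the "run only when necessary" point is to cache results so that repeated identical requests do not trigger a fresh model call. The sketch below is illustrative; call_model is a hypothetical placeholder for whatever local inference function or hosted API you actually use.

```python
# Avoid redundant compute by caching answers to identical prompts.

from functools import lru_cache


def call_model(prompt: str) -> str:
    """Hypothetical placeholder for an expensive model call (local or hosted)."""
    print(f"running the model for: {prompt!r}")
    return f"answer to {prompt!r}"


@lru_cache(maxsize=1024)
def cached_call(prompt: str) -> str:
    """Identical prompts are answered from the cache instead of re-running the model."""
    return call_model(prompt)


if __name__ == "__main__":
    cached_call("What is the Jevons paradox?")  # runs the model
    cached_call("What is the Jevons paradox?")  # served from cache, no extra compute
```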

By taking these measures, we can ensure that the benefits of AI do not come at the expense of the environment.

DeepSeek AI’s Open-Source Strategy: Accelerating Innovation and Sustainable Development

DeepSeek AI’s decision to open-source some of its models not only accelerates innovation in AI technology but also contributes, to some extent, to sustainable development. Open source here means that anyone can access, use, modify, and distribute DeepSeek AI’s model code and weights. This openness brings advantages in several areas:

  • Accelerated Innovation: By being open source, DeepSeek AI is poised to attract more developers into the improvement and optimization of the model. Developers from around the world can work together to discover flaws in the model and propose new solutions. This open collaborative model can accelerate AI innovation and drive AI applications in various fields.
  • Reduced Development Costs: For other businesses and research institutions, using DeepSeek AI’s open-source models can significantly reduce AI development costs. Rather than building models from scratch, they can customize DeepSeek AI’s models for their own needs, saving significant time and resources.
  • Improved Model Accessibility: Open source makes DeepSeek AI’s model more accessible. This helps promote the popularization of AI technology, allowing more people to benefit from it.
  • Promotion of Sustainable Development: Open source also lets more developers see the techniques DeepSeek AI uses to improve model efficiency. This helps spread sustainable AI development practices, encouraging more developers to consider the environmental impact of AI and to build more efficient, environmentally friendly models.

However, open source also presents some challenges. For example, the security of open-source models is a major concern. If there are vulnerabilities in the model, they may be exploited by malicious attackers. Furthermore, intellectual property protection of open-source models is also a matter of concern.

Despite some challenges, DeepSeek AI’s open-source strategy is generally beneficial. It accelerates AI innovation, reduces AI development costs, improves model accessibility, and promotes sustainable AI development.

DeepSeek AI’s Application Potential in Different Industries

DeepSeek AI’s efficiency and sustainability give it broad application potential in various industries. Here are some areas where DeepSeek AI may play an important role:

  • Natural Language Processing (NLP): DeepSeek AI can be used to build more efficient and accurate NLP models, thereby improving machine translation, text summarization, sentiment analysis, and other applications.
  • Computer Vision: DeepSeek AI can be used to build more efficient and accurate computer vision models, improving image recognition, object detection, video analysis, and other applications.
  • Recommendation Systems: DeepSeek AI can be used to build more efficient and personalized recommendation systems, thereby improving user experience and business benefits.
  • Healthcare: DeepSeek AI can be used to assist in diagnosis, drug development, personalized treatment, and other fields, thereby improving medical efficiency and patient outcomes.
  • Financial Services: DeepSeek AI can be used for risk assessment, fraud detection, quantitative trading, and other fields, thereby improving the efficiency and security of financial services.
  • Manufacturing: DeepSeek AI can be used for production process optimization, quality control, fault prediction, and other fields, thereby improving production efficiency and reducing production costs.

DeepSeek AI’s case demonstrates that future AI development will increasingly emphasize efficiency, sustainability, and responsible deployment. As AI technology continues to develop, we need to pay more attention to the environmental and social impact of AI and take measures to ensure that the benefits of AI are fully realized while minimizing its negative impacts.

Here are some future trends in AI development:

  • Model Compression and Optimization: Researchers will continue to explore new methods to compress and optimize AI models, reducing their computational needs and energy consumption (a minimal quantization sketch follows this list).
  • Edge Computing: Deploying AI models on edge devices (such as smartphones, sensors, etc.) can reduce reliance on data centers, thereby reducing energy consumption and latency.
  • Green AI: More and more researchers will focus on green AI, that is, AI technologies designed to be more environmentally friendly and sustainable.
  • AI Ethics and Security: AI ethics and security issues will be increasingly emphasized. We need to develop corresponding policies and regulations to ensure the safety, reliability, and fairness of AI.
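As one illustration of model compression, the hedged sketch below shows naive post-training weight quantization: 32-bit floating-point weights are mapped to 8-bit integers plus a single scale factor, cutting storage roughly fourfold. It is a toy example under simplifying assumptions; production systems rely on dedicated libraries and more sophisticated schemes.

```python
# Toy post-training weight quantization: floats -> int8 values plus one scale.

def quantize_int8(weights: list[float]) -> tuple[list[int], float]:
    """Map float weights into the int8 range [-127, 127] with a shared scale factor."""
    scale = max(abs(w) for w in weights) / 127 or 1.0
    return [round(w / scale) for w in weights], scale


def dequantize(q_weights: list[int], scale: float) -> list[float]:
    """Recover approximate float weights for use at inference time."""
    return [q * scale for q in q_weights]


if __name__ == "__main__":
    weights = [0.12, -0.87, 0.45, 0.003, -0.3]
    q, scale = quantize_int8(weights)
    print(q)                        # int8 values, roughly 4x smaller than float32 storage
    print(dequantize(q, scale))     # close to the original weights
```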

DeepSeek AI’s exploration provides a good example of how to focus on sustainable AI development while improving AI efficiency. In the future, we expect to see more innovative companies like DeepSeek AI contribute to building a greener, more sustainable AI ecosystem.