Mistral Medium 3: Hype or Hope for European AI?

French AI startup Mistral AI recently unveiled its latest multimodal model, Mistral Medium 3, claiming that its performance approaches or even surpasses that of Anthropic’s Claude Sonnet 3.7 while costing less than China’s DeepSeek V3. The announcement stirred considerable excitement in the AI community, with many hoping that a European-developed model could challenge the dominance of American companies in AI.

However, the picture Mistral painted contrasts sharply with what many testers encountered. After the release, numerous media outlets and users ran their own tests, and the results were underwhelming: the model fell well short of its advertised capabilities in practical use. Some went so far as to describe its performance as “disappointing,” advising users not to “waste their time and resources downloading” it.

Mistral Medium 3: The Gap Between Hype and Reality

When Mistral AI launched Mistral Medium 3, it marketed the model heavily, claiming that it achieved over 90% of Claude Sonnet 3.7’s performance across various benchmarks. It also highlighted the model’s capabilities in specialized applications such as code generation and multimodal understanding. Mistral AI further emphasized cost, stating that the model was priced at just $0.40 per million input tokens and $2 per million output tokens, significantly lower than DeepSeek V3.

However, real-world testing revealed a noticeable gap between Mistral Medium 3 and Claude Sonnet 3.7. In several evaluations, Mistral Medium 3 underperformed even some open-source models. For example, in an evaluation built on word-grouping tasks from The New York Times’ Connections puzzle, Mistral Medium 3 ranked at the bottom, barely registering in the results.

Even more disheartening, some users found that Mistral Medium 3 showed no significant improvement in writing quality, with common issues such as unclear logic and disjointed expression persisting. It also struggled with complex tasks, giving inadequate responses.

The Highlights of Mistral Medium 3

Despite its disappointing overall performance, Mistral Medium 3 is not without merit and shows strengths in specific domains. For instance, its code generation is relatively stable: it produces concise, clear code and excels at simple coding tasks.

Mistral Medium 3 also offers enterprise-grade features, including support for hybrid cloud deployment, on-premises deployment, deployment within a VPC, customized post-training, and integration with enterprise tools and systems. These features enable Mistral Medium 3 to better meet the practical needs of businesses by providing more flexible and customizable AI solutions.

Mistral’s “Large” Plan: Mistral Large

Despite the underperformance of Mistral Medium 3, Mistral AI remains undeterred. Concurrent with the release of Mistral Medium 3, they revealed their development of a more powerful model named Mistral Large, asserting that its performance would far surpass Mistral Medium 3 and potentially exceed that of the most advanced AI models currently available.

This move by Mistral AI has generated renewed anticipation. If Mistral Large can truly achieve the performance levels claimed by Mistral AI, it has the potential to become a rising star in the AI field, injecting new vitality into European AI development.

Enterprise-Level Chatbot Service: Le Chat Enterprise

In addition to Mistral Medium 3 and Mistral Large, Mistral AI has launched an enterprise-grade chatbot service called Le Chat Enterprise. Powered by the Mistral Medium 3 model, Le Chat Enterprise aims to provide businesses with a unified AI platform to address their AI challenges, such as fragmented tools, insecure knowledge integration, inflexible models, and slow returns on investment.

Le Chat Enterprise offers an AI agent builder that can integrate Mistral’s models with third-party services such as Gmail, Google Drive, and SharePoint. Le Chat Enterprise will also support the Model Context Protocol (MCP), a standard proposed by Anthropic for connecting AI with data systems and software.

User Testing: Mistral Medium 3 Underperforms

Despite Mistral AI’s extensive promotion of Mistral Medium 3, many users found its performance less impressive than advertised in their tests. Some even advised against downloading Mistral Medium 3 to avoid wasting bandwidth and hard drive space.

One user, “karminski-dentist,” reported that Mistral Medium 3’s performance was “disappointing,” advising users not to “waste their time and resources downloading” it. Another user noted that Mistral Medium 3 showed no significant improvement in writing quality and continued to exhibit the same common problems.

Media Reviews: Mixed Reactions to Mistral Medium 3

Similar to user feedback, media reviews of Mistral Medium 3 have been mixed. Some media outlets have acknowledged its strong performance in certain areas, such as code generation. However, others have found its overall performance disappointing, with a noticeable gap compared to Claude Sonnet 3.7.

For example, The Verge pointed out in a review article that Mistral Medium 3 struggled to handle complex tasks and provide satisfactory answers. TechCrunch stated in a review that its writing abilities were not significantly improved and that common problems persisted.

Limitations of Mistral Medium 3

In summary, the limitations of Mistral Medium 3 primarily include:

  • Insufficient Performance: The performance of Mistral Medium 3 is significantly behind that of Claude Sonnet 3.7, making it unsuitable for applications that require high performance.
  • Limited Writing Skills: The writing skills of Mistral Medium 3 have not been significantly improved, and common problems such as unclear logic and disjointed expression persist.
  • Inadequate Complex Task Handling: Mistral Medium 3 struggles to handle complex tasks and provide satisfactory answers.

Potential Applications of Mistral Medium 3

Despite its limitations, Mistral Medium 3 still has potential applications, such as:

  • Code Generation: Mistral Medium 3 exhibits relatively stable code generation capabilities and can be used to generate concise and clear code.
  • Enterprise-Level Applications: Mistral Medium 3 has enterprise-grade features such as support for hybrid cloud deployment, on-premises deployment, deployment within a VPC, customized post-training, and integration with enterprise tools and systems, making it suitable for meeting the practical needs of businesses.
  • Chatbots: Mistral Medium 3 can be used to power chatbots, providing users with intelligent conversational services.

Pricing Strategy of Mistral Medium 3

Mistral AI has priced Mistral Medium 3 low to attract more users. The cost is just $0.40 per million input tokens and $2 per million output tokens, which the company says is significantly lower than DeepSeek V3.

The low pricing strategy enhances the competitiveness of Mistral Medium 3 and may allow it to gain market share.
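
To make the quoted prices concrete, the following minimal Python sketch estimates what a workload would cost at $0.40 per million input tokens and $2 per million output tokens. Only those two prices come from the figures above; the class name, the function name, and the workload (10,000 requests of 1,500 input and 500 output tokens each) are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenPricing:
    """Per-million-token prices in US dollars."""
    input_usd_per_million: float
    output_usd_per_million: float

    def cost(self, input_tokens: int, output_tokens: int) -> float:
        """Return the total cost in USD for the given token counts."""
        return (
            input_tokens * self.input_usd_per_million
            + output_tokens * self.output_usd_per_million
        ) / 1_000_000


# Prices quoted in the article for Mistral Medium 3.
MISTRAL_MEDIUM_3 = TokenPricing(
    input_usd_per_million=0.40,
    output_usd_per_million=2.00,
)

# Hypothetical workload (an assumption, not a figure from the article).
requests = 10_000
input_tokens_per_request = 1_500
output_tokens_per_request = 500

total = MISTRAL_MEDIUM_3.cost(
    requests * input_tokens_per_request,
    requests * output_tokens_per_request,
)
print(f"Estimated cost: ${total:.2f}")
```

Because output tokens cost five times as much as input tokens at these prices, the output side dominates the bill in this example even though it accounts for only a quarter of the tokens.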

Deployment Methods of Mistral Medium 3

Mistral Medium 3 supports various deployment methods, including:

  • API: The Mistral Medium 3 API is available on Mistral La Plateforme and Amazon SageMaker and will soon be available on IBM watsonx, NVIDIA NIM, Azure AI Foundry, and Google Cloud Vertex AI.
  • Self-Deployment: Mistral Medium 3 can be deployed on any cloud, including self-hosted environments with four or more GPUs.

The variety of deployment methods allows Mistral Medium 3 to better meet the needs of different users and provide more flexible and convenient deployment solutions.

Mistral Medium 3: Hope for European AI?

The release of Mistral Medium 3 has undoubtedly brought new hope to European AI. As a European AI startup, the rise of Mistral AI has the potential to challenge the dominance of American companies in the AI field and inject new vitality into European AI development.

However, the performance of Mistral Medium 3 has been disappointing, with a noticeable gap compared to Claude Sonnet 3.7. This indicates that European AI still needs sustained effort to truly catch up with the technological lead of the United States.

Mistral Large: Can it Bring Surprises?

Despite the underperformance of Mistral Medium 3, Mistral AI remains undeterred and continues to develop the more powerful Mistral Large. Whether Mistral Large can bring surprises and become a rising star in the AI field remains to be seen.

Conclusion

The release of Mistral Medium 3 has garnered widespread attention in the AI field, but its actual performance falls short of its advertised capabilities. Although Mistral Medium 3 shows some strengths in specific areas, its overall performance needs improvement. How Mistral AI develops from here, and whether Mistral Large can bring surprises, will be the key things to watch.

Summary

The release of Mistral Medium 3 is an important milestone in European AI development, but its performance is also a reminder that European AI still needs sustained technical effort. We look forward to Mistral Large bringing surprises and injecting new vitality into European AI development.