The Dawn of MCP and A2A: A Paradigm Shift
The emergence of Model Context Protocol (MCP) and Agent2Agent (A2A) protocols in 2025 marks a pivotal moment in the evolution of AI application development. MCP aims to standardize interfaces to break down data silos, enabling LLMs to access external resources efficiently and facilitating seamless data flow across systems and platforms. A2A further promotes seamless interaction between agents, fostering collaboration and communication to form cohesive, integrated systems.
The shift from MCP to A2A underscores the growing emphasis on ‘openness’ as a key driver in the AI application ecosystem. This openness encompasses both technical interoperability and collaborative spirit. From a broader perspective, this transformation reflects a natural progression in technology development: a transition from initial excitement to practical implementation, and from isolated innovation to collaborative ecosystem evolution.
Historically, the value of LLMs has been disproportionately attributed to parameter scale and standalone capabilities. Today, MCP and A2A address the critical issue of interconnectivity between AI applications and reshape the competitive dynamics of the LLM ecosystem. AI application development is evolving from a ‘lone wolf’ approach to a model of interconnectedness. This necessitates a reassessment of AI value for CTOs, shifting the focus from merely pursuing model size and ‘all-in’ strategies to leveraging platforms that connect diverse AI capabilities. The goal is to organically embed AI into existing business processes and production systems, improve overall efficiency through collaboration and standardization, solve critical problems with minimal computational resources, and overcome the ‘ROI dilemma.’
The Scourge of Wasted Compute and Misaligned Scenarios
The inability to overcome the high-investment, low-output bottleneck has long plagued the implementation of LLMs. This phenomenon reflects deep-seated contradictions in AI development. First, there is significant waste of computing power: data indicates that enterprise-level general-purpose computing centers operate at only 10-15% utilization, leaving vast amounts of computing resources idle. Second, there is a mismatch between model performance and the actual needs of business scenarios.
One common issue is the ‘overkill’ of using large models for lightweight tasks: some businesses rely excessively on general-purpose LLMs for simple applications. The specialized nature of many business scenarios also creates a dilemma: large models incur high computational costs and long inference times, while smaller models may not satisfy business requirements. This conflict is particularly evident in scenarios requiring specialized domain knowledge.
Consider the talent-job matching scenario in the recruitment industry. Companies require models with deep reasoning abilities to understand the complex relationships between resumes and job descriptions while also demanding quick response times. The lengthy inference times of general-purpose LLMs can significantly degrade user experience, especially under high-concurrency user demands.
To balance performance and efficiency, model distillation has gained traction in recent years. The launch of DeepSeek-R1 earlier this year has further highlighted the value of this technique. In handling complex reasoning tasks, model distillation captures the ‘chain of thought’ pattern of DeepSeek-R1, allowing lightweight student models to inherit its reasoning abilities rather than merely mimicking output results.
For instance, Zhaopin, a leading recruitment platform, employed DeepSeek-R1 (600+ billion parameters) as a teacher model to distill the chain of thought and decision-making logic used in talent-job matching tasks. Using the Baidu AI Cloud Qianfan model development platform, they distilled the teacher model into the ERNIE Speed model (10+ billion parameters) as the student. This approach achieved performance close to the teacher’s (DeepSeek-R1 reached 85% accuracy on the reasoning stage, while the student model reached over 81%), brought inference speed to an acceptable level, and cut costs to 30% of the original while running twice as fast as the full DeepSeek-R1.
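To make the technique concrete, below is a minimal sketch of how chain-of-thought distillation data is typically prepared: the teacher model is asked to explain its reasoning, and the student is later fine-tuned to reproduce that reasoning along with the answer, not just the final label. The function names, prompt wording, and file format are illustrative assumptions, not the actual Zhaopin or Qianfan pipeline.

```python
# Sketch: building a chain-of-thought distillation dataset (illustrative only).
import json

def call_teacher(prompt: str) -> dict:
    """Query the teacher model (e.g. DeepSeek-R1) for a reasoning trace and a
    final answer. Stubbed with a placeholder here; in practice this would be
    an API call to the teacher model's endpoint."""
    return {"reasoning": "Step 1: ... Step 2: ...", "answer": "82"}

def build_distillation_record(resume: str, job_description: str) -> dict:
    prompt = (
        "Assess how well the candidate matches the job.\n"
        f"Resume:\n{resume}\n\nJob description:\n{job_description}\n"
        "Explain your reasoning step by step, then give a match score from 0 to 100."
    )
    teacher = call_teacher(prompt)
    # The student is trained to reproduce the teacher's *reasoning* plus the
    # answer -- this is what lets it inherit chain-of-thought behaviour rather
    # than merely mimicking output labels.
    return {
        "prompt": prompt,
        "response": f"{teacher['reasoning']}\n\nFinal score: {teacher['answer']}",
    }

def write_sft_dataset(pairs, path="distill_sft.jsonl"):
    """pairs: iterable of (resume, job_description) tuples."""
    with open(path, "w", encoding="utf-8") as f:
        for resume, jd in pairs:
            record = build_distillation_record(resume, jd)
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
```

The resulting JSONL file would then be used for supervised fine-tuning of the lightweight student model.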
Currently, businesses typically adopt one of two approaches to model distillation: building a complete technical stack themselves, from infrastructure and GPUs to training frameworks, or using platform-based solutions such as the Qianfan model development platform or offerings from other vendors. Yao Sijia, an AI application expert at Zhaopin, stated that while Zhaopin has its own training framework, they chose the Qianfan model development platform for model distillation due to three main considerations:
- Comprehensive support: Qianfan model development platform provides industry-leading support for model distillation, deeply optimizing the entire technical chain around distillation scenarios.
- Cost control: Compared to purchasing and maintaining hardware independently, Qianfan model development platform offers significant advantages in cost control and more flexible resource allocation.
- Deep understanding of business scenarios: Baidu’s professional solutions team deeply understands core requirements such as ‘accurate matching’ and ‘high-concurrency response’ in the recruitment domain and collaborates with companies to explore solutions.
Yao Sijia added that Zhaopin will continue to pioneer AI+ recruitment scenarios, using Qianfan’s Reinforcement Learning Fine-Tuning (RFT) technology to further improve model performance. They plan to explore whether the teacher model can be further enhanced and whether better reward mechanisms can optimize already-distilled student models to improve accuracy. Qianfan is the first platform in China to productize leading reinforcement learning methods such as RFT and GRPO. By transforming these cutting-edge reinforcement learning methods into implementable solutions, Qianfan offers companies like Zhaopin more possibilities for optimizing model performance.
However, model distillation only optimizes the performance of a single model. In complex business scenarios, it is necessary to precisely match diverse AI capabilities with scenarios.
Consider a smartphone. In intent recognition scenarios like call assistants, lightweight models are typically used to quickly identify user issues. For general knowledge Q&A scenarios like weather queries and news retrieval, medium-sized models are typically used to quickly provide accurate and informative answers. In data analysis and logical reasoning scenarios that require deep thinking, large models are typically used.
This means that a smartphone needs to flexibly call multiple LLMs in different user demand scenarios. For phone manufacturers, this presents challenges such as high model selection costs and complex calling processes due to different model interface protocols.
To address these industry pain points, the Qianfan model development platform has productized model-routing interfaces. Compared with calling each vendor’s models directly, it provides both custom development and out-of-the-box API-calling capabilities, helping companies save engineering effort and development time while reducing costs. In addition, the platform supports flexible calling at large user scale, ensuring speed and stability even under high-frequency, high-concurrency demands.
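As an illustration of what a routing layer abstracts away, here is a minimal sketch of a model router: one uniform call site, with an intent classifier deciding which model tier handles each query. The tier names and the keyword-based classifier are illustrative assumptions, not Qianfan’s actual routing implementation.

```python
# Sketch: routing queries to different model tiers behind one interface.
from dataclasses import dataclass
from typing import Callable

@dataclass
class ModelTier:
    name: str
    call: Callable[[str], str]   # unified signature regardless of vendor protocol

def classify_intent(query: str) -> str:
    """Toy keyword-based intent classifier; a production router would
    typically use a small model for this step."""
    q = query.lower()
    if any(k in q for k in ("analyze", "compare", "why")):
        return "deep_reasoning"
    if any(k in q for k in ("weather", "news", "what is")):
        return "general_qa"
    return "intent_recognition"

ROUTES = {
    "intent_recognition": ModelTier("lightweight", lambda q: f"[small model] {q}"),
    "general_qa":         ModelTier("medium",      lambda q: f"[medium model] {q}"),
    "deep_reasoning":     ModelTier("large",       lambda q: f"[large model] {q}"),
}

def route(query: str) -> str:
    tier = ROUTES[classify_intent(query)]
    # One uniform call site: the device never deals with per-vendor interfaces.
    return tier.call(query)

print(route("what is the weather in Beijing"))   # handled by the medium tier
```

The point of productizing this layer is that the phone maker maintains a single calling convention while the platform handles model selection, protocol differences, and concurrency behind it.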
At the model level, technical capabilities such as model distillation and multi-model calling are helping more and more companies optimize resource allocation, enabling AI capabilities to precisely match business scenarios while reducing costs. At the application level, MCP and A2A, which have garnered significant industry attention, further reduce AI trial-and-error costs, help companies optimize application collaboration paradigms, and change the inefficient ‘re-inventing the wheel’ model in traditional agent development.
A ‘combination punch’ from models to applications is what it takes for LLMs to overcome the ‘ROI dilemma.’
From Closed to Open: Lowering the Barrier to AI Experimentation
Since 2023, the key word in AI application implementation has gradually shifted to the Agent. By 2024, almost all companies were discussing Agent applications and development. However, Agents at that time lacked true planning capabilities and were largely workflow-driven, connecting LLMs to basic applications by stitching components together through expert-defined rules and procedures.
With the recent rise of the MCP and A2A protocols, 2025 has become the true ‘Agent Year Zero.’ In particular, MCP’s impact on the AI field is comparable to that of the TCP/IP protocol on the Internet.
Zhou Ze’an, CEO of Biyao Technology, stated in an interview with InfoQ that MCP’s core value for the AI field is reflected in three dimensions:
- Standardization of LLM tool calling: In the past, each company had its own Function Call implementation, with significant differences between them. MCP establishes a unified access standard, enabling true standardization of how clients schedule and call server-side applications. Additionally, MCP enables tool interaction not only for LLMs that support Function Call but also for LLMs that lack this feature.
- Solving tool collaboration challenges: The unified standard of the MCP protocol makes the construction of Agent services more diverse. Developers need to consider not only their own Agents and MCP services but also how to integrate external capabilities to achieve more powerful Agent functions.
- Letting the LLM control the entire context, which makes interaction more user-friendly: when building workflows, the model can draw on a wider range of data sources to solve complex tasks that were previously out of reach.
‘In general, the MCP protocol significantly lowers the barrier for companies to adopt AI technology. In the past, the technical integration process for accessing Agents was complex. Now, companies no longer need to deeply understand complex technical implementation details but only need to clarify their business needs,’ Zhou Ze’an said. Biyao Technology has fully opened up, through the MCP protocol, the document-processing capabilities (contracts, resumes, PPTs, and more) of ‘Bole,’ its self-developed vertical LLM for the human resources industry, and became one of the first enterprise developers to launch MCP components on the Qianfan application development platform. Currently, any enterprise or individual developer can call these professional capabilities directly on the Qianfan platform.
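For a sense of what ‘opening a capability through MCP’ looks like in practice, below is a minimal sketch of an MCP server exposing a single document-processing tool, written with the MCP Python SDK’s FastMCP helper. The resume-parsing tool and its extraction logic are hypothetical stand-ins for a vertical capability like Bole’s, not Biyao’s or Qianfan’s actual implementation.

```python
# Sketch: exposing a document-processing capability as an MCP server tool.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("resume-tools")

@mcp.tool()
def parse_resume(text: str) -> dict:
    """Extract basic structured fields from raw resume text (placeholder logic)."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    return {
        "name": lines[0] if lines else "",
        "raw_length": len(text),
        "sections": lines[:5],   # placeholder extraction, not real parsing
    }

if __name__ == "__main__":
    # Any MCP-capable client or agent platform can now discover and call
    # parse_resume through the standard protocol, regardless of which LLM
    # it uses or whether that LLM supports native function calling.
    mcp.run()
```

Because the tool is described through the protocol rather than a vendor-specific Function Call schema, the consuming company only needs to state what it wants the Agent to do, not how the integration is wired.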
‘Baidu will help developers actively and comprehensively embrace MCP.’ At the Create2025 Baidu AI Developer Conference held on April 25, the Qianfan platform officially launched enterprise-level MCP services. Baidu founder Li Yanhong demonstrated the case of the Qianfan platform embracing MCP, allowing developers to flexibly access 1000 MCP Servers, including Baidu AI search, maps, and Wenku, when creating Agents. In addition, Qianfan launched a low-code tool for creating MCP Servers, allowing developers to easily develop their own MCP Servers on Qianfan and publish them to the Qianfan MCP Square with one click. These MCP Servers will also be promptly indexed by Baidu search, allowing them to be discovered and used by more developers.
In fact, even before the rise of the MCP protocol, Qianfan had been steadily solving the last-mile problem of AI implementation, helping companies reap the benefits of AI technology efficiently and with low barriers, and providing mature solutions for multiple industries.
For example, in the smart home industry, companies generally face a common problem: how to provide accurate intelligent services across a massive number of product models. With the accelerated implementation of LLMs, more and more companies are using Agents to give users accurate, personalized answers quickly. However, this brings a new challenge: how to develop and manage so many Agents. Smart home brands typically have many different product categories and models, and building a separate Agent for each product would incur not only high development costs but also significant management and maintenance costs later on.
A leading smart home brand, for instance, used the Baidu AI Cloud Qianfan application development platform to treat file names as independent slices and embed the file-name information into each fine-grained slice. Instead of building a separate Agent for each product, the brand only needed to organize the corresponding knowledge base and define the product model names; the Qianfan platform’s RAG framework then applied its automatic parsing strategy to precisely match product models to knowledge points.
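The chunking idea is simple enough to show directly: the file name (which carries the product model) is kept as its own slice and also prepended to every fine-grained content slice, so that retrieval can match a user’s question to the right product manual. This is a minimal sketch under that assumption; the function names are illustrative, not the Qianfan RAG framework’s actual implementation.

```python
# Sketch: embedding file-name (product model) information into each slice.
from pathlib import Path

def chunk_document(path: str, chunk_size: int = 400) -> list[dict]:
    product_model = Path(path).stem          # e.g. "AC-3000-Pro" from "AC-3000-Pro.txt"
    text = Path(path).read_text(encoding="utf-8")
    bodies = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    # The file name itself is kept as an independent slice...
    slices = [{"text": product_model, "product_model": product_model, "kind": "filename"}]
    for body in bodies:
        slices.append({
            # ...and prepended to every content slice, so a query such as
            # "How do I descale the AC-3000-Pro?" retrieves slices from the
            # correct product's manual only.
            "text": f"[{product_model}] {body}",
            "product_model": product_model,
            "kind": "content",
        })
    return slices
```

One knowledge base plus this parsing strategy replaces dozens of per-product Agents, which is where the development and maintenance savings come from.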
The Qianfan application development platform also provides the brand with a set of operations tools for building a continuously evolving intelligent hub. Through the data backflow function, all user interaction records are turned into optimization material: operations personnel can view high-frequency problems in real time and immediately intervene on uncovered knowledge points, forming an ‘operation-feedback-optimization’ closed loop. In addition, the Qianfan application development platform and Xiaodu AI Assistant jointly built a voice interaction framework; relying on this framework, the brand can let its hardware ‘talk’ directly with users, achieving a more natural, efficient, and personalized interactive experience.
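The core of that closed loop can be sketched in a few lines: interaction logs flow back, and the questions that retrieval failed to answer are counted and surfaced so operations staff can add the missing knowledge points. The field names below are illustrative assumptions, not the Qianfan platform’s actual data model.

```python
# Sketch: surfacing high-frequency uncovered questions from backflow logs.
from collections import Counter

def uncovered_hotspots(interaction_logs: list[dict], min_count: int = 5) -> list[tuple[str, int]]:
    """interaction_logs: dicts like {"query": ..., "retrieval_hit": bool}."""
    misses = Counter(
        log["query"].strip().lower()
        for log in interaction_logs
        if not log["retrieval_hit"]          # no knowledge-base slice matched
    )
    # Operations staff review the top misses and fill in the missing knowledge
    # points, closing the operation-feedback-optimization loop.
    return [(query, count) for query, count in misses.most_common() if count >= min_count]
```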
From MCP to A2A, openness has become the new key word in the LLM application ecosystem. Openness has also been Qianfan’s intent from the start: since its release in 2023, Qianfan has taken the most open posture toward integrating third-party LLMs. Currently, Qianfan provides access to more than 100 models from over 30 vendors, covering 11 capability types such as text, image, and deep reasoning, and including third-party models such as DeepSeek, LLaMA, Tongyi, and Vidu. It also provides the full range of Wenxin LLMs, including the newly released native multimodal model Wenxin 4.5 Turbo and the deep-thinking model Wenxin X1 Turbo, as well as the previously released deep-thinking model Wenxin X1.
For companies that want to implement AI technology quickly, Baidu AI Cloud is gradually becoming the first choice, and market data is the best proof. Currently, the Qianfan platform serves over 400,000 customers, with a penetration rate of over 60% among central state-owned enterprises. According to the China Large Model Bidding Project Monitoring and Insight Report (2025Q1), Baidu ranked first in the first quarter in both the number of large-model bidding projects won and the total value of winning bids: it won 19 large-model bidding projects with a disclosed value of over 450 million yuan, and the winning projects came almost entirely from central state-owned enterprise customers in industries such as energy and finance.
Baidu AI Cloud’s report card also sends a signal: in the long-term battle to implement AI technology, the solutions with the greatest staying power are those that truly understand industry pain points and help companies reduce trial-and-error costs.