Decoding A2A and MCP Protocols in the Agent World
Recently, Google unveiled a new open protocol for agents called Agent2Agent, or A2A for short. Around the same time, Alibaba Cloud’s Bailian announced its foray into MCP. Let’s look at what A2A and MCP are all about.
To understand these protocols, consider the analogy of diplomacy between nations. Imagine each AI agent as a small country with its own language and customs. These ‘countries’ have embassies housed within the same building, attempting to communicate, trade, and exchange information.
In an ideal scenario, these nations would maintain amicable relations and adhere to a clear set of diplomatic rules, enabling them to seamlessly interact, sign agreements, and collaborate on international projects around a conference table.
However, the reality is that each embassy operates independently with disparate protocols. Consequently, initiating a simple trade agreement with ‘Country A’ requires fulfilling a plethora of requirements, including provisions, certifications, translations, and specialized keys. Engaging with ‘Country B’ and ‘Country C’ necessitates repeating similar procedures multiple times. This ad-hoc, fragmented, and multi-faceted approach inflates communication costs, with each interaction incurring an additional ‘information tariff.’
In the past, AI agents encountered similar predicaments when attempting to collaborate.
For instance, you might have an agent that automatically responds to emails and another integrated into a calendar application to assist with scheduling. However, these AI entities struggle to communicate directly, necessitating manual copying and pasting of information or reliance on custom-built interfaces.
As a result, AI agents operate in isolation, exhibiting poor interoperability. This fragmentation frustrates users who must navigate between multiple AI applications and limits the potential of AI. Complex tasks that could be accomplished through multi-agent collaboration are artificially confined within individual silos.
The situation resembles a post-war landscape: each AI agent acts autonomously, with no unified rules and plenty of communication barriers. Accessing an agent’s data or functionality means conforming to its particular interfaces and protocols. Without standards, every new collaborative relationship carries an additional ‘tariff,’ and the result is a disjointed, inefficient AI ecosystem characterized by isolation and self-interest.
The AI industry is exploring universally accepted protocols that let agents interact seamlessly with one another and with external tools. Google and Anthropic have emerged as frontrunners, each proposing a solution: Google with the A2A protocol, and Anthropic with MCP.
The A2A Protocol
The A2A protocol, short for Agent2Agent, enables AI agents to communicate and collaborate directly.
The primary objective of the A2A protocol is to enable agents from diverse origins and vendors to comprehend and cooperate with one another, similar to the World Trade Organization’s efforts to reduce trade barriers.
By adopting A2A, agents from different vendors and frameworks can join a free trade zone, communicating using a common language and seamlessly collaborating to accomplish complex tasks beyond the capabilities of individual agents.
To illustrate how A2A operates, consider the following analogies:
1. Agent = National Diplomat
Each agent functions as a diplomat representing a country’s embassy. The A2A protocol aims to establish uniform diplomatic etiquette and communication procedures. Previously, diplomats from ‘Country A’ communicated exclusively in French, while those from ‘Country B’ utilized Cyrillic script, and ‘Country C’ demanded correspondence via ancient gold-leaf letters. The A2A protocol ensures that all participants can communicate in a pre-agreed language, submit documents in the same format, and execute agreed-upon outcomes.
2. Agent Card = Diplomatic Credentials / Ambassador’s Business Card
Within the A2A framework, each agent is required to publish an ‘Agent Card,’ analogous to a diplomat’s business card, containing details such as the agent’s name, version, capabilities, and supported languages or formats.
Similar to how a diplomat’s business card identifies their role and affiliation, the Agent Card lists the agent’s skills, authentication methods, and input/output formats. This enables other diplomats to quickly identify and understand capabilities, minimizing communication barriers.
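The Agent Card idea can be sketched as a small data structure that an agent publishes and other agents inspect. This is a minimal illustration only: the field names (`input_modes`, `auth_schemes`, and so on), the agent name, and the example URL are assumptions made for the sketch, loosely modeled on the details listed above, and are not the exact schema of the A2A specification.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class Skill:
    # Hypothetical shape for one advertised capability.
    id: str
    name: str
    description: str


@dataclass
class AgentCard:
    # Illustrative fields only; the real A2A schema may differ.
    name: str
    version: str
    url: str
    skills: List[Skill] = field(default_factory=list)
    input_modes: List[str] = field(default_factory=lambda: ["text/plain"])
    output_modes: List[str] = field(default_factory=lambda: ["text/plain"])
    auth_schemes: List[str] = field(default_factory=lambda: ["bearer"])

    def supports(self, skill_id: str) -> bool:
        """Return True if this agent advertises the given skill."""
        return any(skill.id == skill_id for skill in self.skills)


# A made-up calendar agent publishing its card.
card = AgentCard(
    name="calendar-agent",
    version="1.0.0",
    url="https://agents.example.com/calendar",  # placeholder URL
    skills=[Skill("schedule_meeting", "Schedule meeting",
                  "Finds a free slot and books it")],
)

# Serialized form that another agent could fetch and read.
card_json = json.dumps(asdict(card), indent=2)
```

Another agent would read the card first, check `supports(...)` and the accepted formats, and only then decide whether to delegate work.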
3. Task = Bilateral or Multilateral Diplomatic Project
The Task concept is central to A2A. When an agent intends to delegate a task to another agent, it issues a ‘cooperation project letter of intent.’ Upon acceptance, both parties record a Task ID to track progress and exchange information until completion.
In diplomatic terms, a nation might propose to another, ‘We wish to collaborate on constructing a cross-border high-speed rail line; please dispatch your engineering team.’ This mirrors an A2A Task, where the initiating party outlines requirements, the remote agent accepts, and both parties regularly update progress throughout the project.
Messages represent communications exchanged during the project’s initial or intermediate stages, akin to diplomatic cables, notes, and envoy exchanges.
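The Task lifecycle described here can be sketched as a record with an ID, a state, and a message log. The state names and the allowed transitions below are a simplified assumption chosen for illustration; the actual A2A specification defines its own, richer set of states.

```python
import uuid
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class TaskState(Enum):
    # Simplified, illustrative subset of states.
    SUBMITTED = "submitted"
    WORKING = "working"
    COMPLETED = "completed"
    FAILED = "failed"


# Which state changes this sketch permits.
ALLOWED = {
    TaskState.SUBMITTED: {TaskState.WORKING, TaskState.FAILED},
    TaskState.WORKING: {TaskState.COMPLETED, TaskState.FAILED},
    TaskState.COMPLETED: set(),
    TaskState.FAILED: set(),
}


@dataclass
class Task:
    description: str
    # Both parties keep this ID to track the task.
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    state: TaskState = TaskState.SUBMITTED
    messages: List[Dict[str, str]] = field(default_factory=list)

    def add_message(self, role: str, text: str) -> None:
        """Record a message exchanged while the task is in progress."""
        self.messages.append({"role": role, "text": text})

    def advance(self, new_state: TaskState) -> None:
        """Move to a new state, rejecting transitions the sketch disallows."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"cannot go from {self.state.value} to {new_state.value}")
        self.state = new_state


task = Task("Build a cross-border high-speed rail line")
task.add_message("initiator", "Please dispatch your engineering team.")
task.advance(TaskState.WORKING)
task.add_message("remote", "Engineering team dispatched.")
task.advance(TaskState.COMPLETED)
```

Keeping the transitions in one table makes it easy to see that a finished task cannot be reopened in this sketch.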
4. Push Notifications = Diplomatic Embassy Bulletins
In A2A, when a Task is long-running, the remote agent can keep the initiating party informed through push notifications, much as a country provides periodic updates on a long-term infrastructure project. This strengthens asynchronous collaboration.
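The push pattern can be sketched with a plain callback: the initiating agent registers a notifier for a task, and the remote agent calls it after each stage. This is an in-process stand-in written for illustration; in a real deployment the notifier would more likely be an HTTP webhook, and the class name, method names, and payload fields here are all assumptions.

```python
from typing import Callable, Dict, List

Notifier = Callable[[dict], None]


class RemoteAgent:
    """Hypothetical remote agent that reports progress on long tasks."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, Notifier] = {}

    def subscribe(self, task_id: str, notify: Notifier) -> None:
        """Register where updates for this task should be pushed."""
        self._subscribers[task_id] = notify

    def run_long_task(self, task_id: str, stages: List[str]) -> None:
        notify = self._subscribers.get(task_id)
        for done, stage in enumerate(stages, start=1):
            # Real work for the stage would happen here.
            if notify is not None:
                notify({
                    "task_id": task_id,
                    "stage": stage,
                    "progress": done / len(stages),
                })


# The initiating agent collects updates instead of polling.
inbox: List[dict] = []
remote = RemoteAgent()
remote.subscribe("task-42", inbox.append)
remote.run_long_task("task-42", ["survey", "construction", "testing"])
```

The initiating side never asks "are you done yet?"; it simply reacts to whatever arrives in `inbox`.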
5. Authentication and Security = Diplomatic Privileges and Protocols
A2A employs enterprise-grade authentication strategies, requiring both communicating parties to verify credentials to prevent impersonation or malicious eavesdropping. This mechanism parallels diplomatic privileges and protocols.
In essence, A2A mirrors the dynamics of international diplomacy or business collaboration, emphasizing standardized communication and security. It gives agents a structured framework for interacting clearly, securely, and efficiently. Adopting A2A is intended to sharply reduce the overhead of agent-to-agent communication, fostering a more collaborative AI ecosystem in which agents can share tasks and information and developers can build more sophisticated, integrated AI solutions.
The MCP Protocol
The MCP protocol, or Model Context Protocol, is a standard introduced and open-sourced by Anthropic in November 2024.
While A2A addresses the communication process between AI diplomats, a persistent challenge remains: the absence of reliable information sources. Even the most eloquent diplomat or business executive is ill-equipped to operate effectively without accurate information about the international landscape and resource allocation.
Modern diplomats rely on external tools, such as visa systems, international settlement systems, and intelligence databases, to perform their duties. Similarly, an agent assuming complex responsibilities must connect to various databases, document systems, enterprise applications, and even hardware devices.
This can be likened to establishing a comprehensive intelligence agency for diplomats and granting them access to tools to facilitate their work.
Previously, agents had to develop custom plugins and deeply integrate with different tools, which was both laborious and time-consuming. However, MCP is now available to streamline the process.
MCP standardizes interactions between large language models and external data sources and tools. Anthropic likens MCP to a USB-C port for AI applications.
USB-C serves as a universal interface for devices, handling charging and data transfer through a single port. MCP aims to create a universal interface in the AI domain, enabling various models and external systems to connect using the same protocol, rather than developing custom integration solutions each time.
AI models connecting to databases, search engines, or third-party applications can communicate seamlessly if they all support MCP.
MCP employs a client-server architecture:
1. MCP Server = Consolidated Intelligence Agency
Organizations or individuals can encapsulate databases, file systems, calendars, and third-party services into MCP Servers. These servers adhere to the MCP protocol, exposing uniformly formatted access endpoints, enabling any agent compliant with MCP client standards to send requests, retrieve information, or execute operations. Think of it as a standardized API gateway for AI agents. It allows them to access a wide array of services without needing to know the specifics of each service’s individual API. This significantly reduces the complexity of integrating external resources into AI workflows.
2. MCP Client = Terminal Equipment Used by Diplomats
An agent diplomat carries dedicated terminal equipment, enabling them to input commands, such as ‘Retrieve inventory data from the financial system,’ ‘Submit a request to an API,’ or ‘Retrieve a PDF document.’
Without MCP, integrating with each system means writing separate access code, which is cumbersome. With MCP, a client that supports the protocol can switch easily between MCP servers to retrieve information and execute business processes. This flexibility lets agents adapt to different environments and reach the information and tools they need regardless of the underlying infrastructure. The MCP client acts as a universal adapter, abstracting away the complexities of interacting with different data sources and services.
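The client-server split can be sketched with a toy server and client that exchange JSON-RPC 2.0 messages. MCP is built on JSON-RPC 2.0, and `tools/list` and `tools/call` are method names from the MCP specification; everything else here is an assumption for illustration, including the class names, the simplified result shapes, the inventory data, and the in-process transport that replaces a real connection. Real projects would normally use an official MCP SDK rather than hand-rolled code like this.

```python
import json
from typing import Any, Callable, Dict, List


class ToyMCPServer:
    """Wraps ordinary functions behind one uniform request format."""

    def __init__(self) -> None:
        self._tools: Dict[str, Dict[str, Any]] = {}

    def register_tool(self, name: str, description: str, func: Callable[..., Any]) -> None:
        self._tools[name] = {"description": description, "func": func}

    def handle(self, raw_request: str) -> str:
        req = json.loads(raw_request)
        params = req.get("params", {})

        def reply(key: str, value: Any) -> str:
            return json.dumps({"jsonrpc": "2.0", "id": req.get("id"), key: value})

        if req.get("method") == "tools/list":
            tools = [{"name": n, "description": t["description"]}
                     for n, t in self._tools.items()]
            return reply("result", {"tools": tools})
        if req.get("method") == "tools/call":
            tool = self._tools.get(params.get("name"))
            if tool is None:
                return reply("error", {"code": -32602, "message": "Unknown tool"})
            output = tool["func"](**params.get("arguments", {}))
            return reply("result", {"content": [{"type": "text", "text": str(output)}]})
        return reply("error", {"code": -32601, "message": "Method not found"})


class ToyMCPClient:
    """The agent's 'terminal': same calls work against any such server."""

    def __init__(self, server: ToyMCPServer) -> None:
        self._server = server
        self._next_id = 0

    def _request(self, method: str, params: Dict[str, Any]) -> Dict[str, Any]:
        self._next_id += 1
        raw = json.dumps({"jsonrpc": "2.0", "id": self._next_id,
                          "method": method, "params": params})
        return json.loads(self._server.handle(raw))

    def list_tools(self) -> List[str]:
        return [t["name"] for t in self._request("tools/list", {})["result"]["tools"]]

    def call_tool(self, name: str, **arguments: Any) -> str:
        response = self._request("tools/call", {"name": name, "arguments": arguments})
        if "error" in response:
            raise RuntimeError(response["error"]["message"])
        return response["result"]["content"][0]["text"]


# Hypothetical inventory system exposed as a tool.
INVENTORY = {"widgets": 42}
server = ToyMCPServer()
server.register_tool("get_inventory", "Look up stock for an item",
                     lambda item: INVENTORY.get(item, 0))

client = ToyMCPClient(server)
available = client.list_tools()
stock = client.call_tool("get_inventory", item="widgets")
```

The client code never touches `INVENTORY` directly, which is the point: swapping in a different server with different tools would not change how the client issues requests.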
In essence, MCP facilitates seamless integration between AI agents and external resources. It unlocks the potential for AI agents to interact with the real world, leveraging vast amounts of data and sophisticated tools to solve complex problems. The standardized interface simplifies development and deployment, accelerating the adoption of AI in various industries.
The Distinction Between A2A and MCP
To clarify the distinction between A2A and MCP, consider a hypothetical international summit where heads of state (representing companies’ AI Agents) gather to collaborate on a transnational task, such as producing a global economic analysis report.
Without a universal protocol, such a meeting would be virtually impossible, as each representative speaks a different language. However, with the A2A protocol, all representatives sign the ‘A2A Vienna Diplomatic Convention’ before entering the meeting, agreeing to communicate using a uniform format, identify themselves, state their intentions, and cite the IDs of previous statements when responding.
This enables ‘Agent G’ to send a message to ‘Agent O’ in A2A format, and ‘Agent O’ responds accordingly. This marks the first instance of unimpeded communication between AI agents from different companies. A2A ensures that the agents can understand each other and exchange information in a standardized way. It establishes a common ground for communication, facilitating collaboration and knowledge sharing.
During the discussions, the AI representatives need to consult data or utilize tools for analysis. ‘Agent A’ from Anthropic suggests using the MCP system for external data or tool support.
An ‘MCP simultaneous interpretation room’ is set up alongside the conference hall, staffed by experts who respond in a uniform language via MCP whenever they receive a request. The room holds the utilities for data processing, number crunching, and information retrieval that any attending agent may need.
For example, ‘Agent Q’ needs to access their cloud database for calculations. Instead of sending someone back to their home country, they send an MCP request for data from database X. The MCP database administrator translates the request, retrieves the results, and responds to ‘Agent Q’ in MCP language. The entire process is transparent to the other agents, who understand the data cited by ‘Agent Q’ because the MCP translation is in a recognized format. MCP removes the technical burden of dealing with each source’s unique data format, so agents can request and receive data without extra steps.
As the report writing progresses, ‘Agent G’ and ‘Agent A’ realize they need to integrate their respective contributions. ‘Agent G’ specializes in numerical analysis, while ‘Agent A’ excels at language summarization.
‘Agent G’ communicates the GDP growth rate data via A2A, and ‘Agent A’ connects to an Excel spreadsheet plugin via MCP, verifies the data trends, and responds with a summary paragraph. A2A carries the exchange between the two agents, while MCP gives ‘Agent A’ access to the external tool it needs to check the data before replying.
In this scenario, A2A facilitates communication between agents, while MCP enables agents to access external tools and information. Together, the protocols create a tailored communication agreement for an AI version of the United Nations. With these protocols in place, AI agents can collaborate effectively, forming an interconnected AI ecosystem. The combined efforts of A2A and MCP empower agents to form intelligent partnerships and handle complex, multifaceted issues that would otherwise be unattainable.
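The division of labor in this scenario can be sketched in a few lines: one agent sends a message to another (the A2A-style step), and the receiving agent calls an external tool before replying (the MCP-style step). Both steps are simulated with ordinary function calls, and the message fields, class names, tool name, and GDP figures are invented for illustration; neither protocol’s real wire format is shown.

```python
from typing import Callable, Dict, List


def verify_trend(values: List[float]) -> str:
    """Stand-in for the spreadsheet plugin that would sit behind MCP."""
    rising = all(later > earlier for earlier, later in zip(values, values[1:]))
    return "rising" if rising else "mixed"


class AgentA:
    """Summarizing agent; reaches tools through an MCP-style lookup."""

    def __init__(self, tools: Dict[str, Callable[[List[float]], str]]) -> None:
        self._tools = tools

    def handle_message(self, message: dict) -> dict:
        values = message["data"]["gdp_growth"]
        trend = self._tools["verify_trend"](values)  # tool access (MCP role)
        return {
            "reply_to": message["id"],  # cites the statement being answered
            "from": "agent-a",
            "text": f"GDP growth is {trend} over {len(values)} periods.",
        }


class AgentG:
    """Numerical agent; hands its data to a peer (A2A role)."""

    def send(self, recipient: AgentA, values: List[float]) -> dict:
        message = {"id": "msg-1", "from": "agent-g",
                   "data": {"gdp_growth": values}}
        return recipient.handle_message(message)  # agent-to-agent exchange


agent_a = AgentA(tools={"verify_trend": verify_trend})
reply = AgentG().send(agent_a, [2.1, 2.4, 3.0])  # made-up growth figures
```

Note that ‘Agent G’ never sees the tool, and the tool never sees ‘Agent G’: the two protocols meet only inside ‘Agent A’.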
A2A is akin to a dedicated hotline for diplomatic communication, addressing direct agent communication. It ensures that the agents can speak the same language and understand each other. MCP is similar to a simultaneous interpretation and resource-sharing system, addressing the issue of intelligent entities connecting with external information. It provides the tools and data needed to support the agents’ decision-making and problem-solving processes.
The rise of A2A and MCP signals a shift in the AI industry toward collaboration rather than competition. Countless AI agents may be deployed much like websites, discovering and communicating with one another through A2A and accessing resources and sharing knowledge through MCP. The result would be a more open, interconnected, and innovative AI ecosystem in which agents leverage each other’s strengths to achieve common goals. The potential benefits range from improved healthcare and education to more efficient resource management and sustainable development. As these protocols mature and gain adoption, we can expect faster development and deployment of AI solutions that address some of the world’s most pressing challenges. The future of AI is collaborative, and A2A and MCP lay the foundation for it: a world where AI agents work together to solve complex problems and improve people’s lives, and where the next generation of AI applications becomes possible.