Cohere Command A and Rerank Models on OCI Generative AI

Cohere Command A: Peak Performance and Efficiency

The Oracle Cloud Infrastructure (OCI) Generative AI service has received significant upgrades: the release of the Cohere Command A and Rerank 3.5 models, and the introduction of Cohere Embed 3 with multimodal support. These models give OCI customers more powerful enterprise-grade AI capabilities across a wide range of application scenarios.

Cohere’s Command A 03-2025 is currently the most powerful Command model, with 150% higher throughput than its predecessor while requiring only two GPUs. According to data provided by Cohere, the model matches or exceeds OpenAI’s GPT-4o and DeepSeek-V3 on agentic enterprise tasks, with significantly better computational efficiency.

Command A’s superior performance stems from its advanced architecture and training methods, enabling it to excel in a variety of complex enterprise-grade AI applications. Whether processing massive amounts of data, performing complex reasoning tasks, or conducting real-time natural language processing, Command A delivers efficient and reliable solutions.

Key Features of Command A Include:

Ultra-Long Context Window: Supports context lengths up to 256k tokens, enabling the model to process longer text sequences, thereby better understanding contextual information and generating more accurate and coherent responses. This means that Command A can handle complex documents, lengthy conversations, and multi-turn interactions without losing critical information. The extended context window allows for comprehensive analysis and synthesis of information, leading to improved decision-making and more nuanced understanding. This capability is particularly beneficial for applications requiring deep insights from extensive textual data.
Advanced Retrieval Augmented Generation (RAG): By integrating retrieval augmented generation, Command A can retrieve relevant information from massive datasets and incorporate it into the generated content, improving the quality and accuracy of the results. This reduces the model’s reliance on knowledge memorized during training and lets it adapt to a constantly changing information environment. Because retrieval gives the model access to current information, RAG lowers the risk of outdated or irrelevant output and suits applications that require accurate, reliable responses.
Native Agent Tool Usage: Command A has native agent tool usage capabilities, allowing it to integrate with other tools and services to achieve more complex functions. For example, it can interact with search engines, databases, APIs, etc. to obtain the required information or perform specific operations. This capability enables Command A to handle a variety of complex tasks, such as automated customer service, intelligent assistants, and data analysis. Native agent tool usage allows the model to seamlessly interact with external systems, automating complex workflows and streamlining processes. This capability is essential for building intelligent applications that require access to real-world data and services.
Enterprise-Grade Security and Privacy: Command A is designed with enterprise-level security and privacy needs in mind, and uses a variety of security measures to protect customer data. For example, it supports data encryption, access control, and auditing functions to ensure that customer data is not subject to unauthorized access or leakage. Data encryption protects sensitive information from unauthorized access, while access control mechanisms ensure that only authorized users can access the model and its data. Auditing functions provide a record of all activities performed on the model, enabling organizations to track and monitor usage.
Powerful Multilingual Capabilities: Command A is trained on 23 languages, including English, French, Spanish, Italian, German, Portuguese, Japanese, Korean, Arabic, Chinese, Russian, Polish, Turkish, Vietnamese, Dutch, Czech, Indonesian, Ukrainian, Romanian, Greek, Hindi, Hebrew, and Persian. This enables it to process text in various languages and provide services to global users. The extensive multilingual training ensures that the model can accurately understand and generate text in a wide range of languages, making it ideal for global organizations. The ability to process multiple languages allows for seamless communication and collaboration across different regions and cultures.
Text Input and Output: Command A currently only supports text input and output, which means it is primarily used for text-related tasks, such as text generation, text summarization, text translation, and text classification. The focus on text input and output allows the model to excel in natural language processing tasks, providing accurate and reliable results. While currently limited to text, future iterations may incorporate multimodal capabilities, expanding the range of potential applications.
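
To make the 256k-token context window concrete, the sketch below shows one way an application might decide how many documents to send in a single request. It is only an illustration: the four-characters-per-token ratio is a crude assumption rather than Cohere's tokenizer, and `pack_documents` and `reserved_tokens` are hypothetical names introduced here.

```python
from typing import List

CONTEXT_WINDOW_TOKENS = 256_000  # context length stated in the feature list above
CHARS_PER_TOKEN = 4              # assumption: rough average, not the real tokenizer


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate; any non-empty text counts as at least 1."""
    if not text:
        return 0
    return max(1, len(text) // CHARS_PER_TOKEN)


def pack_documents(documents: List[str], reserved_tokens: int = 8_000) -> List[str]:
    """Keep documents, in order, until the estimated token budget is used up.

    reserved_tokens is a hypothetical allowance for the prompt and the response.
    """
    budget = CONTEXT_WINDOW_TOKENS - reserved_tokens
    packed: List[str] = []
    used = 0
    for doc in documents:
        cost = estimate_tokens(doc)
        if used + cost > budget:
            break  # the next document would overflow the window
        packed.append(doc)
        used += cost
    return packed


if __name__ == "__main__":
    docs = ["a" * 400_000, "b" * 400_000, "c" * 400_000]
    print(len(pack_documents(docs)))
```

In practice the estimate would be replaced by counts from the model's actual tokenizer, since character-based ratios vary considerably between languages.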

Note: The Command A model does not currently support fine-tuning.
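
The agent tool usage described in the feature list generally follows a simple pattern: the model proposes a tool call, and the application executes it and returns the result. The sketch below illustrates the application side only. The tools, the registry, and the `{"name": ..., "parameters": {...}}` call shape are all hypothetical and are not the documented wire format of any Cohere or OCI API.

```python
import json
from typing import Any, Callable, Dict


# Hypothetical tools; a real application would query a database or call an API.
def lookup_order(order_id: str) -> Dict[str, Any]:
    return {"order_id": order_id, "status": "shipped"}


def convert_currency(amount: float, rate: float) -> Dict[str, Any]:
    return {"converted": round(amount * rate, 2)}


# Registry mapping tool names the model may request to local functions.
TOOLS: Dict[str, Callable[..., Dict[str, Any]]] = {
    "lookup_order": lookup_order,
    "convert_currency": convert_currency,
}


def dispatch_tool_call(tool_call_json: str) -> Dict[str, Any]:
    """Run the tool named in a model-produced call and return its result.

    The call format is an assumption made for this illustration.
    """
    call = json.loads(tool_call_json)
    name = call.get("name")
    if name not in TOOLS:
        return {"error": f"unknown tool: {name}"}
    try:
        return TOOLS[name](**call.get("parameters", {}))
    except TypeError as exc:  # missing or unexpected arguments
        return {"error": str(exc)}


if __name__ == "__main__":
    request = '{"name": "lookup_order", "parameters": {"order_id": "A1"}}'
    print(dispatch_tool_call(request))
```

Returning errors as data rather than raising lets the application pass the failure back to the model, which can then retry with corrected arguments.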

Rerank 3.5: Improving the Accuracy of Enterprise Search

Rerank 3.5 is Cohere’s latest AI search foundation model, designed to improve the accuracy of enterprise search and retrieval augmented generation (RAG) systems. This model has enhanced reasoning capabilities, can understand complex user queries, and is compatible with various data types (including long documents, emails, tables, JSON, and code). In addition, Rerank 3.5 supports more than 100 languages to meet the search needs of global enterprises.

Rerank 3.5 improves search efficiency and user satisfaction by re-ordering search results so that the most relevant ones come first. It can be applied not only to traditional text search but also to other types of search, such as image, video, and audio search, provided the items are represented as text (for example captions, transcripts, or metadata).
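
The re-ordering step itself can be sketched in a few lines. The example below is a minimal illustration, not Rerank 3.5: `overlap_score` is a deliberately simple stand-in scorer, and in a real system `score_fn` would wrap a call to the hosted rerank model. All names here are hypothetical.

```python
from typing import Callable, List, Tuple


def overlap_score(query: str, document: str) -> float:
    """Stand-in scorer: fraction of query words that appear in the document."""
    query_words = set(query.lower().split())
    doc_words = set(document.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & doc_words) / len(query_words)


def rerank(
    query: str,
    documents: List[str],
    score_fn: Callable[[str, str], float] = overlap_score,
    top_n: int = 3,
) -> List[Tuple[int, float]]:
    """Return (original_index, score) pairs, highest score first, cut to top_n."""
    scored = [(i, score_fn(query, doc)) for i, doc in enumerate(documents)]
    scored.sort(key=lambda pair: pair[1], reverse=True)  # stable for ties
    return scored[:top_n]


if __name__ == "__main__":
    results = [
        "How to update billing details",
        "Steps to reset your account password",
        "Password policy for new account holders",
    ]
    for index, score in rerank("reset account password", results, top_n=2):
        print(index, round(score, 2), results[index])
```

Returning the original indices rather than the documents keeps the reranker independent of how the upstream search system stores its results.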

Key Features of Rerank 3.5 Include:

Enhanced Reasoning Capabilities: Rerank 3.5 has enhanced reasoning capabilities and can better understand complex user queries. By analyzing the semantics and context of the query, it can accurately identify the user’s intention and return the most relevant results. The enhanced reasoning capabilities enable the model to accurately interpret user queries, even when they are complex or ambiguous. This leads to improved search results and increased user satisfaction. The ability to understand the underlying intent of a query is crucial for delivering relevant and accurate information.
Diverse Data Support: Rerank 3.5 is compatible with various data types, including long documents, emails, tables, JSON, and code. This means that it can process data from various sources and extract useful information from it. The ability to process diverse data types allows the model to access and analyze information from a wide range of sources, providing a comprehensive view of the available data. This is particularly important for enterprise search, where data may be stored in various formats and locations.
Improved Multilingual Support: Rerank 3.5 supports more than 100 languages, including major business languages such as English, Arabic, Chinese, French, German, Hindi, Japanese, Korean, Portuguese, Russian, and Spanish. This enables it to provide high-quality search services to global users. The extensive multilingual support ensures that users can search for information in their native language and receive relevant results. This is essential for global organizations, where users may be located in different regions and speak different languages.
Higher Search Accuracy: In tests on financial data, Rerank 3.5 outperformed hybrid search by 23.4% and BM25 by 30.8%. BM25 is a widely used ranking function in search engines and information retrieval systems that scores the relevance of a document to a given search query. These results indicate that Rerank 3.5 can deliver more accurate and relevant search results than traditional search methods.
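
For readers unfamiliar with the BM25 baseline mentioned above, the following is a minimal sketch of one common variant of the formula (Okapi BM25 with a non-negative IDF term). The parameter defaults `k1 = 1.5` and `b = 0.75` are conventional choices, not values taken from the tests cited in this article, and the whitespace-tokenized corpus is invented for illustration.

```python
import math
from collections import Counter
from typing import List


def bm25_scores(
    query: List[str],
    corpus: List[List[str]],
    k1: float = 1.5,
    b: float = 0.75,
) -> List[float]:
    """Score every tokenized document in corpus against the tokenized query."""
    n_docs = len(corpus)
    if n_docs == 0:
        return []
    avg_len = sum(len(doc) for doc in corpus) / n_docs

    # Document frequency: in how many documents each term occurs.
    doc_freq: Counter = Counter()
    for doc in corpus:
        doc_freq.update(set(doc))

    scores = []
    for doc in corpus:
        term_freq = Counter(doc)
        score = 0.0
        for term in query:
            tf = term_freq[term]
            if tf == 0:
                continue  # terms absent from the document contribute nothing
            df = doc_freq[term]
            idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
            # Length normalization: longer-than-average documents are penalized.
            norm = tf + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


if __name__ == "__main__":
    corpus = [
        "quarterly revenue report".split(),
        "revenue growth forecast revenue".split(),
        "office holiday schedule".split(),
    ]
    print(bm25_scores("revenue forecast".split(), corpus))
```

BM25 only rewards exact term matches, which is why a model that reads the query and document together can rank semantically related but differently worded passages higher.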

Expanded Language Support: How Rerank 3.5 Supports More Than 100 Languages

Rerank 3.5’s multilingual capabilities are reflected in its ability to understand and process queries in more than 100 languages. It understands not only the literal meaning of a query but also the cultural background and context behind it. For example, if a user searches in Spanish for “mejores restaurantes en Madrid”, Rerank 3.5 understands that the user wants to find the best restaurants in Madrid and returns relevant Spanish search results.

In order to achieve this goal, Rerank 3.5 uses a variety of technologies, including:

Multilingual Training Data: Rerank 3.5 has been trained on a large amount of multilingual data, including various types of text, such as news articles, blog posts, social media posts, and product reviews. The use of multilingual training data allows the model to learn the nuances of different languages and to accurately understand user queries in a variety of languages. This ensures that the model can provide relevant search results, regardless of the language used by the user.
Cross-Lingual Embeddings: Rerank 3.5 uses cross-lingual embedding technology to map words in different languages to the same vector space. This enables the model to understand the semantic relationships between different languages and return relevant cross-lingual search results. Cross-lingual embeddings allow the model to represent words in different languages in a common semantic space, enabling it to understand the relationships between words and concepts across languages. This is crucial for providing accurate and relevant search results in multilingual environments.
Language Detection and Translation: Rerank 3.5 can automatically detect the language of the user query and translate it into English or other supported languages. This enables the model to process queries in various languages and return relevant search results. Automatic language detection and translation allow the model to handle a wide range of user queries, regardless of the language in which they are formulated. This ensures that users can search for information in their native language and receive relevant results.

By adopting these technologies, Rerank 3.5 can provide high-quality search services to global users, regardless of the language they use to search.
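
The idea of a shared vector space can be illustrated with cosine similarity. The sketch below is a toy: the three-dimensional vectors are hand-made for illustration, whereas real embeddings are learned and have far more dimensions. It shows the general principle only and says nothing about Rerank 3.5's internals.

```python
import math
from typing import Dict, List

# Hand-made toy vectors (assumption for illustration, not real model output).
TOY_EMBEDDINGS: Dict[str, List[float]] = {
    "restaurant": [0.90, 0.10, 0.00],   # English
    "restaurante": [0.88, 0.12, 0.02],  # Spanish
    "invoice": [0.05, 0.10, 0.95],      # unrelated concept
}


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    en = TOY_EMBEDDINGS["restaurant"]
    es = TOY_EMBEDDINGS["restaurante"]
    other = TOY_EMBEDDINGS["invoice"]
    print("restaurant vs restaurante:", round(cosine_similarity(en, es), 3))
    print("restaurant vs invoice:", round(cosine_similarity(en, other), 3))
```

When words with the same meaning in different languages land close together, a query in one language can be matched against documents written in another.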

Enhanced Reasoning Capability: How Rerank 3.5 Understands Complex Queries

Rerank 3.5’s reasoning ability is reflected in its ability to understand complex queries and extract useful information from them. For example, if a user searches for “Which tech companies’ stocks have performed better than last year?”, Rerank 3.5 understands that the user wants to find the tech companies whose stocks have performed better than they did last year.

In order to achieve this goal, Rerank 3.5 uses a variety of technologies, including:

Semantic Analysis: Rerank 3.5 uses semantic analysis technology to analyze the semantic structure and context of the query. This enables the model to understand the meaning of the query and identify the user’s intent. Semantic analysis allows the model to understand the relationships between words and concepts in a query, enabling it to accurately interpret the user’s intent. This is crucial for providing relevant and accurate search results.
Entity Recognition: Rerank 3.5 uses entity recognition technology to identify and classify entities in the query, such as companies, locations, and people. This enables the model to relate the query to the relevant entities, understand its context, and return more relevant search results.
Relationship Extraction: Rerank 3.5 uses relationship extraction technology to identify relationships between entities in the query, such as “employee of” or “located in.” This enables the model to understand how the entities in the query are connected and to return more relevant search results.

By adopting these technologies, Rerank 3.5 can understand complex queries and return relevant search results, thereby improving user search efficiency and satisfaction.

How OCI Customers Can Leverage These Models:

OCI customers can leverage these Cohere models in a variety of ways, including:

Instant Integration: These models can be seamlessly accessed via chat interfaces, APIs, or dedicated endpoints, without worrying about infrastructure management. This makes it easy for customers to integrate these models into their applications without complex configuration and deployment. The ease of integration allows customers to quickly and easily leverage the power of these models without the need for specialized expertise.
Simplified AI Development: The OCI Generative AI service provides a comprehensive set of tools and services to help customers simplify AI development processes. These tools and services include:
- Data Preparation: The OCI Generative AI service provides a range of data preparation tools to help customers clean, transform, and prepare data for use in AI model training and inference. These tools include data cleaning, data transformation, and data validation capabilities. These tools help to ensure that the data used for AI model training is accurate and consistent, leading to improved model performance.
- Model Training: The OCI Generative AI service provides a range of model training tools to help customers train their own AI models. These tools support various model types and frameworks, such as TensorFlow, PyTorch, and Scikit-learn, giving customers a flexible environment in which to customize models to their specific needs.
- Model Deployment: The OCI Generative AI service provides a range of model deployment tools to help customers deploy trained AI models to production environments. These tools make it easy to deploy models to a variety of platforms, including cloud, on-premises, and edge environments.
- Model Monitoring: The OCI Generative AI service provides a range of model monitoring tools to help customers monitor the performance and accuracy of AI models. These tools provide real-time insights into model performance, allowing customers to identify and address issues before they impact business operations.
Simplified RAG Workflow: Use Rerank 3.5 to refine retrieved results and Command A to generate content, making complex RAG processes more efficient and streamlined. Together the two models provide a powerful solution for building RAG-based applications: Rerank 3.5 re-orders the retrieved passages so that the most relevant information reaches Command A, which then generates the response presented to the user.
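
One common way to combine a reranker and a generator in a RAG flow is retrieve, rerank, then generate. The sketch below shows that composition under stated assumptions: `retrieve`, `rerank`, and `generate` are hypothetical callables that an application would back with its search system, a rerank model, and a chat model respectively. No real service is called here.

```python
from typing import Callable, List


def build_grounded_prompt(question: str, passages: List[str]) -> str:
    """Number the passages so the generated answer can refer back to them."""
    numbered = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer the question using only the passages below.\n\n"
        f"Passages:\n{numbered}\n\n"
        f"Question: {question}"
    )


def rag_answer(
    question: str,
    retrieve: Callable[[str], List[str]],
    rerank: Callable[[str, List[str]], List[str]],
    generate: Callable[[str], str],
    top_n: int = 3,
) -> str:
    """Retrieve candidates, keep the top_n after reranking, then generate."""
    candidates = retrieve(question)
    best = rerank(question, candidates)[:top_n]
    return generate(build_grounded_prompt(question, best))


if __name__ == "__main__":
    # Stub components so the sketch runs without any external service.
    def fake_retrieve(question: str) -> List[str]:
        return ["Refunds take 5 days.", "Offices close at 6 pm.", "Refunds need a receipt."]

    def fake_rerank(question: str, docs: List[str]) -> List[str]:
        return sorted(docs, key=lambda d: "refund" in d.lower(), reverse=True)

    def fake_generate(prompt: str) -> str:
        return f"(model output for a prompt of {len(prompt)} characters)"

    print(rag_answer("How do refunds work?", fake_retrieve, fake_rerank, fake_generate, top_n=2))
```

Passing the three stages in as callables keeps the flow testable and makes it straightforward to swap in a different retriever or model later.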

Diversity of Application Scenarios:

These models can be applied to a variety of different enterprise application scenarios, including:

Customer Service: Command A and Rerank 3.5 can be used to build intelligent customer service chatbots that answer customer questions, resolve issues, and provide personalized service. These chatbots can automate routine customer interactions, freeing up human agents to focus on more complex issues.
Content Generation: Command A can be used to generate various types of text content, such as news articles, blog posts, product descriptions, and social media posts. This can help organizations to create high-quality content quickly and efficiently.
Search: Rerank 3.5 can be used to improve the accuracy and efficiency of enterprise search, helping users quickly find the information they need. This can improve productivity and reduce the time spent searching for information.
Data Analysis: Command A and Rerank 3.5 can be used to analyze various types of data, extract useful information from them, and help companies make better decisions. This can help organizations to identify trends, patterns, and insights that would otherwise be difficult to detect.
Knowledge Management: Intelligent knowledge bases can be built, allowing employees to quickly find the information they need and improve work efficiency. This can improve collaboration and knowledge sharing within the organization.

By providing high-performance, versatile, and scalable AI models, the OCI Generative AI service empowers companies to build a variety of innovative AI solutions, thereby enhancing their competitiveness and business value.

For integration details and pricing information, please refer to our Generative AI service documentation or contact your Oracle representative.

updated at 2025-05-16
