Google’s Strides in Healthcare AI
Google unveiled a suite of health AI updates at its annual ‘The Check Up’ event. The updates range from better answers to health-related queries in Google Search to new ‘open’ AI models intended to make AI-powered drug discovery more efficient.
Enhancing Health Information Access via Google Search
Google is using AI together with its quality and ranking systems to extend ‘knowledge panel’ answers to a much wider range of health topics. The expansion also adds support for healthcare queries in more languages, including Spanish, Portuguese, and Japanese, starting on mobile. Search already showed knowledge panel answers for common conditions such as influenza or the common cold; this update substantially increases the number of topics the panels cover.
Google is also introducing a new Search feature called ‘What People Suggest.’ It surfaces information shared by people who have had similar medical experiences, so users can quickly find perspectives from others with the same condition, with links for further exploration. ‘What People Suggest’ is currently available on mobile devices in the United States.
Streamlining Medical Records with New APIs
Google has also launched new medical records application programming interfaces (APIs) worldwide for Health Connect, its health data platform for Android devices. These APIs let applications read and write medical record data, including allergies, medications, immunizations, and lab results, in the standardized FHIR format. With this addition, Health Connect supports more than 50 data types spanning activity, sleep, nutrition, vital signs, and now medical records, which lets users’ everyday health data sit alongside information from their healthcare providers.
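To make the FHIR point concrete, the sketch below builds a minimal allergy record shaped like a FHIR R4 AllergyIntolerance resource and serializes it to JSON. It is written in Python purely for illustration and does not use the Health Connect API, which is an Android API. The helper names, the patient identifier, and the example values are hypothetical, and only a few of the resource’s fields are filled in.

```python
import json


def make_allergy_record(patient_id: str, substance: str) -> dict:
    """Build a minimal dict shaped like a FHIR R4 AllergyIntolerance resource.

    Only a handful of fields are populated; a real record would normally
    carry coded values and additional metadata.
    """
    return {
        "resourceType": "AllergyIntolerance",
        "patient": {"reference": f"Patient/{patient_id}"},
        "code": {"text": substance},
    }


def resource_type(fhir_json: str) -> str:
    """Return the resourceType field of a serialized FHIR resource."""
    return json.loads(fhir_json)["resourceType"]


# Hypothetical example values, used only to show the shape of the data.
record = make_allergy_record("example-123", "Penicillin")
payload = json.dumps(record, indent=2)
print(payload)
```

Because every FHIR resource names its own type in `resourceType`, an app reading such data can route allergies, medications, immunizations, and lab results to the right handler by checking that single field.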
The AI Co-Scientist: A Virtual Research Partner
Google also presented the ‘AI co-scientist,’ a new system built on Gemini 2.0 and envisioned as a ‘virtual scientific collaborator’ for researchers. It is designed to help them work through extensive scientific literature, large datasets, and complex research papers in order to generate new hypotheses and speed up their research. Google is collaborating with institutions including Imperial College London, Houston Methodist, and Stanford University to explore practical applications of the tool and intends to start a trusted tester program.
TxGemma: Accelerating Drug Discovery
Google also introduced TxGemma, a collection of Gemma-based open models intended to make AI-driven drug discovery more efficient. TxGemma can understand both ordinary text and the structures of various therapeutic entities, including small molecules, chemicals, and proteins. Google says TxGemma will be released in the near future.
Capricorn AI Tool: Advancing Pediatric Oncology
In collaboration with the Princess Máxima Center for Pediatric Oncology in the Netherlands, Google has been developing an AI tool named Capricorn. The project reflects Google’s effort to apply AI to specialized medical fields such as pediatric oncology.
AI’s Broader Impact on Healthcare
Google has previously highlighted AI’s positive influence on global health outcomes. The company has developed AI models to aid in the detection of diseases such as breast cancer, lung cancer, and diabetic retinopathy. In May 2024, Google announced Med-Gemini, a family of Gemini models fine-tuned for multimodal medical applications. Further, in June 2024, Google introduced the Personal Health Large Language Model for mobile and wearable devices. This fine-tuned version of Gemini is designed to interpret sensor data and provide personalized insights and recommendations regarding an individual’s sleep and fitness patterns.
xAI’s Acquisition of Hotshot: A Move into Generative AI Video
Elon Musk’s AI venture, xAI, has acquired Hotshot, a startup specializing in AI-powered video generation tools. The acquisition positions xAI to compete with OpenAI’s Sora, a leading platform in generative AI video. Hotshot announced on its website that it began phasing out new video creation on March 14th and that existing customers have until March 30th to download the videos they created. The move signals xAI’s intent to expand beyond chatbots into AI-generated video, and Hotshot’s technology could eventually give xAI’s users advanced video creation tools.
Grok 3: xAI’s Ambitious AI Chatbot
On February 19th, xAI unveiled Grok 3, the latest iteration of its chatbot, which Elon Musk proclaimed ‘the smartest AI on Earth.’ The company subsequently announced the beta release of two reasoning models, Grok 3 (Think) and Grok 3 Mini (Think). xAI stated that Grok 3, trained on its Colossus supercluster with ten times the computational power of previous state-of-the-art models, shows substantial improvements in reasoning, mathematics, coding, world knowledge, and instruction-following tasks. The emphasis on reasoning and mathematics suggests a focus on AI that can handle complex problem-solving.
Mistral AI’s Mistral Small 3.1: Compact and Powerful
On March 17th, French AI startup Mistral AI introduced Mistral Small 3.1, a new open-source model. The company asserts that it outperforms comparable models such as Google’s Gemma 3 and OpenAI’s GPT-4o Mini, intensifying competition in a market largely dominated by US tech giants and underlining Mistral AI’s ambition to become a major player in the global AI landscape.
Mistral Small 3.1 handles both text and images with 24 billion parameters, far fewer than leading proprietary models, while matching or exceeding their performance according to the company. Mistral AI describes it as the first open-source model to not only meet but surpass the performance of leading small proprietary models across multiple dimensions, a result that challenges the assumption that larger models are always superior.
Building on Mistral Small 3, the new model offers improved text performance, multimodal understanding, and an expanded context window of up to 128,000 tokens. Mistral AI claims inference speeds of 150 tokens per second, which suits applications that need rapid responses. The larger context window lets the model handle longer and more complex inputs.
Versatility and Accessibility of Mistral Small 3.1
Mistral Small 3.1 is designed to run on hardware as accessible as a single RTX 4090 or a Mac with 32GB RAM, which makes it well suited to on-device applications and puts it within reach of developers and researchers with limited resources. It can also be fine-tuned for specialized domains to create accurate subject matter experts, which is particularly useful in fields such as legal advice, medical diagnostics, and technical support.
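A rough back-of-the-envelope calculation shows what the hardware claim implies. The sketch below estimates the memory needed for the weights of a 24-billion-parameter model at a few bit widths and compares it with the 24 GB of VRAM on an RTX 4090. The bit widths are illustrative assumptions rather than settings stated by Mistral AI, and the estimate covers weights only, ignoring activations and the key-value cache.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


PARAMS = 24e9          # parameter count reported for Mistral Small 3.1
RTX_4090_VRAM_GB = 24  # VRAM available on a single RTX 4090

# Illustrative precisions: 16-bit, 8-bit, and 4-bit weights.
for bits in (16, 8, 4):
    need = weight_memory_gb(PARAMS, bits)
    # Strict comparison: weights that fill the card leave no room for anything else.
    verdict = "fits" if need < RTX_4090_VRAM_GB else "does not fit"
    print(f"{bits}-bit weights: {need:.0f} GB ({verdict} in {RTX_4090_VRAM_GB} GB)")
```

Under these assumptions, 16-bit weights alone need about 48 GB, so running on a single consumer GPU would likely depend on a reduced-precision (quantized) version of the model.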
The new model targets a broad range of enterprise and consumer applications that require multimodal understanding. Potential use cases include document verification, diagnostics, on-device image processing, visual inspections for quality control, object detection in security systems, image-based customer support, and general-purpose assistance.
Mistral OCR: Advanced Document Understanding
Earlier in March, Mistral AI announced Mistral OCR, which the company touts as the ‘World’s best document understanding API.’ Mistral OCR is an Optical Character Recognition (OCR) API capable of extracting text, tables, equations, and images from complex documents. Mistral AI believes the technology will change how organizations process and use large information repositories.
According to the company, Mistral OCR processes up to 2000 pages per minute, supports multilingual and multimodal inputs, and delivers structured outputs such as JSON for integration into AI workflows. The company’s internal tests indicate that Mistral OCR leads the market in text extraction accuracy, especially for scanned documents, mathematical content, and multilingual text. Unlike traditional OCR solutions, it also extracts embedded images, which makes it well suited to scientific research, regulatory filings, and historical document digitization.
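To put the quoted throughput in perspective, the sketch below estimates how long a large digitization job would take at a constant 2000 pages per minute. The archive size is a hypothetical figure, and the calculation assumes the peak rate is sustained, which real workloads may not achieve.

```python
import math

PAGES_PER_MINUTE = 2000  # peak rate claimed by Mistral AI


def minutes_to_process(total_pages: int, pages_per_minute: int = PAGES_PER_MINUTE) -> int:
    """Whole minutes needed to process total_pages at a constant rate."""
    return math.ceil(total_pages / pages_per_minute)


archive_pages = 1_000_000  # hypothetical archive size
minutes = minutes_to_process(archive_pages)
print(f"{archive_pages} pages: about {minutes} minutes ({minutes / 60:.1f} hours)")
```

At the claimed rate, a million-page archive would take roughly 500 minutes, or a little over eight hours, of processing time.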
Mistral AI reports that Mistral OCR is already helping enterprises and research institutions digitize literature, streamline customer service, and preserve historical archives. It is also helping companies convert technical literature, engineering drawings, lecture notes, presentations, regulatory filings, and more into indexed, answer-ready formats. Mistral OCR is available as a free trial on Le Chat, and the company anticipates further improvements to the model in the coming weeks.