1. Nvidia
The relentless pursuit of more advanced AI systems continues to fuel significant investments from large language model developers. However, one company is already reaping the rewards of this AI revolution: Nvidia. Having sparked the AI race with its dominant graphics processing units (GPUs), Nvidia is now ideally positioned with its groundbreaking Blackwell processor and platform to support the quest for human-level intelligence.
Blackwell significantly outperforms its predecessor, the H100, delivering up to 2.5 times the performance on large model-training tasks while consuming considerably less energy. Major data center operators and AI labs, including industry giants like Google, Meta, Microsoft, OpenAI, Tesla, and xAI, have committed to purchasing hundreds of thousands of Blackwell GPUs.
While recent models from Chinese companies like DeepSeek and Alibaba have demonstrated impressive capabilities using older, less powerful Nvidia GPUs, Nvidia isn’t simply resting on its past achievements. The company is actively developing platforms for a wide range of applications, spanning drug discovery (Clara for Biopharma), autonomous vehicles (Drive AGX), video production (Holoscan), and digital twins (Omniverse). By fostering AI progress across a diverse spectrum of real-world scenarios, Nvidia is strategically positioning itself for sustained growth, even if future models exhibit a reduced reliance on sheer computational power. This proactive approach ensures Nvidia remains at the forefront of the AI revolution, regardless of how the technology evolves.
2. OpenAI
Since 2019, OpenAI has consistently improved its models by expanding training data and computing resources, a strategy that has been widely adopted across the industry. However, as diminishing returns from this scaling approach became evident, OpenAI recognized the need for a new pathway to achieving AGI – models that surpass human intelligence in most tasks.
OpenAI’s solution came in the form of the o1 model. Instead of solely focusing on scaling up resources during pretraining, OpenAI engineered o1 to allocate more time and computing power during inference, the phase where the model is actively deployed and responding to user prompts. During this process, o1 gathers and retains contextual information, both from the user and relevant data sources. It employs a trial-and-error methodology to determine the optimal path to an answer. The result is the generation of PhD-level responses to complex questions, propelling o1 to the top of performance benchmark rankings.
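For developers, this difference shows up directly at the API level: a reasoning model spends extra output tokens on hidden deliberation before it answers, so callers must budget for that inference-time compute. Below is a minimal sketch using the OpenAI Python SDK; the model name and token budget are illustrative assumptions, not OpenAI guidance.

```python
# Minimal sketch: calling a reasoning-class model via the OpenAI Python SDK.
# The model name and token budget are assumptions for illustration.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="o1-mini",  # assumed reasoning-model name; use what your account offers
    messages=[
        {"role": "user", "content": "Prove that the sum of two odd integers is even."}
    ],
    # Reasoning models consume hidden 'thinking' tokens at inference time, so
    # the completion budget must cover deliberation as well as the final answer.
    max_completion_tokens=4096,
)
print(response.choices[0].message.content)
```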
OpenAI offers ‘preview’ and ‘mini’ versions of o1 to ChatGPT Plus subscribers. Additionally, a premium service called ChatGPT Pro provides unlimited access to the full o1 model for $200 per month. In December 2024, OpenAI unveiled o1’s successor, o3, and in February 2025, granted paid users access to o3-mini, a smaller, faster variant optimized for science, math, and coding. The most profound impact of OpenAI’s new reasoning models is their validation of scaling up computing at inference time as a promising avenue for further breakthroughs in intelligence on the road to AGI. This shift in emphasis from pretraining to inference-time reasoning represents a significant change of direction in AI development.
3. Google DeepMind
The foundational research that paved the way for today’s chatbots originated at Google in the late 2010s. Google had developed a large language model-powered chatbot well before the emergence of ChatGPT. However, concerns regarding safety, privacy, and legal implications reportedly led to a cautious approach that delayed its public release, leaving Google initially lagging in the AI race that ChatGPT’s launch set off.
The release of Google DeepMind’s Gemini 2.0 in 2024 signaled Google’s definitive resurgence. Gemini 2.0 represents the first mass-market AI model that is inherently multimodal, capable of processing and generating images, video, audio, and computer code with the same fluency as text. This capability enables the model to analyze and reason about video clips, or even live video feeds from a phone camera, with remarkable speed and accuracy.
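A rough sketch of what native multimodality looks like in practice, using Google’s google-generativeai Python SDK (the model identifier is an assumption and may differ from what Google currently serves):

```python
# Sketch: one prompt mixing text and an image via the google-generativeai SDK.
# The model identifier is an assumption; check Google's docs for current names.
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-2.0-flash")  # assumed model name

frame = Image.open("video_frame.png")  # e.g., a still pulled from a clip
response = model.generate_content(
    ["What is happening in this frame, and what is likely to happen next?", frame]
)
print(response.text)
```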
Gemini also stands out for its ability to control other Google services, such as Maps and Search. This integration showcases Google’s strategic advantage, combining its AI research with its established information and productivity tools. Gemini is among the first AI models demonstrating autonomous operation and the capacity to reason through complex problems on behalf of the user. The Gemini 2.0 Flash Thinking Experimental model even provides users with insights into the thought process employed to arrive at an answer. Furthermore, in December, Google introduced Project Mariner, a Gemini-based agentic AI feature designed to perform tasks like online grocery shopping autonomously. This demonstrates Google’s commitment to developing AI agents that can act independently in the real world.
4. Anthropic
The primary applications of generative AI have so far centered around text writing, summarization, and image generation. The next evolutionary step involves equipping large language models with reasoning abilities and the capacity to utilize tools. Anthropic’s ‘Computer Use’ model provided an early glimpse into this future.
Beginning with an updated Claude 3.5 Sonnet in 2024, Anthropic’s models can perceive on-screen activity, including internet content. They can manipulate a cursor, click buttons, and input text. A demonstration video showcased Claude’s ability to complete a form using information from websites open in browser tabs. The model can accomplish tasks like creating a personal website or organizing the logistics of a day trip, autonomously opening new tabs, running searches, and populating data fields along the way.
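Developers reach this capability through a beta tool type in Anthropic’s API: the model is told about a virtual display and replies with actions (click, type, take a screenshot) for the caller to execute in a loop. The sketch below follows the shape of Anthropic’s 2024 computer-use beta; the version strings are assumptions that may since have been superseded.

```python
# Sketch of Anthropic's computer-use beta. Version strings below come from the
# 2024 beta and are assumptions; newer releases may use different identifiers.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.beta.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    tools=[{
        "type": "computer_20241022",   # built-in tool schema for screen control
        "name": "computer",
        "display_width_px": 1280,
        "display_height_px": 800,
    }],
    messages=[{"role": "user",
               "content": "Find round-trip trains to Boston for Saturday."}],
    betas=["computer-use-2024-10-22"],
)
# The reply contains tool_use blocks (clicks, keystrokes, screenshots) that the
# calling program executes, then feeds results back in the next message.
print(response.content)
```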
While the model currently operates at a slower pace and may not always produce the correct answer, rapid improvements are anticipated as Anthropic identifies and addresses its limitations. Google’s aforementioned Project Mariner followed Anthropic’s lead in December, and OpenAI introduced its own computer-use model, Operator, in January 2025. In February 2025, Anthropic unveiled its next major iteration, Claude 3.7 Sonnet, a hybrid model that can automatically engage an extended reasoning mode for challenging queries. This progression highlights the rapid advances being made in AI models that can interact with and control computer systems.
5. Microsoft
The development of Microsoft’s Phi models stemmed from a fundamental question posed by the company’s researchers in 2023: ‘What is the smallest model size that can exhibit signs of emergent intelligence?’ This inquiry marked a pivotal moment in the evolution of ‘small language models,’ models designed for optimal performance in scenarios with limited memory, processing power, or connectivity, where rapid response times are crucial.
Throughout 2024, Microsoft released two generations of small models that displayed reasoning and logic capabilities not explicitly incorporated during training. In April, the company unveiled a series of Phi-3 models that excelled in language, reasoning, coding, and math benchmarks, likely due to their training on synthetic data generated by significantly larger and more capable LLMs. Variants of the open-source Phi-3 were downloaded over 4.5 million times on Hugging Face during 2024.
In late 2024, Microsoft launched its Phi-4 small language models, which surpassed the Phi-3 models in reasoning-focused tasks and even outperformed OpenAI’s GPT-4o on the GPQA (graduate-level science questions) and MATH benchmarks. Microsoft released the model under an open-source, open-weights license, empowering developers to create edge models or applications for phones and laptops. In less than a month, Phi-4 garnered 375,000 downloads on Hugging Face, demonstrating the growing demand for smaller, more efficient AI models that can be deployed on a wider range of devices.
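Because the weights are open, running a Phi model locally takes only a few lines with Hugging Face’s transformers library. A minimal sketch, assuming the microsoft/phi-4 checkpoint name and enough memory for a roughly 14-billion-parameter model:

```python
# Minimal sketch: running an open-weights Phi model locally with transformers.
# Assumes the "microsoft/phi-4" checkpoint on Hugging Face and that the
# accelerate package is installed for automatic device placement.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="microsoft/phi-4",
    torch_dtype="auto",   # use the dtype stored in the checkpoint
    device_map="auto",    # spread layers across available GPU/CPU memory
)

messages = [{"role": "user",
             "content": "Why is the sky blue? Answer in two sentences."}]
print(generator(messages, max_new_tokens=128)[0]["generated_text"])
```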
6. Amazon
Amazon AWS recently introduced Trainium2, a new version of its Trainium processor for AI, potentially challenging the dominance of Nvidia GPUs in specific settings. Trainium2 is engineered to deliver the massive computing power required for training the largest generative AI models and for inference-time operations after model deployment. AWS claims that Trainium2 delivers 30% to 40% better price performance than comparable GPU-based instances.
Trainium2 addresses the power and software integration shortcomings observed in the first Trainium chip, positioning Amazon to potentially close the gap with Nvidia. (It’s worth noting that AWS itself remains heavily reliant on Nvidia for GPUs.) Displacing Nvidia is a formidable challenge due to customer lock-in with Nvidia’s CUDA software layer, which provides researchers with granular control over how their models utilize the chip’s resources. Amazon offers its own low-level alternative, the Neuron Kernel Interface (NKI), which, like CUDA, grants researchers fine-grained control over the kernels that run on the chip.
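AWS’s published NKI examples give a flavor of this control: kernels are written in Python, compiled for the chip, and move tiles between device memory and on-chip buffers explicitly. The sketch below is modeled loosely on those examples; the module paths and function signatures are assumptions and may not match the current SDK exactly.

```python
# Rough sketch of an NKI-style elementwise kernel, loosely modeled on AWS's
# published examples. Module paths and signatures are assumptions; consult the
# AWS Neuron documentation for the authoritative API.
import neuronxcc.nki as nki
import neuronxcc.nki.language as nl

@nki.jit
def tensor_add_kernel(a_input, b_input):
    # Allocate the output tensor in device (HBM) memory.
    c_output = nl.ndarray(a_input.shape, dtype=a_input.dtype, buffer=nl.shared_hbm)
    # Explicitly load operand tiles from HBM into on-chip memory...
    a_tile = nl.load(a_input)
    b_tile = nl.load(b_input)
    # ...compute on-chip, then store the result back out.
    nl.store(c_output, a_tile + b_tile)
    return c_output
```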
It is important to note that Trainium2 has yet to be tested at scale. AWS is currently constructing a server cluster with 400,000 Trainium2 chips for Anthropic, which could provide valuable insights into optimizing the performance of its AI chips in large-scale deployments. The outcome of this large-scale deployment will be crucial in determining the long-term viability of Trainium2 as a competitor to Nvidia’s GPUs.
7. Arm
The British semiconductor designer Arm has long been a key provider of the architecture used in chips powering small devices like phones, sensors, and IoT hardware. This role takes on heightened significance in the emerging era where edge device chips will execute AI models. Data centers will also play a crucial role in this evolution, often handling some or all of the most demanding AI processing and delivering results to edge devices.
As data centers proliferate globally, their electrical power consumption will become an increasingly pressing concern. This factor contributes to the emphasis on efficiency in Arm’s latest Neoverse CPU architecture. It boasts a 50% performance improvement over previous generations and 20% better performance per watt compared to processors utilizing competing x86 architectures, according to the company.
Arm reports that Amazon, Microsoft, Google, and Oracle have all adopted Arm Neoverse for both general-purpose computing and CPU-based AI inference and training. For instance, in 2024, Microsoft announced that its first custom silicon designed for the cloud, the Cobalt 100 processor, was built on Arm Neoverse. Some of the largest AI data centers will rely on Nvidia’s Grace Hopper Superchip, which combines a Hopper GPU and a Grace CPU based on Neoverse. Arm is slated to launch its own CPU this year, with Meta as one of its initial customers. This expansion into the data center market positions Arm as a key player in the future of AI infrastructure, both at the edge and in the cloud.
8. Gretel
Over the past year, AI companies have experienced diminishing returns from training their models with ever-increasing volumes of data scraped from the web. Consequently, they have shifted their focus from the sheer quantity of training data to its quality. This has led to increased investment in non-public and specialized content licensed from publisher partners. AI researchers also need to address gaps or blind spots within their human-generated or human-annotated training data. For this purpose, they have increasingly turned to synthetic training data generated by specialized AI models.
Gretel gained prominence in 2024 by specializing in the creation and curation of synthetic training data. The company announced the general availability of its flagship product, Gretel Navigator, which enables developers to use natural language or SQL prompts to generate, augment, edit, and curate synthetic training datasets for fine-tuning and testing. The platform has already attracted a community of over 150,000 developers who have synthesized more than 350 billion pieces of training data.
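Gretel wraps this workflow in its own platform, but the core idea of prompt-driven synthetic data is easy to illustrate. The sketch below is a generic illustration using a plain LLM call through the OpenAI SDK; it is not Gretel’s API, and the model name and prompt are assumptions.

```python
# Generic illustration of prompt-driven synthetic data generation.
# This is NOT Gretel's API; it sketches the underlying idea with a plain LLM call.
import json
from openai import OpenAI

client = OpenAI()

prompt = (
    "Return a JSON object with a 'tickets' key holding 5 synthetic "
    "customer-support tickets. Fields: ticket_id, product, complaint, sentiment. "
    "Cover at least two products and include one non-English complaint."
)
response = client.chat.completions.create(
    model="gpt-4o-mini",                      # any capable general model works here
    messages=[{"role": "user", "content": prompt}],
    response_format={"type": "json_object"},  # request machine-readable output
)
records = json.loads(response.choices[0].message.content)["tickets"]
print(records)
```

A dedicated platform layers evaluation, privacy guarantees, and scale on top of this basic loop; the sketch shows only the generation step.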
Other industry players have taken notice of Gretel’s capabilities. Gretel partnered with Google to make its synthetic training data readily accessible to Google Cloud customers. A similar partnership with Databricks was announced in June, granting Databricks’ enterprise customers access to synthetic training data for their models running within the Databricks cloud. These partnerships highlight the growing importance of synthetic data in addressing the limitations of traditional data sources and improving the performance and robustness of AI models.
9. Mistral AI
Mistral AI, France’s contender in the generative AI arena, has kept steady pressure on OpenAI, Anthropic, and Google at the frontier of AI model development. In 2024, Mistral released a series of new models incorporating significant technological advances while demonstrating rapid business growth through both direct sales of its APIs and strategic partnerships.
Earlier in 2024, the company introduced a pair of open-source models called Mixtral, notable for their innovative use of the ‘mixture of experts’ architecture, in which only a specialized subset of the model’s parameters is engaged to handle a given query, enhancing efficiency. In July 2024, Mistral announced Mistral Large 2, which, at 123 billion parameters, showcased significant improvements in code generation, math, reasoning, and function calling. The French company also released Ministral 3B and Ministral 8B, smaller models designed to run on laptops or phones that can hold roughly 50 pages of contextual information supplied by the user.
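The routing mechanics are easy to see in miniature. Below is a toy mixture-of-experts layer in PyTorch, written for illustration rather than drawn from Mistral’s implementation: a learned router scores the experts for each token, and only the top-k actually run.

```python
# Toy sparse mixture-of-experts layer (illustrative; not Mistral's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    def __init__(self, dim=512, n_experts=8, k=2):
        super().__init__()
        # Each expert is an ordinary feed-forward block.
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(n_experts)
        ])
        self.router = nn.Linear(dim, n_experts)  # learned per-token gate
        self.k = k

    def forward(self, x):                        # x: (tokens, dim)
        weights, idx = self.router(x).topk(self.k, dim=-1)
        weights = F.softmax(weights, dim=-1)     # normalize the top-k scores
        out = torch.zeros_like(x)
        # Only the k selected experts run for each token; the rest stay idle,
        # which is where the efficiency gain comes from.
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

layer = MoELayer()
tokens = torch.randn(16, 512)
print(layer(tokens).shape)  # torch.Size([16, 512])
```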
Mistral has achieved success in Europe by positioning itself as a low-cost and flexible alternative to U.S. AI companies like OpenAI. It also continued its expansion into the U.S. enterprise market during 2024. In June, the company secured a $640 million funding round, led by the venture capital firm General Catalyst, raising Mistral’s valuation to approximately $6.2 billion. This rapid growth and significant funding underscore Mistral’s position as a major player in the global AI landscape.
10. Fireworks AI
Fireworks offers a custom runtime environment that streamlines the often-complex engineering work associated with building infrastructure for AI deployments. Using the Fireworks platform, enterprises can integrate any of over 100 AI models and then customize and fine-tune them for their specific use cases.
The company introduced new products during 2024 that position it to capitalize on key trends in the AI industry. First, developers have become increasingly focused on the responsiveness of AI-powered models and applications. Fireworks debuted FireAttention V2, optimization and quantization software that accelerates model performance and reduces serving latency. Second, AI systems are increasingly evolving into ‘pipelines’ that invoke various models and tools via APIs. The new FireFunction V2 software acts as an orchestrator for all components within these increasingly complex systems, particularly as enterprises deploy more autonomous AI applications.
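Because Fireworks exposes an OpenAI-compatible endpoint, an orchestration step looks like standard function calling pointed at a different base URL. In this sketch the base URL follows Fireworks’ documented pattern, while the model path and the example tool are assumptions for illustration:

```python
# Sketch: function calling against Fireworks' OpenAI-compatible endpoint.
# The model path and the example tool are assumptions for illustration.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",
    api_key="YOUR_FIREWORKS_API_KEY",
)

tools = [{
    "type": "function",
    "function": {
        "name": "get_order_status",  # hypothetical component in a larger pipeline
        "description": "Look up the shipping status of an order.",
        "parameters": {
            "type": "object",
            "properties": {"order_id": {"type": "string"}},
            "required": ["order_id"],
        },
    },
}]

response = client.chat.completions.create(
    model="accounts/fireworks/models/firefunction-v2",  # assumed model path
    messages=[{"role": "user", "content": "Where is order A-1234?"}],
    tools=tools,
)
# The orchestrating code inspects tool_calls and dispatches to the right service.
print(response.choices[0].message.tool_calls)
```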
Fireworks reports that its revenue grew 600% in 2024. Its customer base includes prominent companies such as Verizon, DoorDash, Uber, Quora, and Upwork. This strong growth and impressive customer list demonstrate the value of Fireworks’ platform in simplifying and accelerating the deployment of AI models and applications.
11. Snorkel AI
Enterprises have come to realize that the effectiveness of their AI systems is directly tied to the quality of their data. Snorkel AI has built a thriving business by assisting enterprises in preparing their proprietary data for use in AI models. The company’s Snorkel Flow AI data development platform provides a cost-efficient method for companies to label and curate their proprietary data, enabling its use in customizing and evaluating AI models for their specific business needs.
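Snorkel Flow is a commercial platform, but the programmatic-labeling idea behind it is visible in the open-source Snorkel library that the company grew out of: developers write heuristic labeling functions, and a label model reconciles their noisy, overlapping votes into training labels.

```python
# Programmatic labeling with the open-source snorkel library (the research
# project behind Snorkel Flow); illustrative, not the commercial product.
import pandas as pd
from snorkel.labeling import labeling_function, PandasLFApplier
from snorkel.labeling.model import LabelModel

SPAM, NOT_SPAM, ABSTAIN = 1, 0, -1

@labeling_function()
def lf_mentions_prize(x):
    return SPAM if "prize" in x.text.lower() else ABSTAIN

@labeling_function()
def lf_short_message(x):
    return NOT_SPAM if len(x.text.split()) <= 4 else ABSTAIN

df = pd.DataFrame({"text": [
    "Claim your PRIZE now, click here",
    "lunch at noon?",
    "You have won a cash prize!!!",
]})

# Every labeling function votes on every row; the label model then learns
# which functions to trust and emits one probabilistic label per row.
L = PandasLFApplier([lf_mentions_prize, lf_short_message]).apply(df)
label_model = LabelModel(cardinality=2, verbose=False)
label_model.fit(L, n_epochs=100)
print(label_model.predict(L))
```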
In 2024, Snorkel expanded its support to include images, allowing companies to train multimodal AI models and image generators using their own proprietary images. It also incorporated retrieval augmented generation (RAG) into its platform, enabling customers to retrieve only the most relevant segments of information from lengthy documents, such as proprietary knowledge base content, for use in AI training. Snorkel Custom, a new, higher-touch service level, involves Snorkel’s machine learning experts collaborating directly with customers on projects.
Snorkel states that its year-over-year annual bookings doubled during 2024, with triple-digit growth in annual bookings for each of the past three years. Six of the largest banks now utilize Snorkel Flow, according to the company, along with brands like Chubb, Wayfair, and Experian. This strong growth and adoption by major financial institutions highlight the critical role of high-quality data in developing effective and reliable AI systems.
12. CalypsoAI
As AI plays an increasingly crucial role in critical decision-making processes, enterprises are seeking enhanced visibility into the inner workings of models. This need is particularly pronounced in regulated industries that must continuously monitor for bias and other unintended outputs. CalypsoAI was among the first to recognize this emerging requirement and swiftly responded with enhanced explainability features in its AI infrastructure platform.
What sets Calypso apart is the breadth of its observability technology. In 2024, the company launched its AI Security Platform, which safeguards enterprise data by securing, auditing, and monitoring all active generative AI models a company may be using, regardless of the model vendor or whether the model is hosted internally or externally. Calypso also introduced new visualization tools that allow users to observe the logic underlying AI decisions in real time.
The market is responding positively to Calypso’s emphasis on AI observability. The company reports a tenfold increase in revenues during 2024 and anticipates a further fivefold increase in 2025. This rapid growth underscores the growing demand for tools that provide transparency and security for AI deployments, particularly in regulated industries.
13. Galileo
While AI systems exhibit fewer instances of factual hallucinations and biases compared to a year ago, they remain susceptible to these issues. This poses a significant concern for any business utilizing AI, particularly those in regulated sectors like healthcare and banking. AI development teams employ Galileo’s AI platform to measure, optimize, and monitor the accuracy of their models and applications.
In early 2024, following two years of research, Galileo released Luna, a suite of evaluation models trained to identify harmful outputs. These models enable Galileo’s platform to rapidly scrutinize and score an LLM’s work as it assembles the tokens that constitute its response. The process takes approximately 200 milliseconds, leaving sufficient time to flag problems and prevent the output from being displayed to a user. While a standard LLM could perform this evaluation, it would be considerably more expensive. Galileo’s purpose-built models offer superior accuracy, cost-efficiency, and, crucially, speed.
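Independent of Galileo’s specific models, the general pattern is a low-latency gate between generation and display. Here is a minimal, hypothetical sketch of that gate; the scoring function is a stand-in heuristic, not Galileo’s API:

```python
# Hypothetical guardrail gate illustrating the pattern described above.
# `fast_eval_score` is a toy stand-in for a small purpose-built evaluation
# model like Luna; none of this is Galileo's actual API.
import time

LATENCY_BUDGET_S = 0.200   # roughly the 200 ms window cited above
BLOCK_THRESHOLD = 0.5      # assumed cutoff for likely-bad output

def fast_eval_score(prompt: str, draft: str) -> float:
    """Toy heuristic: return an estimated probability the draft is unsupported."""
    return 0.9 if "according to a study" in draft.lower() else 0.1

def gated_reply(prompt: str, draft: str) -> str:
    start = time.perf_counter()
    score = fast_eval_score(prompt, draft)
    elapsed = time.perf_counter() - start
    # If scoring blows the latency budget or the draft looks bad, fall back.
    if elapsed > LATENCY_BUDGET_S or score > BLOCK_THRESHOLD:
        return "I couldn't verify that answer; let me try again."
    return draft

print(gated_reply("Is coffee healthy?",
                  "According to a study, coffee cures all disease."))
```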
Galileo reports a quadrupling of its customer base in 2024, with clients including Twilio, Reddit, Chegg, Comcast, and JPMorgan Chase. The startup also secured a $68 million funding round from investors such as Hugging Face CEO Clément Delangue. This significant growth and investment demonstrate the increasing importance of tools that can detect and mitigate harmful outputs from AI models, ensuring their safe and reliable deployment.
14. Runway
One of the most significant aspirations—and anxieties—surrounding AI is its potential to generate video of sufficient quality to revolutionize the art and economics of filmmaking. The technology made substantial strides toward this future in 2024, with Runway, a New York-based video generation startup, playing a leading role. The release of Runway’s Gen-3 Alpha model in June 2024 garnered widespread acclaim within the AI community for the significantly improved believability of the generated video.
Runway also implemented major enhancements to its tools for controlling the aesthetics of AI video. The model was trained on both images and video and can generate video based on either text or image inputs. The company subsequently released Gen-3 Alpha Turbo, a more cost-efficient and faster version of Gen-3.
Hollywood has been closely monitoring the progress of generative AI, and Runway reports that it has commenced producing custom versions of its models for entertainment industry players. It entered into a formal partnership with Lionsgate Studios in September 2024. Runway developed a custom model for the production company and trained it on Lionsgate’s film catalog. Runway states that the model is intended to assist Lionsgate’s filmmakers, directors, and other creatives in ‘augmenting’ their work while ‘saving time, money, and resources.’ Runway believes its arrangement with Lionsgate could serve as a blueprint for similar collaborations with other production companies. This partnership marks a significant step towards the integration of AI-powered video generation into the mainstream filmmaking process.
15. Cerebras Systems
AI systems, particularly large frontier models, demand immense computing power to operate at scale. This necessitates the interconnection of thousands or millions of chips to distribute the workload. However, the network connections between chips can introduce performance bottlenecks. Cerebras Systems’ technology is designed to harness the speed and efficiency advantages of integrating a vast amount of computing power onto a single, exceptionally large chip.
The company’s latest WSE-3 (third-generation Wafer Scale Engine) chip, for example, measures 46,225 square millimeters, roughly the size of a dinner plate, making it 56 times larger than Nvidia’s market-leading H100 chip (814 square millimeters). The chip incorporates a staggering 4 trillion transistors and offers 44 gigabytes of on-chip memory. These chips can be clustered to form supercomputers, such as Condor Galaxy, a ‘constellation’ of interconnected supercomputers Cerebras is developing in collaboration with its largest customer, G42, a UAE-based AI and cloud computing company.
To date, Cerebras has found a niche in large research organizations, including Mayo Clinic, Sandia National Laboratories, Lawrence Livermore National Laboratory, and Los Alamos National Laboratory. The company filed for an IPO in September 2024. The prospectus indicates that the company’s sales more than tripled to $78.7 million in 2023 and surged to $136.4 million in the first half of 2024. This rapid growth and focus on large-scale computing demonstrate Cerebras’ potential to play a significant role in powering the next generation of AI systems.