AI Capacity Hunger Drives Spending Despite Efficiency

The Initial Tremor: DeepSeek and the Efficiency Mirage

The emergence of China’s DeepSeek AI earlier this year sent shockwaves through the tech investment landscape. Its seemingly groundbreaking approach, promising powerful artificial intelligence with significantly lower computational overhead, sparked immediate speculation. A narrative quickly formed: perhaps the relentless, costly expansion of AI infrastructure, characterized by massive purchases of specialized chips and systems, was about to decelerate. The market reacted, reflecting a belief that a new era of cost-effective AI might dramatically curtail the anticipated spending boom.

However, insights from a recent high-level gathering of industry minds paint a starkly different picture. A generative AI conference convened in New York by Bloomberg Intelligence suggested that the initial interpretation, focused solely on potential cost savings, missed the bigger story. Far from signaling a spending slowdown, the event underscored an almost insatiable hunger for greater AI capacity. The consensus wasn’t about cutting back; it was about figuring out how to feed an exponentially growing appetite for intelligent systems, even while desperately wishing the menu were less expensive.

Voices from the Trenches: An Unquenchable Thirst for Capacity

The discussions throughout the daylong event, which brought together developers, strategists, and investors, consistently circled back to the theme of escalating demand driving monumental investment. Mandeep Singh, a senior technology analyst with Bloomberg Intelligence and one of the event’s organizers, captured the prevailing sentiment succinctly. Reflecting on the numerous panels and expert discussions, he noted a universal refrain: no one involved felt they possessed sufficient AI capacity. The overwhelming feeling was one of needing more, not having too much.

Crucially, Singh added, the spectre of an “infrastructure bubble,” a common fear in rapidly expanding tech sectors, was notably absent from the conversation. The focus remained squarely on the foundational challenge facing the entire industry. Anurag Rana, Singh’s colleague and Bloomberg Intelligence’s senior analyst for IT services and software, framed it as the paramount question: “Where are we in that [AI infrastructure build] cycle?”

While acknowledging that pinpointing the exact stage of this massive build-out remains elusive (“Nobody knows” for certain, Rana admitted), the DeepSeek phenomenon undeniably shifted perspectives. It injected a potent dose of hope that significant AI workloads could potentially be handled more economically. “DeepSeek shook a lot of people,” Rana observed. The implication was clear: if sophisticated AI models could indeed run efficiently on less demanding hardware, perhaps gargantuan projects, like the multi-hundred-billion-dollar initiatives rumored to be planned by consortia involving major tech players, might be re-evaluated or scaled differently.

The dream, echoed across the industry according to Rana, is for AI operational costs, particularly for inference (the stage where trained models generate predictions or content), to follow the dramatic downward trajectory witnessed in cloud computing storage over the past decade. He recalled how the economics of storing vast amounts of data on platforms like Amazon Web Services (AWS) improved dramatically over roughly eight years. “That drop in the cost curve… the economics were good,” he stated. “And that’s what everybody’s hoping, that on the inference side… if the curve falls to that level, oh my god, the adoption rate on AI… is going to be spectacular.” Singh concurred, noting that DeepSeek’s arrival has fundamentally “changed everyone’s mindset about achieving efficiency.”

This yearning for efficiency was palpable throughout the conference sessions. While numerous panels delved into the practicalities of moving enterprise AI projects from conceptual stages into live production, a parallel discussion constantly emphasized the critical need to slash the costs associated with deploying and running these AI models. The goal is clear: democratize access by making AI economically viable for a broader range of applications and users. Shawn Edwards, Bloomberg’s own chief technologist, suggested that DeepSeek wasn’t necessarily a complete surprise, but rather a powerful illustration of a universal desire. “What it made me think is that it would be great if you could wave a wand and have these models run incredibly efficiently,” he remarked, extending the wish to the entire spectrum of AI models, not just one specific breakthrough.

The Proliferation Principle: Fueling the Compute Demand

One of the primary reasons experts anticipate continued, substantial investment in AI infrastructure, despite the quest for efficiency, lies in the sheer proliferation of AI models. A recurring theme throughout the New York conference was the decisive move away from the notion of a single, monolithic AI model capable of handling all tasks.

  • A Family Affair: As Bloomberg’s Edwards put it, “We use a family of models. There is no such thing as a best model.” This reflects a growing understanding that different AI architectures excel at different tasks – language generation, data analysis, image recognition, code completion, and so on.
  • Enterprise Customization: Panelists widely agreed that while large, general-purpose “foundation” or “frontier” models will continue to be developed and refined by major AI labs, the real action within businesses involves deploying potentially hundreds or even thousands of specialized AI models.
  • Fine-Tuning and Proprietary Data: Many of these enterprise models will be adapted from base models through a process called fine-tuning. This involves retraining a pre-trained neural network on a company’s specific, often proprietary, data. This allows the AI to understand unique business contexts, terminology, and customer interactions, delivering far more relevant and valuable results than a generic model could (a brief code sketch of this step follows the list).
  • Democratizing Development: Jed Dougherty, representing the data science platform Dataiku, highlighted the need for “optionality among the models” for enterprise AI agents. He stressed the importance of giving companies control, creation capabilities, and auditability over their AI tools. “We want to put the tools to build these things in the hands of people,” Dougherty asserted. “We don’t want ten PhDs building all the agents.” This drive towards broader accessibility in development itself implies a need for more underlying infrastructure to support these distributed creation efforts.
  • Brand-Specific AI: The creative industries offer a prime example. Hannah Elsakr, leading new business ventures at Adobe, explained their strategy betting on custom models as a key differentiator. “We can train custom model extensions for your brand that can be a help for a new ad campaign,” she illustrated, showcasing how AI can be tailored to maintain specific brand aesthetics and messaging.
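
To make the fine-tuning bullet above more concrete, the sketch below shows roughly what adapting a base model to in-house text looks like with the open-source Hugging Face Transformers library. The small base model, the toy two-document “proprietary” corpus, and the hyperparameters are illustrative assumptions chosen for the sketch, not details discussed at the conference.

```python
# Minimal fine-tuning sketch using Hugging Face Transformers (illustrative only).
# The base model, corpus, and hyperparameters are assumptions for the sketch;
# real enterprise pipelines add evaluation, parameter-efficient methods, and
# data governance on top of this core loop.
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "distilgpt2"  # small open model, used purely for illustration
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

# Stand-in for a company's proprietary text (support tickets, product docs, ...).
corpus = Dataset.from_dict({"text": ["Example internal product document ...",
                                     "Another proprietary customer record ..."]})
tokenized = corpus.map(
    lambda row: tokenizer(row["text"], truncation=True, max_length=512),
    remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="ft-out", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()               # adapt the pre-trained weights to in-house data
trainer.save_model("ft-out")  # the specialized enterprise model
```

However the details vary, the core idea is the same: a modest amount of company-specific data steering an already-trained network, which is why enterprises expect to run many such specialized models rather than one general one.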

Beyond the diversification of models, the increasing deployment of AI agents within corporate workflows is another significant driver of processing demand. These agents are envisioned not just as passive tools but as active participants capable of executing multi-step tasks.

Ray Smith, heading Microsoft’s Copilot Studio agents and automation efforts, predicted a future where users interact with potentially hundreds of specialized agents through a unified interface like Copilot. “You won’t cram a whole process into one agent, you’ll break it up into parts,” he explained. These agents, he suggested, are essentially “apps in the new world” of programming. The vision is one where users simply state their goal – “tell it what we want to accomplish” – and the agent orchestrates the necessary steps. “Agentic apps are just a new way of workflow,” Smith stated, emphasizing that realizing this vision is less a matter of technological possibility (“it’s all technologically possible”) and more about the “pace at which we build it out.”
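
Smith’s “break it up into parts” framing can be pictured as a thin coordinator routing each step of a goal to a specialized agent. The toy Python below is a conceptual sketch of that pattern only, with entirely hypothetical agent names; production systems such as Copilot Studio layer on planning, memory, tool calling, and error handling.

```python
# Conceptual sketch: a coordinator hands each step of a goal to a small,
# specialized agent. Agent names and the workflow are hypothetical.
from typing import Callable, Dict, List

AGENTS: Dict[str, Callable[[str], str]] = {
    "research": lambda task: f"[research agent] gathered notes for: {task}",
    "draft":    lambda task: f"[draft agent] wrote a first pass for: {task}",
    "review":   lambda task: f"[review agent] checked and approved: {task}",
}

def run_workflow(goal: str, plan: List[str]) -> List[str]:
    """Route each step in the plan to the agent registered for it."""
    return [AGENTS[step](goal) for step in plan]

if __name__ == "__main__":
    for line in run_workflow("prepare the Q3 customer newsletter",
                             ["research", "draft", "review"]):
        print(line)
```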

This push to embed AI agents more deeply into everyday organizational processes further intensifies the pressure for cost reduction and efficient deployment. James McNiven, head of product management at microprocessor giant ARM Holdings, framed the challenge in terms of accessibility. “How do we provide access on more and more devices?” he pondered. Observing models achieving near “PhD-level” capabilities in specific tasks, he drew a parallel to the transformative impact of bringing mobile payment systems to developing nations years ago. The core question remains: “How do we get that [AI capability] to people who can use that ability?” Making sophisticated AI agents readily available as assistants to a broad swathe of the workforce necessitates not only clever software but also efficient hardware and, inevitably, more underlying infrastructure investment, even as efficiency per computation improves.

Scaling Hurdles: Silicon, Power, and the Cloud Behemoths

Even the most widely used, generic foundation models are multiplying at a staggering pace, placing immense strain on existing infrastructure. Dave Brown, who oversees computing and networking for Amazon Web Services (AWS), revealed that their platform alone offers customers access to around 1,800 different AI models. He underscored AWS’s intense focus on “doing a lot to bring down the cost” of running these powerful tools.

A key strategy for cloud providers like AWS involves developing their own custom silicon. Brown highlighted the increasing use of AWS-designed chips, such as their Trainium processors optimized for AI training, stating, “AWS is using more of our own processors than other companies’ processors.” This move towards specialized, in-house hardware aims to wrest greater control over performance and cost, reducing reliance on general-purpose chip suppliers like Nvidia, AMD, and Intel. Despite these efforts, Brown candidly acknowledged the fundamental reality: “Customers would do more if the cost was lower.” The demand ceiling is currently defined more by budget constraints than by a lack of potential applications.

The scale of resources required by leading AI developers is immense. Brown noted AWS’s daily collaboration with Anthropic, the creators of the sophisticated Claude family of language models. Michael Gerstenhaber, Anthropic’s head of application programming interfaces, speaking alongside Brown, pointed out the computational intensity of modern AI, particularly models designed for complex reasoning or “thinking.” These models often generate detailed step-by-step explanations for their answers, consuming significant processing power. “Thinking models cause a lot of capacity to be used,” Gerstenhaber stated.

While Anthropic actively works with AWS on optimization techniques like “prompt caching” (storing and reusing computations from previous interactions to save resources), the fundamental hardware requirement remains enormous. Gerstenhaber bluntly stated that Anthropic needs “hundreds of thousands of accelerators” – specialized AI chips – distributed “across many data centers” simply to run its current suite of models. This provides a concrete sense of the sheer scale of compute resources underpinning just one major AI player.
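
Prompt caching itself is conceptually simple: pay the cost of processing a long, shared prompt prefix once, then reuse that work for later requests that share it. The sketch below illustrates the general idea only; it is not Anthropic’s or AWS’s actual implementation, and the encode_prefix function is a stand-in for the expensive model-side computation.

```python
# Conceptual sketch of prompt caching: reuse the expensive work done on a
# shared prompt prefix (e.g. a long system prompt) across requests, so only
# the new suffix needs fresh processing. Illustration only, not a real API.
import hashlib
from typing import Any, Dict

_prefix_cache: Dict[str, Any] = {}

def encode_prefix(prefix: str) -> Any:
    """Stand-in for the costly step (e.g. precomputing attention state)."""
    return {"tokens": prefix.split(), "note": "expensive precomputation"}

def run_with_cache(prefix: str, suffix: str) -> str:
    key = hashlib.sha256(prefix.encode()).hexdigest()
    if key not in _prefix_cache:          # first request pays the full cost
        _prefix_cache[key] = encode_prefix(prefix)
    state = _prefix_cache[key]            # later requests reuse the stored state
    return f"answer built from {len(state['tokens'])} cached prefix tokens + '{suffix}'"

print(run_with_cache("You are a helpful billing assistant ...", "Why was I charged twice?"))
print(run_with_cache("You are a helpful billing assistant ...", "How do I request a refund?"))
```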

Compounding the challenge of procuring and managing vast fleets of silicon is the spiraling energy consumption associated with AI. Brown highlighted this as a critical, and rapidly escalating, concern. Current data centers supporting intensive AI workloads are already consuming power measured in hundreds of megawatts. Projections suggest future requirements will inevitably climb into the gigawatt range – the output of large power plants. “The power it consumes,” Brown warned, referring to AI, “is large, and the footprint is large in many data centers.” This escalating energy demand brings not only immense operational costs but also significant environmental and logistical challenges for siting and powering the next generation of AI infrastructure.
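
A rough back-of-envelope calculation shows why that jump matters. The figures below are illustrative assumptions, not numbers cited at the event; the sketch simply converts sustained power draw into annual energy.

```python
# Back-of-envelope scale check for the megawatt-to-gigawatt jump Brown describes.
# The specific wattage figures are illustrative assumptions, not conference data.
HOURS_PER_YEAR = 8760

for label, megawatts in [("large AI data center today (assumed)", 300),
                         ("gigawatt-class future site (assumed)", 1000)]:
    gwh_per_year = megawatts * HOURS_PER_YEAR / 1000   # MW * hours -> MWh -> GWh
    print(f"{label}: {megawatts} MW is roughly {gwh_per_year:,.0f} GWh per year")
```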

The Economic Wildcard: A Shadow Over Growth Plans

Despite the bullish outlook driven by technological advancements and burgeoning use cases, a significant variable looms over all projections for AI investment: the broader economic climate. As the Bloomberg Intelligence conference concluded, attendees were already observing market jitters stemming from newly announced global tariff packages, perceived as more extensive than anticipated.

This serves as a potent reminder that ambitious technological roadmaps can be swiftly disrupted by macroeconomic headwinds. Bloomberg’s Rana cautioned that while AI spending might be somewhat insulated initially, traditional areas of corporate IT investment, such as servers and storage unrelated to AI, could be the first casualties in an economic contraction. “The other big thing we are focused on is the non-AI tech spending,” he noted, expressing concern about the potential impact on major tech service providers heading into earnings season, even before considering AI budgets specifically.

There’s a prevailing theory, however, that AI might prove uniquely resilient. Rana suggested that Chief Financial Officers (CFOs) at major corporations, facing budget constraints due to economic uncertainty or even a recession, might choose to prioritize AI initiatives. They could potentially shift funds away from less critical areas to protect strategic AI investments perceived as crucial for future competitiveness.

Yet, this optimistic view is far from guaranteed. The ultimate test, according to Rana, will be whether large corporations maintain their aggressive capital expenditure targets, particularly for building out AI data centers, in the face of mounting economic uncertainty. The critical question remains: “Are they going to say, ‘You know what? It’s too uncertain.’” The answer will determine whether the seemingly unstoppable momentum behind AI infrastructure spending continues its relentless climb or faces an unexpected pause dictated by global economic realities.