AMD’s Ryzen AI Benchmarks: A Closer Look
AMD recently showcased the AI performance of its Ryzen AI Max+ 395 chipset, found in devices like the Asus ROG Flow Z13 (2025). The benchmarks presented pitted the Ryzen chip against Intel’s Core Ultra 7 258V (Lunar Lake), featured in the Asus Zenbook S14 (UX5406). As one might expect, Intel’s mid-tier Lunar Lake processor didn’t match the raw power of the Ryzen AI Max Strix Halo APU, especially in GPU-accelerated AI workloads. However, these initial comparisons focused primarily on the AMD-versus-Intel rivalry, neglecting a crucial competitor: Apple. This analysis provides a broader perspective by including Apple’s silicon in the comparison.
AMD’s benchmarking methodology departs from typical industry-standard tests. Instead of relying on established benchmarks, AMD reports a “tokens per second” metric, which quantifies how quickly Lunar Lake and Strix Halo run various large language models (LLMs) and small language models (SLMs), including DeepSeek and Microsoft’s Phi-4.
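To make clear what that metric actually measures, here is a minimal sketch of how a tokens-per-second figure is derived. The decode loop is a stand-in (a real measurement would time an actual inference runtime), and every number in it is a placeholder rather than anything AMD reported:

```python
import time

def generate_tokens(prompt, n_tokens=64, seconds_per_token=0.02):
    """Stand-in for an LLM decode loop; a real test would call an
    inference runtime instead of sleeping."""
    for _ in range(n_tokens):
        time.sleep(seconds_per_token)  # simulate per-token decode cost
        yield "tok"

start = time.perf_counter()
tokens = list(generate_tokens("Hello"))
elapsed = time.perf_counter() - start

tokens_per_second = len(tokens) / elapsed
print(f"{tokens_per_second:.1f} tokens/s")
```

The metric is simply tokens emitted divided by wall-clock time, which is why it is easy to report but hard to verify without knowing the model, the prompt, and the runtime settings behind it.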
Predictably, the robust GPU within the Ryzen AI Max+ 395 significantly outperformed the smaller Intel Arc 140V integrated graphics in Lunar Lake. This outcome is unsurprising, given that Intel’s Lunar Lake chips are designed for ultra-portable AI PC laptops, operating at a much lower power envelope than the Ryzen AI Max+. Expecting comparable GPU performance from an ultra-thin notebook versus a gaming-oriented machine like the Flow Z13 is unrealistic.
A Question of Fair Comparison
While both the AMD Ryzen AI Max+ 395 and the Intel Core Ultra 200V series are x86 CPUs capable of handling AI workloads, comparing the Zenbook S14 and the ROG Flow Z13 is somewhat flawed. They are fundamentally different devices, built with different hardware, and designed for entirely distinct use cases. It’s akin to comparing the gaming performance of a handheld console to a high-end gaming desktop.
It’s also important to remember that AMD already has direct competitors to Lunar Lake in its Strix Point and Krackan Point Ryzen AI 300 series. These processors are more appropriately positioned to compete with Intel’s offerings in the ultraportable space.
Validating AMD’s Claims and Introducing Apple
Because AMD’s performance benchmarks lack standardized tests and concrete scores, we cross-referenced its findings against our own lab benchmarks.
AMD’s assertion of the “Most Powerful x86 processor for LLMs” appears to hold true for the Strix Halo. However, it’s crucial to understand that Strix Halo represents a departure from traditional mobile CPU design. It shares more architectural similarities with Apple’s Arm-based M4 Max or M3 Ultra. This sets up an x86 versus Arm comparison, where Apple’s high-end chipsets fall into a similar CPU class as the Ryzen AI Max+, a category where Lunar Lake is not a direct competitor.
While we don’t currently have benchmark data for the M4 Max or M3 Ultra, we do have test results from the “most powerful Apple laptop we’ve ever tested,” the MacBook Pro 16 equipped with the M4 Pro chipset.
A More Appropriate Comparison: HP ZBook 14 Ultra vs. MacBook Pro 16
Ideally, the Ryzen AI Max APU’s other launch system, the HP ZBook 14 Ultra, would have made a more direct chip-to-chip and product-to-product contender against the MacBook Pro. Apple’s premium laptops have long been a benchmark for creative professionals, which makes the workstation-class ZBook 14 Ultra a compelling test subject against the MacBook Pro 16.
Unfortunately, we haven’t yet had the opportunity to test the ZBook 14 Ultra G1a. Therefore, we used the Flow Z13 for this comparison, acknowledging its gaming-focused design.
Verifying AMD’s Claims with the Asus Zenbook S14
We included the Intel Core Ultra 7 258V-powered Asus Zenbook S14 in the comparison to validate AMD’s claims. As anticipated, the Zenbook S14 occupied the lower end of the performance spectrum compared to the Apple and AMD powerhouses, confirming AMD’s general positioning.
Geekbench AI Benchmark: A Cross-Platform View
While the Ryzen AI Max+ 395 in the ROG Flow Z13 demonstrates a clear advantage in gaming performance, the M4 Pro offers surprisingly strong competition in GPU-intensive AI tasks, as evidenced by the Geekbench AI benchmark.
Although the Geekbench AI benchmark has limitations in comprehensively measuring AI performance, it serves as a cross-platform benchmark designed for comparing CPUs and GPUs. This contrasts with AMD’s reported “Tokens per second” benchmarks, which are more challenging to replicate and verify independently. Geekbench AI provides a standardized, albeit imperfect, point of comparison.
The Ryzen AI Max+ 395: A Powerful Contender
The strong showing of the Apple MacBook Pro 16 against the Flow Z13 in our benchmarks doesn’t diminish the fact that the Ryzen AI Max+ 395 is an exceptionally powerful chipset. It’s a high-performance, versatile chip that has demonstrated impressive results in both creative and gaming workloads. It represents a novel approach to x86 processor design, pushing the boundaries of what’s possible in a mobile form factor.
Its performance in the ROG Flow Z13 is impressive, and we look forward to testing the PRO version in the HP ZBook 14 Ultra. We also hope to see AMD integrate the Ryzen AI Max into a wider range of systems, providing more opportunities for diverse benchmark comparisons and real-world usage scenarios.
The Need for Stronger Competition
The emergence of powerful processors like the Ryzen AI Max+ 395 underscores the ongoing need for robust competition in the high-end chipset market. Apple Silicon, while impressive, benefits from strong rivals that push the boundaries of performance and innovation. The comparisons, while complex due to differing architectures and design philosophies, show that the landscape is shifting. The traditional x86 architecture is evolving to meet the demands of AI-driven workloads, and Arm-based designs are proving their capabilities in high-performance computing.
Deeper Dive: “Tokens per Second” and its Limitations
AMD’s reliance on “tokens per second” as a primary performance metric warrants further discussion. While it provides a measure of processing speed for language models, it doesn’t fully capture the complexities of AI performance. Several other factors are equally crucial:
- Model Accuracy: A high “tokens per second” rate is meaningless if the model’s output is inaccurate or unreliable. Accuracy is paramount in real-world AI applications.
- Latency: The time it takes for a model to respond to a query (latency) is critical for user experience. A fast token generation rate doesn’t guarantee low latency.
- Power Efficiency: Especially in mobile devices, power consumption is a major concern. A chip that generates tokens quickly but drains the battery rapidly is not ideal.
- Model Diversity: AMD’s testing focused on specific LLMs and SLMs (DeepSeek and Phi 4). Performance on these models might not be representative of performance on other popular models used in various AI tasks. A more comprehensive evaluation would involve a broader range of models and benchmarks.
- Quantization: The precision used for calculations (e.g., FP32, FP16, INT8) significantly impacts both performance and accuracy. Different chips may handle different quantization levels with varying efficiency.
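The quantization point above is easy to make concrete, because precision directly sets the memory a model’s weights occupy. The sketch below estimates weight-memory footprints at different precisions; the 7-billion-parameter count is a hypothetical mid-size model, not one of the models AMD tested, and real deployments add activation and KV-cache memory on top:

```python
# Back-of-the-envelope weight-memory footprint at different precisions.
# The 7B parameter count is illustrative, not a claim about any tested model.
BYTES_PER_WEIGHT = {"FP32": 4.0, "FP16": 2.0, "INT8": 1.0, "INT4": 0.5}
params = 7_000_000_000

footprint_gib = {
    fmt: params * nbytes / 2**30 for fmt, nbytes in BYTES_PER_WEIGHT.items()
}

for fmt, gib in footprint_gib.items():
    print(f"{fmt}: ~{gib:.1f} GiB of weights")
```

Halving the precision halves the footprint (and the memory bandwidth needed per token), which is why two chips can post very different tokens-per-second numbers on the “same” model if they run it at different quantization levels.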
Therefore, while “tokens per second” is a useful metric, it should be considered alongside other performance indicators to provide a complete picture of AI capabilities.
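The distinction between throughput and latency can also be sketched in a few lines. The timings below are simulated placeholders, not measurements from any of these chips; the point is that a healthy tokens-per-second figure says nothing about how long the first token takes to arrive:

```python
import time

def run_model(prefill_s, n_tokens, s_per_token):
    """Simulated LLM call: a fixed 'prefill' delay before the first token,
    then a steady per-token decode cost. All values are illustrative."""
    t0 = time.perf_counter()
    time.sleep(prefill_s)            # prompt processing before token 1
    ttft = time.perf_counter() - t0  # time to first token
    for _ in range(n_tokens):
        time.sleep(s_per_token)      # steady-state decode
    total = time.perf_counter() - t0
    return ttft, total

n = 40
ttft, total = run_model(prefill_s=0.4, n_tokens=n, s_per_token=0.005)
throughput = n / (total - ttft)      # decode-phase tokens per second
print(f"TTFT: {ttft*1000:.0f} ms, decode throughput: {throughput:.0f} tok/s")
```

Here the decode throughput is high even though the user waits hundreds of milliseconds before anything appears, which is exactly why a single tokens-per-second number can flatter a chip that feels sluggish in interactive use.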
The Significance of Integrated Graphics
The performance disparity between the Ryzen AI Max+ 395 and the Intel Core Ultra 7 258V is largely due to the difference in integrated graphics capabilities. The Ryzen chip boasts a significantly more powerful GPU, which is particularly advantageous for AI workloads that can leverage GPU acceleration. Many AI tasks, especially those involving neural networks, are highly parallelizable and benefit greatly from the massively parallel processing capabilities of GPUs.
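The parallelism argument can be made concrete with a toy dense layer. Each output element of a matrix-vector product depends only on one weight row and the shared input, so all of them can be computed independently; the thread pool below only makes that independence explicit (it is not a claim of real speedup in Python), while a GPU exploits the same structure across thousands of hardware lanes:

```python
from concurrent.futures import ThreadPoolExecutor

def dot(row, x):
    """One output neuron: the dot product of a weight row with the input."""
    return sum(a * b for a, b in zip(row, x))

# A single dense layer, y = W @ x. Each output depends only on one row of
# W and the shared input x, so the rows can be evaluated concurrently.
W = [[1, 2], [3, 4], [5, 6]]
x = [10, 100]

with ThreadPoolExecutor() as pool:
    y = list(pool.map(lambda row: dot(row, x), W))

print(y)  # [210, 430, 650]
```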
However, it’s important to acknowledge that integrated graphics, even in high-end chips like the Ryzen AI Max+, still have limitations compared to discrete GPUs. For the most demanding AI tasks, such as training large language models, a dedicated graphics card remains the preferred solution. The comparison highlights the growing importance of integrated graphics for AI processing, particularly in mobile and power-constrained devices, but it shouldn’t be interpreted as a complete replacement for discrete GPUs in all scenarios.
The x86 vs. Arm Architecture Debate
The comparison between the Ryzen AI Max+ (x86) and the Apple M4 Pro (Arm) touches upon the ongoing debate surrounding these two processor architectures. Historically, x86 has dominated the PC market, while Arm has been prevalent in mobile devices. However, the lines are becoming increasingly blurred.
- x86: Traditionally associated with high performance and broad software compatibility. x86 processors have a long history and a vast ecosystem of software developed for them.
- Arm: Often praised for its power efficiency, making it ideal for battery-powered devices. Arm’s architecture is designed with power consumption in mind.
The Ryzen AI Max+ demonstrates that x86 can be adapted for power-efficient designs, challenging the notion that x86 is inherently less power-efficient than Arm. Conversely, Apple’s M-series chips have proven that Arm can deliver impressive performance, rivaling and even surpassing x86 chips in certain workloads.
The choice between x86 and Arm is becoming less about inherent architectural advantages and more about specific implementations and optimizations. Both architectures are evolving, and the competition between them is driving innovation.
The Crucial Role of Software Optimization
Hardware capabilities are only one piece of the puzzle. Software optimization plays a critical role in maximizing AI performance. Both AMD and Apple invest heavily in software ecosystems tailored to their respective hardware platforms.
- AMD ROCm: AMD’s ROCm (Radeon Open Compute platform) provides a suite of tools and libraries for developing and deploying AI applications on AMD GPUs. It includes optimized compilers, libraries, and runtimes for accelerating AI workloads.
- Apple Core ML: Apple’s Core ML framework offers similar capabilities for Apple silicon. It allows developers to integrate machine learning models into their apps and optimize them for Apple’s hardware.
The effectiveness of these software stacks can significantly impact real-world AI performance. A less powerful chip could potentially outperform a more powerful one if it benefits from superior software optimization and a more mature software ecosystem. A comprehensive comparison should consider not only raw hardware specifications but also the level of software support and optimization available for each platform.
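As a rough mental model of what such a stack does, the toy dispatcher below routes work to the best available compute unit. The backend names and availability flags here are hypothetical; real frameworks, such as PyTorch on ROCm or Core ML on Apple silicon, make the same kind of decision using actual device probes:

```python
def pick_backend(available):
    """Route work to the preferred compute unit: dedicated accelerator
    first, then GPU, then CPU fallback. Backend names are hypothetical."""
    for name in ("npu", "gpu", "cpu"):
        if available.get(name):
            return name
    raise RuntimeError("no compute backend available")

print(pick_backend({"npu": False, "gpu": True, "cpu": True}))  # gpu
print(pick_backend({"npu": True, "gpu": True, "cpu": True}))   # npu
```

How well a stack makes this choice, and how well its kernels are tuned for whichever unit it picks, is a large part of why identical hardware specs can yield very different real-world AI results.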
Future Trends and Predictions
The rapid advancements in AI are driving continuous innovation in processor design. Several key trends are likely to shape the future of AI processing:
- Specialized AI Accelerators: We can expect to see even more specialized AI accelerators integrated into future chips. These dedicated hardware units will be designed to handle specific AI tasks, such as neural network inference, with maximum efficiency.
- Heterogeneous Computing: The integration of CPUs, GPUs, and AI accelerators into a single chip (heterogeneous computing) will become increasingly common. This allows for optimal performance and power efficiency by assigning different tasks to the most suitable processing unit.
- Memory Innovations: Memory bandwidth and capacity are crucial for AI performance. New memory technologies, such as High Bandwidth Memory (HBM) and on-package memory, will continue to evolve to meet the growing demands of AI workloads.
- Software-Hardware Co-design: The tight integration of hardware and software will become even more critical. Chip designers and software developers will work closely together to optimize performance and efficiency.
- Edge AI: The trend towards processing AI workloads at the edge (on devices rather than in the cloud) will drive the development of power-efficient AI chips for mobile devices, IoT devices, and other edge applications.
Competition among AMD, Intel, Apple, and other players in the AI chip market will only intensify, leading to faster, more power-efficient, and more AI-capable processors. That competition will ultimately benefit consumers and drive the adoption of AI across a wider range of applications, and the ongoing development of new benchmarks and testing methodologies will be essential for accurately evaluating these increasingly complex systems. Continued improvements in neural processing and dedicated AI hardware promise to make AI more pervasive and more deeply integrated into our daily lives.