
Nvidia’s New Chips Slash AI Training Times, Benchmark Data Finds

Nvidia’s latest chips have delivered significant gains in training large-scale artificial intelligence systems, with data released on Wednesday showing a sharp reduction in the number of chips needed to train massive language models.

MLCommons, a nonprofit organization that tracks and publishes AI benchmark results, shared new data on training performance for chips made by Nvidia (NASDAQ: NVDA), AMD (NASDAQ: AMD), and others. Training refers to the process where AI systems learn by processing vast datasets. Although investor focus has recently shifted to AI inference—where trained systems respond to queries—training efficiency remains a critical area of competition. For example, Chinese firm DeepSeek claims it can develop a capable chatbot using far fewer chips than American competitors.

This is the first time MLCommons has released benchmark results for training an AI model as large as Llama 3.1 405B, an open-source model developed by Meta Platforms (NASDAQ: META). With 405 billion parameters, the model is a demanding test of a chip’s training capability and a closer proxy for the largest training workloads, which can involve trillions of parameters.

Only Nvidia and its partners submitted results for training such a large model, and the findings revealed that Nvidia’s latest Blackwell chips are over twice as fast per chip compared to the older Hopper generation.

In the top-performing test, 2,496 Blackwell chips completed the training in 27 minutes; by comparison, it took more than three times as many previous-generation chips to achieve a faster time, according to the report.

Chetan Kapoor, Chief Product Officer at CoreWeave, which partnered with Nvidia for the benchmarks, said in a press briefing that the AI industry is increasingly moving toward using smaller, modular chip groups for distinct training tasks, rather than massive clusters of 100,000 or more identical chips.

He explained, “This approach allows them to keep reducing training time for these extremely complex models with multi-trillion parameters.”