How Nvidia dominated AI — and plans to keep it that way as generative AI explodes

At next month’s GTC, Nvidia’s annual AI conference targeting over 3.5 million developers working on its platform, founder and CEO Jensen Huang will invite Ilya Sutskever, cofounder and chief scientist at OpenAI, onto the stage for a fireside chat.

The conversation will certainly send a symbolic message that Nvidia has no intention of ceding its AI dominance — which began when the hardware and software company helped power the deep learning “revolution” of a decade ago. And Nvidia shows few signs of losing its lead as generative AI explodes with tools like ChatGPT.

After all, Nvidia supplied the technology of the early pioneers. Sutskever, along with Alex Krizhevsky and their Ph.D. supervisor Geoffrey Hinton, created AlexNet, the pioneering neural network for computer vision that won the ImageNet competition in October 2012. The winning paper, which showed that the model achieved image-recognition accuracy never before seen, directly led to the next decade’s major AI success stories, everything from Google Photos, Google Translate and Uber to Alexa and AlphaFold.

According to Hinton, AlexNet would not have happened without Nvidia. Thanks to their parallel processing capabilities supported by thousands of computing cores, Nvidia’s GPUs (which were created in 1999 for ultrafast 3D graphics in PC video games, but had begun to be optimized for general computing operations) turned out to be perfect for running deep learning algorithms.

“In 2009, I remember giving a talk at NIPS [now NeurIPS] where I told about 1,000 researchers they should all buy GPUs because GPUs are going to be the future of machine learning,” Hinton told VentureBeat last fall.

However, Sutskever also represents the 2023 AI hype explosion, as generative models like ChatGPT and DALL-E 2 have launched generative AI into the public’s consciousness in a way not seen, perhaps, since the dawn of the iPhone in 2007.

This, too, would not have been possible without Nvidia. Today’s massive generative AI models require thousands of GPUs to run, and Nvidia holds about 88% of the GPU market, according to Jon Peddie Research. In fact, OpenAI reportedly used 10,000 Nvidia GPUs to train ChatGPT.

Many see the 30-year-old Nvidia as the biggest potential winner in the red-hot generative AI space. In fact, the generative AI frenzy sent Nvidia’s share price soaring in January.

Citi, for example, estimated that ChatGPT usage could result in $3 billion to $11 billion in sales for Nvidia over 12 months. And last month, Altimeter’s Brad Gerstner told CNBC that Nvidia is “the beating heart of the AI supercycle.”

On a fourth-quarter earnings call with analysts yesterday, on which the company reported increased revenue in its data center business, including AI chips, Nvidia’s Huang agreed.

AI is at an “inflection point,” he said, which is leading to more businesses buying more Nvidia chips to develop ML software.

“Generative AI’s versatility and capability has triggered a sense of urgency at enterprises around the world to develop and deploy AI strategies,” Huang said.

How long can Nvidia’s AI dominance last?

The question is, how long can Nvidia sustain its AI dominance? Will anyone catch up and topple it off its AI perch? Not anytime soon, say experts.

These days, Nvidia is synonymous with AI, says Gartner analyst Chirag Dekate.

“It is not just a GPU computing company, it’s basically an AI supercomputing company,” he explained. “Nvidia has had a complete freedom of the AI landscape, they have taken advantage of really sharp, business-savvy investments and focused approaches that basically enable them to dominate their market.”

Nvidia won’t have free rein forever. Chip competitors like AMD and Google are nipping at its heels, for one thing, while geopolitical forces hover ominously. (With the United States’ latest chip export control, state-of-the-art GPUs like Nvidia’s A100 and H100 can no longer be sold to China, for example.)

But Nvidia’s famed platform strategy and software-focused approach are still very, very hard to beat, experts say.

“While other players offer chips and/or systems, Nvidia has built a strong ecosystem that includes the chips, associated hardware and a full stable of software and development systems that are optimized for their chips and systems,” analyst Jack Gold wrote for VentureBeat last September.

Nathan Benaich, founder and general partner of Air Street Capital, pointed out that Nvidia has also been “very nimble” with integrating new capabilities into its system. Other AI chip startups have under-invested in software tooling, so while they have created cloud computing platforms that may be faster or cheaper than Nvidia’s, they “don’t come with a commensurate improvement in the current programming experience.”

Ultimately, he told VentureBeat, the AI game “is Nvidia’s to lose.”

And Nvidia clearly has no intention of losing.

“We know we have the best combined hardware and software platform for being the most efficient at generative AI,” Manuvir Das, VP of enterprise computing at Nvidia, told VentureBeat. But, he added, “we constantly operate with the motto that we have no advantage, and nobody is going to outwork us or out-innovate.”

GPU + CUDA changed the game for AI

Jensen Huang always knew his graphics chips had more potential than just powering the latest video games, but he didn’t anticipate the shift to deep learning, according to a 2016 Forbes interview.

In fact, the success of Nvidia’s GPUs for deep neural networks was “a bizarre, lucky coincidence,” said Sara Hooker, whose 2020 essay “The Hardware Lottery” explored the reasons why various hardware tools succeeded and failed.

Nvidia’s success was like “winning the lottery,” she told VentureBeat last year. Much of it depended upon the “right moment of alignment between progress on the hardware side and progress on the modeling side.” The change, she added, was almost instantaneous. “Overnight, what took 13,000 CPUs overnight took two GPUs,” she said. “That was how dramatic it was.”

Nvidia doesn’t agree with that assessment, however. The company maintains that it was aware of the potential for GPUs to accelerate neural networks from the mid-2000s, even if it didn’t know that AI was going to be the most important market.

“We did know that the world’s most important problems needed accelerated computing,” said an Nvidia spokesperson. “So we invested heavily in building CUDA [compute unified device architecture] from top to bottom, putting general purpose acceleration into the hands of millions of developers. Adding CUDA to every GPU Nvidia makes was a huge bet.”

The CUDA compute platform, which Nvidia introduced in 2007, is the software and middleware stack that allows researchers to program GPUs and access their compute power and extreme parallelism. And the deep learning revolution wouldn’t have happened, experts both inside and outside of Nvidia emphasize, if Nvidia had not added CUDA to the mix.

Before Nvidia released CUDA, programming a GPU was a long and arduous coding process that required writing a great deal of low-level machine code. Using CUDA — which was free — researchers could develop their deep learning models much more quickly and cheaply. On Nvidia’s hardware, of course.

CUDA, Jensen Huang told Ben Thompson in a March 2022 Stratechery interview, “made GPUs accessible, and because we dedicated ourselves to keeping every generation of processors CUDA-compatible, we invented a new programming model.”

Jensen Huang’s big bet on AI

But six years after CUDA was released, Nvidia was still not “all in” on AI.

Bryan Catanzaro, vice president of applied deep learning research at Nvidia, pointed out that when AlexNet was published and other researchers were tinkering with GPUs, “there really wasn’t anybody at Nvidia working on AI.”

Except Catanzaro, that is. At the time, he explained, he was collaborating with Andrew Ng at Stanford on a “little project where we replaced 1,000 servers at Google with three servers using GPUs and a bunch of CUDA kernels that the team wrote.” He was also talking during that period with the NYU AI Lab’s Yann LeCun (now head of AI research at Meta), and Rob Fergus (now a research scientist at DeepMind).

“Fergus was telling me, ‘It’s crazy how many machine learning researchers are spending time writing kernels for the GPU — you should really look into that,’” he said.

Catanzaro did look into it. Customers were starting to buy large numbers of GPUs for deep learning. Eventually Huang and others at Nvidia took note too.

By 2014 Huang was fully on board with the AI mission. While his 2013 GTC keynote barely mentioned AI, it was suddenly front and center during his 2014 keynote. Machine learning is “one of the most exciting applications in high-performance computing today,” he said. “One of the areas that has seen exciting breakthroughs, enormous breakthroughs, magical breakthroughs, is an area called deep neural nets.”

Catanzaro pointed out that as founder of the firm, Huang had the authority to “turn the company on a dime, he just pounced on it,” realizing that “AI is the future of this company, and we’re gonna bet everything on it.”

Nvidia’s software-focused platform strategy

The ImageNet moment of 2012 involved a few researchers and a GPU. But this was just the first milestone, according to Kari Briski, VP of product management, AI software at Nvidia.

The next challenge was how to make the power of GPUs scale: “We worked on software to make sure that the GPUs could communicate together, so we went from a single GPU to multi-GPU and multinode,” Briski said.

For the past seven years, Nvidia has focused on building deep learning software in libraries and frameworks that abstract away the need to code in CUDA. It puts CUDA-accelerated libraries, like cuDNN, into more widely used Python-based libraries like PyTorch and TensorFlow.

“You can now scale to hundreds and thousands of GPUs all talking to each other to get that neural network,” said Briski. “It went from months of training to weeks — today it takes seconds to train that same neural network.”

In addition, by 2018 GPUs were used not just for AI training, but for inference, too — to support capabilities in speech recognition, natural language processing, recommender systems and image recognition. That meant not just more hardware, like Nvidia’s T4 chip, but more software to fuel these real-time inference workloads in the data center and in automotive applications, as well as in robots and drones.

As a result, Nvidia has become more of a software company than a hardware company, said Nvidia’s Das. The company hired more and more software engineers and researchers and built its research division to be at the state of the art in AI.

“We started building up all these pieces of software, one use case after another,” he said. As standards and frameworks began to evolve, like TensorFlow and PyTorch for training, Nvidia optimized them for GPUs. “We became AI developers and really embraced the ecosystem,” he added.

At Nvidia’s 2022 GTC Analyst/Investor Conference, Huang made the company’s ongoing software and platform focus very clear, including a shout-out to the AI Enterprise Software Suite, which had launched the previous year.

“The important thing about our software is that it’s built on top of our platform,” he said. “It means that it activates all of Nvidia’s hardware chips and system platforms. And secondarily, the software that we do is industry-defining software. We’ve now finally produced a product that an enterprise can license. They’ve been asking for it … they can’t just go to open source, and download all the stuff, and make it work for their enterprise. No more than they could go to Linux, download open-source software, and run a multibillion-dollar company with it.”

The consequence of the platform approach is that anytime a customer buys from Nvidia, they’re not just buying the software, but buying into the Nvidia value chain.

That was another key pillar of Nvidia’s strategy, explained Gartner’s Dekate. Combining the GPU and CUDA with a channel strategy that surrounded customers with the options and sourcing they are most familiar with created an ecosystem growth flywheel.

“Nvidia does not have to try and convince enterprise end users directly,” he said. “End users can use technologies they are familiar with but still turn the crank on Nvidia.”

Nvidia’s AI headwinds are gentle — for now

In the late 2010s, AI chip startups began making waves, from Graphcore and Cerebras to SambaNova.

Analyst Karl Freund recalled that at the time, the prevailing wisdom was that since the startups were designing chips specifically for AI, they would be better than Nvidia’s.

“That didn’t turn out to be the case,” he said. “Nvidia was able to innovate both in their hardware and software to keep their lead.”

That being said, their lead has diminished — Habana Labs, owned by Intel, had really good results on their Habana Gaudi2 chip, Freund pointed out, while Google’s TPU4 “looks really good and is competitive with the A100.”

But Nvidia has the H100 in the wings, which everyone is anxiously waiting to see ship in production volumes. The new Hopper H100 chip uses a new architecture designed to be the engine for massively scalable AI infrastructure. It includes a new component called the Transformer Engine that’s specifically optimized for training and inference of the transformer layer, which is the building block of GPT (ChatGPT, for example, is built on a generative pretrained transformer).

In addition, even if CUDA’s current competitive moat is challenged, Nvidia is also replacing it with its latest higher-level, use-case-specific software, such as AI for healthcare and AI for digital twins/Omniverse.

Finally, even if all the competitive trends started materializing in the second half of this year, they would not have a material impact on Nvidia revenues until 2024. Even then, Freund estimated that all competitors combined could get just 10% of the market.

Still, Gartner’s Dekate insists that Nvidia no longer has the clean playing field it once had, which allowed it to dominate the marketplace. That development includes an increased number of customer options, which allows end users to, at the very least, drive pricing advantage in their favor.

Also, with some Chinese vendors having to do without access to Nvidia GPUs, they will try to accelerate competitive technology, he predicted.

Nvidia’s Briski brushes off concerns. “We’ve had headwinds before,” she said. “I think that we’re always challenged to be on our toes, to never feel sort of comfortable.”

In any case, Huang has maintained that it’s tough for competitors to come into the AI market and create a solution that works, isn’t too complicated, and uses the software developers want.

Nvidia’s Huang seen as AI visionary — who keeps the pressure on

Over the years, Nvidia CEO Jensen Huang, well-known for his leather jacket atop a black Silicon Valley uniform, has been described as everything from “flamboyant” and a “superstar” to, occasionally, a jokester and “the next Steve Jobs.”

But, inevitably, any discussion about Nvidia’s AI success turns to Huang.

Nvidia’s Das joined the company from Microsoft in 2019. He says he had many conversations with Huang over a period of nine months before accepting the position.

“I joined Nvidia to work for Jensen because he just blew my mind in all these conversations … that there could be a person like that, who could think like that, who can actually operate like that,” Das said.

Analyst Freund emphasized that Huang “is an amazing driver of that company” and added that Nvidia does not have a lot of organizational layers, because Huang doesn’t like to have a lot of layers between him and the people doing the engineering work and the science work.

That said, Huang is also demanding, he added. “When I worked at AMD, a lot of my graphics engineers were renegades from Nvidia,” said Freund. “They left Nvidia because they couldn’t handle the pressure.”

Nvidia’s generative AI opportunity

Catanzaro left Nvidia in 2014 to work at Baidu with Andrew Ng, but returned to Nvidia in 2016 to head a new lab focused on applied deep learning research. At the time, he was the only member. Seven years later, he leads a team of 40 researchers.

Nvidia’s accelerated computing business, he said, requires his team to conceive of each entire application as an optimized whole.

“We don’t outsource any of that responsibility,” he said. Nvidia, he explained, tackles entire problems from top to bottom, from chips to applications, algorithms, libraries, compiler framework and interconnected data center architecture. Its researchers have the freedom to really push acceleration “far beyond what we could do if we limited ourselves to thinking about just one part of that stack.”

And with the new era of ChatGPT, Nvidia can push even further, he added.

“Companies are going to be really pushing on the application of AI to lots of different problems,” he said. “That, of course, makes my job even more exciting — I feel like applied research is the hottest place to be right now.”

Huang, too, has weighed in on the transformative moment of ChatGPT. “This is the iPhone moment of artificial intelligence,” he said at a recent Q&A at Berkeley’s Haas School of Business. “This is the time when all those ideas within mobile computing and all that, it all came together in a product that everyone kinda [says], I see it, I see it.”

Nvidia is well-prepared for the generative AI opportunity, said Das. “For those who have been working on it for years, we’ve sort of anticipated this,” he said. “We’ve been working on training large language models and we know what they’re capable of.”

Nvidia is in an AI sweet spot

Since AlexNet in 2012, Nvidia’s AI journey has always been about taking advantage of opportunities that opened up — even if, in the case of GPUs, it was unexpected.

So with the 2023 GTC just a month away — which will include more than 65 sessions focused on generative AI — Nvidia is undoubtedly in a sweet spot. Just as the company’s GPU was at the center of powering the deep learning revolution of a decade ago, Nvidia’s hardware and software are running behind the scenes of today’s GPU-hungry, hyped-up generative AI technology.

And it seems like no matter which companies come out on top (Google? Microsoft? OpenAI?), Nvidia, which supplies them all, will win big.