Nvidia Launches Software to Speed Up AI Models on Its GPUs Amid Generative AI Boom
- Nvidia announced its TensorRT-LLM SDK to accelerate large language models such as Meta's Llama 2 on Windows PCs with Nvidia GPUs.
- TensorRT-LLM allows models to run inference faster on Nvidia's new H100 GPUs (a usage sketch follows the list below).
- This positions Nvidia as a provider of both GPU hardware and optimization software for generative AI.
- Demand for Nvidia GPUs has skyrocketed amid the generative AI boom, with H100 chips estimated to cost around $40,000 each.
- But competitors such as AMD, Microsoft, and various startups aim to reduce the industry's reliance on Nvidia, so the company is expanding its software offerings.
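The announcement itself includes no code, but a rough sense of what running a model through TensorRT-LLM looks like may be useful. The sketch below assumes the high-level Python `LLM` API exposed by the `tensorrt_llm` package in recent releases; the model name, prompt, and sampling settings are illustrative assumptions, not details from Nvidia's announcement, and the exact interface may differ by version.

```python
# Hedged sketch of TensorRT-LLM's high-level Python API (not from the article).
# Assumes the `tensorrt_llm` package and its LLM / SamplingParams interface.
from tensorrt_llm import LLM, SamplingParams

# Load a Hugging Face checkpoint and build an optimized TensorRT engine for it
# (the model name here is an illustrative placeholder).
llm = LLM(model="meta-llama/Llama-2-7b-chat-hf")

# Placeholder sampling settings, not recommendations from Nvidia.
params = SamplingParams(temperature=0.8, max_tokens=64)

# Run accelerated inference on the local Nvidia GPU.
outputs = llm.generate(["Summarize what TensorRT-LLM does."], params)
for output in outputs:
    print(output.outputs[0].text)
```

The speedup comes from TensorRT-LLM compiling the model into an optimized TensorRT engine tuned for Nvidia GPUs such as the H100, rather than running the stock framework implementation.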