Posted 4/3/2024, 9:35:00 AM
Text-to-Video Demand Could Require $21B in GPUs, Straining Nvidia Production Capacity
- An estimated 720,000 high-end Nvidia GPUs would be required to support text-to-video for the TikTok and YouTube creator communities
- Sora AI model alone requires 10,500 GPUs to train and can generate only 5 minutes of video per GPU per hour
- Inference (generating new videos) will require more compute power than initial model training as adoption grows
- Nvidia shipped 550,000 H100 GPUs in 2023; its top 12 customers have 650,000, including 300,000 at Meta and Microsoft
- Cost of required GPUs would be $21.6 billion - nearly the entire market cap of AI tokens
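
The figures in the bullets above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the derived values (the implied price per GPU, the ratio to 2023 shipments, and the fleet-wide video output) are not stated in the source. The throughput figure additionally assumes every GPU runs inference around the clock at the quoted Sora rate.

```python
# Figures quoted in the bullets above.
GPUS_REQUIRED = 720_000            # GPUs estimated for TikTok/YouTube text-to-video
TOTAL_COST_USD = 21_600_000_000    # quoted total GPU cost ($21.6 billion)
H100_SHIPPED_2023 = 550_000        # H100 GPUs Nvidia shipped in 2023
VIDEO_MIN_PER_GPU_HOUR = 5         # minutes of video one GPU generates per hour


def implied_unit_price(total_cost_usd, gpu_count):
    """Average price per GPU implied by the quoted total (derived, not from the source)."""
    return total_cost_usd / gpu_count


def shipment_multiple(gpus_required, gpus_shipped):
    """How many years of 2023-level H100 shipments the requirement represents."""
    return gpus_required / gpus_shipped


def fleet_video_hours_per_hour(gpu_count, minutes_per_gpu_hour):
    """Hours of video generated per wall-clock hour.

    Assumes every GPU runs inference full time at the quoted rate.
    """
    return gpu_count * minutes_per_gpu_hour / 60


if __name__ == "__main__":
    price = implied_unit_price(TOTAL_COST_USD, GPUS_REQUIRED)
    multiple = shipment_multiple(GPUS_REQUIRED, H100_SHIPPED_2023)
    hours = fleet_video_hours_per_hour(GPUS_REQUIRED, VIDEO_MIN_PER_GPU_HOUR)
    print(f"Implied price per GPU: ${price:,.0f}")
    print(f"Multiple of 2023 H100 shipments: {multiple:.2f}x")
    print(f"Video generated per hour across the fleet: {hours:,.0f} hours")
```

Under these assumptions, the quoted total works out to $30,000 per GPU, and the 720,000-GPU requirement is roughly 1.3 times the number of H100s Nvidia shipped in all of 2023.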