NVIDIA and Oracle Team Up to Offer Faster, Cheaper AI Inference on the Cloud
-
NVIDIA Triton Inference Server gives Oracle Cloud Infrastructure the flexibility to build and run AI applications efficiently, reducing inference costs by 10% and increasing throughput by up to 76%.
-
Customers use Oracle Cloud Infrastructure Vision AI for object detection and image classification tasks such as automated toll billing and invoice recognition.
-
Triton is being adopted across other Oracle services to give customers an easy path to its inference capabilities.
-
Compared with other serving frameworks, Triton handles concurrent multi-model inference especially well, serving multiple models side by side on shared GPUs.
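Concurrency in Triton is typically configured per model in a `config.pbtxt` file. A minimal sketch follows; the model name, backend, and shapes here are hypothetical, chosen only to illustrate the instance-group and dynamic-batching settings:

```protobuf
# Hypothetical Triton model configuration (config.pbtxt).
name: "toll_plate_detector"      # illustrative model name
platform: "onnxruntime_onnx"     # assumed ONNX backend
max_batch_size: 8

# Run two copies of the model per GPU so requests for this
# model can be served concurrently alongside other models.
instance_group [
  { count: 2, kind: KIND_GPU }
]

# Briefly queue individual requests so they can be batched
# together, trading a little latency for higher throughput.
dynamic_batching {
  max_queue_delay_microseconds: 100
}
```

Because each model carries its own configuration, several differently tuned models can share a GPU, which is the basis of Triton's concurrent multi-model serving.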
-
Oracle is exploring TensorRT-LLM to accelerate large language models on NVIDIA GPUs, striking a balance between numerical precision and performance.