The NVIDIA GPU Operator manages NVIDIA GPU resources in a OKD cluster and automates tasks related to bootstrapping GPU nodes. Because the GPU is a special resource in the cluster, you must install some components before you can deploy application workloads to the GPU. These components include the NVIDIA drivers that enable the compute unified device architecture (CUDA), Kubernetes device plugin, container runtime, and other features such as automatic node labeling, monitoring, and more.
|
The NVIDIA GPU Operator is supported only by NVIDIA. For more information about obtaining support from NVIDIA, see Obtaining Support from NVIDIA.
|
There are two ways to enable GPUs with OKD OKD Virtualization: the OKD-native way described here and by using the NVIDIA GPU Operator.
The NVIDIA GPU Operator is a Kubernetes Operator that uses OKD OKD Virtualization to provision GPUs for virtualized workloads running on OKD. With the Operator, you can easily provision and manage GPU-enabled virtual machines to run complex artificial intelligence/machine learning (AI/ML) workloads on the same platform as their other workloads. The Operator also provides an easy way to scale the GPU capacity of their infrastructure, enabling rapid growth of GPU-based workloads.