You can use the Application-Aware Quota (AAQ) Operator to customize and manage resource quotas for individual components in an OKD cluster.
The AAQ Operator provides more flexible and extensible quota management than the native ResourceQuota object in the OKD platform.
In a multi-tenant cluster environment, where multiple workloads operate on shared infrastructure and resources, using the Kubernetes native ResourceQuota object to limit aggregate CPU and memory consumption presents infrastructure overhead and live migration challenges for OKD Virtualization workloads.
OKD Virtualization requires significant compute resource allocation to handle virtual machine (VM) live migrations and manage VM infrastructure overhead. When upgrading OKD Virtualization, you must migrate VMs to upgrade the virt-launcher pod. However, migrating a VM in the presence of a resource quota can cause the migration, and subsequently the upgrade, to fail.
With AAQ, you can allocate resources for VMs without interfering with cluster-level activities such as upgrades and node maintenance. The AAQ Operator also supports non-compute resources, which eliminates the need to manage the native resource quota and AAQ API objects separately.
The AAQ Operator introduces two new API objects defined as custom resource definitions (CRDs) for managing alternative quota implementations across multiple namespaces:
ApplicationAwareResourceQuota: Sets aggregate quota restrictions enforced per namespace. The ApplicationAwareResourceQuota API is compatible with the native ResourceQuota object and shares the same specification and status definitions.
apiVersion: aaq.kubevirt.io/v1alpha1
kind: ApplicationAwareResourceQuota
metadata:
  name: example-resource-quota
spec:
  hard:
    requests.memory: 1Gi
    limits.memory: 1Gi
    requests.cpu/vmi: "1" (1)
    requests.memory/vmi: 1Gi (2)
# ...
| 1 | The maximum amount of CPU that is allowed for VM workloads in the default namespace. |
| 2 | The maximum amount of RAM that is allowed for VM workloads in the default namespace. |
ApplicationAwareClusterResourceQuota: Mirrors the ApplicationAwareResourceQuota object at a cluster scope. It is compatible with the native ClusterResourceQuota API object and shares the same specification and status definitions. When creating an AAQ cluster quota, you can select multiple namespaces based on annotation selection, label selection, or both by editing the spec.selector.labels or spec.selector.annotations fields.
apiVersion: aaq.kubevirt.io/v1alpha1
kind: ApplicationAwareClusterResourceQuota (1)
metadata:
  name: example-resource-quota
spec:
  quota:
    hard:
      requests.memory: 1Gi
      limits.memory: 1Gi
      requests.cpu/vmi: "1"
      requests.memory/vmi: 1Gi
  selector:
    annotations: null
    labels:
      matchLabels:
        kubernetes.io/metadata.name: default
# ...
| 1 | You can only create an ApplicationAwareClusterResourceQuota object if the spec.allowApplicationAwareClusterResourceQuota field in the HyperConverged custom resource (CR) is set to true. |
If both the spec.selector.labels and spec.selector.annotations fields are set, only namespaces that match both selectors are selected by the quota.
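Because the AAQ quota objects share their status definitions with the native quota objects, you can inspect current consumption through the status.used and status.hard fields. The following command is a minimal sketch; it assumes the namespaced quota from the earlier example exists in the default namespace and that the lowercase resource name applicationawareresourcequota resolves to the CRD:
$ oc get applicationawareresourcequota example-resource-quota -n default \
  -o jsonpath='{.status.used}{"\n"}{.status.hard}{"\n"}'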
The AAQ controller uses a scheduling gate mechanism to evaluate whether there is enough of a resource available to run a workload. If so, the scheduling gate is removed from the pod and it is considered ready for scheduling. The quota usage status is updated to indicate how much of the quota is used.
If the CPU and memory requests and limits for the workload exceed the enforced quota usage limit, the pod remains in SchedulingGated status until there is enough quota available. The AAQ controller creates an event of type Warning with details on why the quota was exceeded. You can view the event details by using the oc get events command.
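For example, the following commands sketch one way to check whether a pod is still gated and to list the related warning events. The pod and namespace names are placeholders:
$ oc get pod <pod_name> -n <namespace> -o jsonpath='{.spec.schedulingGates}{"\n"}'
$ oc get events -n <namespace> --field-selector type=Warning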
To deploy the AAQ Operator, set the enableApplicationAwareQuota feature gate to true in the HyperConverged custom resource (CR).
You have access to the cluster as a user with cluster-admin privileges.
You have installed the OpenShift CLI (oc).
Set the enableApplicationAwareQuota feature gate to true in the HyperConverged CR by running the following command:
$ oc patch hco kubevirt-hyperconverged -n openshift-cnv \
--type json -p '[{"op": "add", "path": "/spec/featureGates/enableApplicationAwareQuota", "value": true}]'
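You can optionally verify the setting by reading it back from the HyperConverged CR, for example:
$ oc get hco kubevirt-hyperconverged -n openshift-cnv \
  -o jsonpath='{.spec.featureGates.enableApplicationAwareQuota}{"\n"}'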
You can configure the AAQ Operator by specifying the fields of the spec.applicationAwareConfig object in the HyperConverged custom resource (CR).
You have access to the cluster as a user with cluster-admin privileges.
You have installed the OpenShift CLI (oc).
Update the HyperConverged CR by running the following command:
$ oc patch hco kubevirt-hyperconverged -n openshift-cnv --type merge -p '{
"spec": {
"applicationAwareConfig": {
"vmiCalcConfigName": "DedicatedVirtualResources",
"namespaceSelector": {
"matchLabels": {
"app": "my-app"
}
},
"allowApplicationAwareClusterResourceQuota": true
}
}
}'
where:
vmiCalcConfigName: Specifies how resource counting is managed for pods that run virtual machine (VM) workloads. Possible values are:
VmiPodUsage: Counts compute resources for pods associated with VMs in the same way as native resource quotas and excludes migration-related resources.
VirtualResources: Counts compute resources based on the VM specifications, using the VM RAM size for memory and virtual CPUs for processing.
DedicatedVirtualResources (default): Similar to VirtualResources, but separates resource tracking for pods associated with VMs by adding a /vmi suffix to CPU and memory resource names. For example, requests.cpu/vmi and requests.memory/vmi.
namespaceSelector: Determines the namespaces for which an AAQ scheduling gate is added to pods when they are created. If a namespace selector is not defined, the AAQ Operator targets namespaces that have the application-aware-quota/enable-gating label by default.
allowApplicationAwareClusterResourceQuota: If set to true, you can create and manage the ApplicationAwareClusterResourceQuota object. Setting this attribute to true can increase scheduling time.
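As a hypothetical example based on the namespaceSelector in the patch above, labeling a namespace so that it matches the selector causes the AAQ Operator to add scheduling gates to pods created in that namespace. The namespace name is a placeholder:
$ oc label namespace <namespace> app=my-app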