
You can configure an OKD cluster to run real-time virtual machine (VM) workloads that require low and predictable latency. OKD provides the Node Tuning Operator to implement automatic tuning for real-time and low-latency workloads.

Configuring a cluster for real-time workloads

You can configure an OKD cluster to run real-time workloads.

Prerequisites
  • You have access to the cluster as a user with cluster-admin permissions.

  • You have installed the OpenShift CLI (oc).

  • You have installed the Node Tuning Operator.

Procedure
  1. Label a subset of the compute nodes with a custom role, for example, worker-realtime:

    $ oc label node <node_name> node-role.kubernetes.io/worker-realtime=""

    You must use the default master role for single-node OpenShift and compact clusters.
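
    For example, you can confirm that the label was applied by listing the nodes that carry it:

    $ oc get nodes -l node-role.kubernetes.io/worker-realtime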

  2. Create a new MachineConfigPool manifest that contains the worker-realtime label in the spec.machineConfigSelector object:

    Example MachineConfigPool manifest
    apiVersion: machineconfiguration.openshift.io/v1
    kind: MachineConfigPool
    metadata:
      name: worker-realtime
      labels:
        machineconfiguration.openshift.io/role: worker-realtime
    spec:
      machineConfigSelector:
        matchExpressions:
          - key: machineconfiguration.openshift.io/role
            operator: In
            values:
              - worker
              - worker-realtime
      nodeSelector:
        matchLabels:
          node-role.kubernetes.io/worker-realtime: ""

    You do not need to create a new MachineConfigPool manifest for single-node OpenShift and compact clusters.

  3. If you created a new MachineConfigPool manifest in step 2, apply it to the cluster by using the following command:

    $ oc apply -f <real_time_mcp>.yaml
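
    Optionally, verify that the new machine config pool exists and selects the labeled nodes:

    $ oc get mcp worker-realtime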
  4. Create a PerformanceProfile manifest that applies to the labeled nodes and the machine config pool that you created in the previous steps:

    Example PerformanceProfile manifest
    apiVersion: performance.openshift.io/v2
    kind: PerformanceProfile
    metadata:
      name: profile-1
    spec:
      cpu:
        isolated: 4-39,44-79
        reserved: 0-3,40-43
      globallyDisableIrqLoadBalancing: true
      hugepages:
        defaultHugepagesSize: 1G
        pages:
        - count: 8
          size: 1G
      realTimeKernel:
        enabled: true
      workloadHints:
        highPowerConsumption: true
        realTime: true
      nodeSelector:
        node-role.kubernetes.io/worker-realtime: ""
      numa:
        topologyPolicy: single-numa-node
  5. Apply the PerformanceProfile manifest:

    $ oc apply -f <real_time_pp>.yaml

    The compute nodes automatically reboot twice after you apply the MachineConfigPool and PerformanceProfile manifests. This process might take a long time.
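
    You can monitor the progress by watching the machine config pool until the UPDATED column reports True:

    $ oc get mcp worker-realtime --watch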

  6. Retrieve the name of the generated RuntimeClass resource from the status.runtimeClass field of the PerformanceProfile object:

    $ oc get performanceprofiles.performance.openshift.io profile-1 -o=jsonpath='{.status.runtimeClass}{"\n"}'
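
    The generated RuntimeClass name typically follows the pattern performance-<profile_name>. For the profile in this example, the output might look similar to the following:

    Example output
    performance-profile-1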
  7. Set the previously obtained RuntimeClass name as the default container runtime class for the virt-launcher pods by editing the HyperConverged custom resource (CR):

    $ oc patch hyperconverged kubevirt-hyperconverged -n kubevirt-hyperconverged \
        --type='json' -p='[{"op": "add", "path": "/spec/defaultRuntimeClass", "value":"<runtimeclass_name>"}]'

    Editing the HyperConverged CR changes a global setting that affects all VMs that are created after the change is applied.
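
    You can confirm the new default by inspecting the spec.defaultRuntimeClass field:

    $ oc get hyperconverged kubevirt-hyperconverged -n kubevirt-hyperconverged \
        -o=jsonpath='{.spec.defaultRuntimeClass}{"\n"}'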

  8. If your real-time-enabled compute nodes use simultaneous multithreading (SMT), enable the alignCPUs feature gate by editing the HyperConverged CR:

    $ oc patch hyperconverged kubevirt-hyperconverged -n kubevirt-hyperconverged \
        --type='json' -p='[{"op": "replace", "path": "/spec/featureGates/alignCPUs", "value": true}]'

    Enabling alignCPUs allows OKD Virtualization to request up to two additional dedicated CPUs so that the total CPU count is even when emulator thread isolation is used.
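
    If you are not sure whether a node uses SMT, one way to check is to inspect the Thread(s) per core value that lscpu reports on the node; a value greater than 1 indicates that SMT is enabled. This is a minimal sketch, where <node_name> is a placeholder for one of your real-time compute nodes:

    $ oc debug node/<node_name> -- chroot /host lscpu | grep 'Thread(s) per core'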

Configuring a virtual machine for real-time workloads

You can configure a virtual machine (VM) to run real-time workloads.

Prerequisites
  • Your cluster is configured to run real-time workloads.

  • You have installed the virtctl tool.

Procedure
  1. Create a VirtualMachine manifest to include information about CPU topology, CRI-O annotations, and huge pages:

    Example VirtualMachine manifest
    apiVersion: kubevirt.io/v1
    kind: VirtualMachine
    metadata:
      name: realtime-vm
    spec:
      running: true
      template:
        metadata:
          annotations:
            cpu-load-balancing.crio.io: disable (1)
            cpu-quota.crio.io: disable (2)
            irq-load-balancing.crio.io: disable (3)
        spec:
          domain:
            cpu:
              dedicatedCpuPlacement: true
              isolateEmulatorThread: true
              model: host-passthrough
              numa:
                guestMappingPassthrough: {}
              realtime: {}
              sockets: 1 (4)
              cores: 4 (5)
              threads: 1
            devices:
              autoattachGraphicsDevice: false
              autoattachMemBalloon: false
              autoattachSerialConsole: true
            ioThreadsPolicy: auto
            memory:
              guest: 4Gi
              hugepages:
                pageSize: 1Gi (6)
          terminationGracePeriodSeconds: 0
    # ...
    1 This annotation specifies that load balancing is disabled for CPUs that are used by the container.
    2 This annotation specifies that the CPU quota is disabled for CPUs that are used by the container.
    3 This annotation specifies that interrupt request (IRQ) load balancing is disabled for CPUs that are used by the container.
    4 The number of sockets inside the VM.
    5 The number of cores inside the VM. This must be a value greater than or equal to 1.
    6 The size of the huge pages. The possible values for x86-64 architectures are 1Gi and 2Mi. In this example, the request is for 4 huge pages of size 1Gi.
  2. Apply the VirtualMachine manifest:

    $ oc apply -f <file_name>.yaml
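
    You can optionally confirm that the VM is running by checking the corresponding VirtualMachineInstance:

    $ oc get vmi realtime-vm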
  3. Configure the guest operating system. The following example shows the configuration steps for a Red Hat Enterprise Linux (RHEL) 9 operating system:

    1. Run the following command to connect to the VM console:

      $ virtctl console <vm_name>
    2. If you are not already logged in as the root user, switch to the root user account to run the remaining configuration steps.

    3. Disable the irqbalance service by using the following command:

      # systemctl disable irqbalance && systemctl stop irqbalance
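
      You can confirm that the service is now stopped and disabled:

      # systemctl is-active irqbalance
      # systemctl is-enabled irqbalance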
    4. Enable the real-time and network function virtualization (NFV) repositories:

      # subscription-manager repos --enable rhel-9-for-x86_64-rt-rpms --enable rhel-9-for-x86_64-nfv-rpms
    5. Install the necessary packages by running the following command:

      # dnf install tuned cloud-init
    6. To achieve low-latency tuning by using the realtime-virtual-guest profile in the TuneD application, run the following commands:

      # dnf install kernel-rt realtime-tests tuned-profiles-realtime
      # dnf install tuned-profiles-nfv-guest
    7. Edit the /etc/tuned/realtime-virtual-guest-variables.conf file:

      #
      # Variable settings below override the definitions from the
      # /etc/tuned/realtime-variables.conf file.
      
      #
      # Core isolation
      #
      # The 'isolated_cores=' variable below controls which cores should be
      # isolated. By default we reserve 1 core per socket for housekeeping
      # and isolate the rest. But you can isolate any range as shown in the
      # examples below. Just remember to keep only one isolated_cores= line.
      #
      # Examples:
      # isolated_cores=2,4-7
      # isolated_cores=2-23
      #
      # Reserve 1 core per socket for housekeeping, isolate the rest.
      # Change this for a core list or range as shown above.
      isolated_cores=2-3 (1)
      
      #
      # Uncomment the 'isolate_managed_irq=Y' below if you want to move kernel
      # managed IRQs out of isolated cores. Note that this requires kernel
      # support. Please only specify this parameter if you are sure that the
      # kernel supports it.
      #
      isolate_managed_irq=Y (2)
      
      #
      # Set the desired combined queue count value using the parameter provided
      # below. Ideally this should be set to the number of housekeeping CPUs i.e.,
      # in the example given below it is assumed that the system has 4 housekeeping
      # (non-isolated) CPUs.
      #
      # netdev_queue_count=4
      1 The first two CPUs (0 and 1) are set aside for housekeeping tasks and the rest are isolated for the real-time application.
      2 Set the isolate_managed_irq parameter to Y to move kernel-managed interrupt requests out of isolated cores.
    8. Activate the TuneD profile:

      # tuned-adm profile realtime-virtual-guest
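
      You can confirm that the profile is active and that its settings verify cleanly:

      # tuned-adm active
      # tuned-adm verify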
    9. Set the real-time kernel as the default by using the GRUB boot loader command-line interface:

      # grubby --set-default=/boot/vmlinuz-<installed_rt_kernel_version>
    10. Set the kernel arguments by using the GRUB boot loader command-line interface:

      # grubby --args="iommu=pt intel_iommu=on default_hugepagesz=1G idle=poll" --update-kernel=$(grubby --default-kernel)
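
      Before rebooting, you can confirm the default kernel and its arguments:

      # grubby --info="$(grubby --default-kernel)"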
  4. Restart the VM to apply the changes.
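
    After the VM restarts, you can confirm from the guest that the real-time kernel is running and that the kernel arguments were applied. The kernel version reported by uname -r should include the rt suffix:

    # uname -r
    # cat /proc/cmdline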

Verification
  • Use the cyclictest tool to verify that the real-time guest is configured properly:

    # cyclictest --priority 1 --policy fifo -h 50 -a 2-3 --mainaffinity 0,1 -t 2 -m -q -i 200 -D 12h

    where:

    -a

    Specifies the CPU set on which the test runs. This must match the isolated CPUs that you configured in the /etc/tuned/realtime-virtual-guest-variables.conf file.

    -D

    Specifies the test duration. Append m, h, or d to specify minutes, hours, or days.

    Example output
    # Min Latencies: 00004 00004
    # Avg Latencies: 00004 00004
    # Max Latencies: 00014 00013 (1)
    1 The Max Latencies value in the output must be less than 40 microseconds.