Virtual machine (VM) workloads run as unprivileged pods. So that VMs can use OKD Virtualization features, some pods are granted custom security policies that are not available to other pod owners:

  • An extended container_t SELinux policy applies to virt-launcher pods.

  • Security context constraints (SCCs) are defined for the kubevirt-controller service account.

About workload security

By default, virtual machine (VM) workloads do not run with root privileges in OKD Virtualization.

For each VM, a virt-launcher pod runs an instance of libvirt in session mode to manage the VM process. In session mode, the libvirt daemon runs as a non-root user account and only permits connections from clients that are running under the same user identifier (UID). Therefore, VMs run as unprivileged pods, adhering to the security principle of least privilege.
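One way to confirm the non-root behavior described above is to ask a running virt-launcher pod which UID its processes use. The following is a minimal sketch, not a supported procedure. It assumes that the main container of the virt-launcher pod is named compute and that the container image provides the id command. The function names launcher_uid and is_unprivileged are hypothetical helpers introduced only for illustration.

```shell
# Print the numeric UID that processes in a virt-launcher pod run as.
# Assumptions (not stated in this document): the container is named
# "compute" and the image ships the `id` command.
launcher_uid() {
  pod="$1"
  namespace="${2:-default}"
  oc exec -n "$namespace" "$pod" -c compute -- id -u
}

# Succeed only if the reported UID is a number other than 0 (root).
is_unprivileged() {
  uid="$(launcher_uid "$@")" || return 1
  case "$uid" in
    ''|*[!0-9]*) return 1 ;;  # empty or non-numeric output
    0)           return 1 ;;  # root
    *)           return 0 ;;  # any other UID is unprivileged
  esac
}
```

For example, is_unprivileged <pod_name> <namespace> returns a zero exit status when the pod reports a non-root UID. The pod name and namespace are placeholders that you replace with values from your own cluster.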

There are no supported OKD Virtualization features that require root privileges. If a feature requires root, it might not be supported for use with OKD Virtualization.

Extended SELinux policies for virt-launcher pods

The container_t SELinux policy for virt-launcher pods is extended with the following rules:

  • allow process self (tun_socket (relabelfrom relabelto attach_queue))

  • allow process sysfs_t (file (write))

  • allow process hugetlbfs_t (dir (add_name create write remove_name rmdir setattr))

  • allow process hugetlbfs_t (file (create unlink))

These rules allow virt-launcher pods to perform the following operations, each of which is required by a virtualization feature:

  • Relabel and attach queues to their own TUN sockets, which is required to support network multi-queue. Multi-queue enables network performance to scale as the number of available vCPUs increases.

  • Write information to sysfs (/sys) files, which is required to enable Single Root I/O Virtualization (SR-IOV).

  • Read and write hugetlbfs entries, which is required to support huge pages. Huge pages are a method of managing large amounts of memory by increasing the memory page size.
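To see which SELinux type a virt-launcher pod actually runs under, you can read the process context from inside the pod. The sketch below is illustrative only. It assumes that the main container is named compute and that /proc/self/attr/current is readable in that container. The helper names selinux_type and launcher_selinux_type are hypothetical. The type that your cluster reports is not confirmed by this document, so no expected value is shown.

```shell
# Extract the type (third colon-separated field) from an SELinux context
# string such as "system_u:system_r:container_t:s0:c12,c34".
selinux_type() {
  printf '%s\n' "$1" | cut -d: -f3
}

# Print the SELinux type of a process started in a virt-launcher pod.
# Assumption: the container is named "compute".
launcher_selinux_type() {
  pod="$1"
  namespace="${2:-default}"
  # The kernel may terminate the context with a NUL byte; strip it first.
  oc exec -n "$namespace" "$pod" -c compute -- cat /proc/self/attr/current |
    tr -d '\000' | cut -d: -f3
}
```

The first helper works on any context string, so it can also be applied to output that you collected earlier, for example from a log or a support archive.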

Additional OKD security context constraints and Linux capabilities for the kubevirt-controller service account

Security context constraints (SCCs) control permissions for pods. These permissions include the actions that a pod (a collection of containers) can perform and the resources that it can access. You can use SCCs to define a set of conditions that a pod must meet to be accepted into the system.

The kubevirt-controller is a cluster controller that creates the virt-launcher pods for virtual machines in the cluster. These virt-launcher pods are granted permissions by the kubevirt-controller service account.

Additional SCCs granted to the kubevirt-controller service account

The kubevirt-controller service account is granted additional SCCs and Linux capabilities so that it can create virt-launcher pods with the appropriate permissions. These extended permissions allow virtual machines to take advantage of OKD Virtualization features that are beyond the scope of typical pods.

The SCC that is granted to the kubevirt-controller service account defines the following settings:

  • scc.AllowHostDirVolumePlugin = true
    This allows virtual machines to use the hostpath volume plug-in.

  • scc.AllowPrivilegedContainer = false
    This ensures the virt-launcher pod is not run as a privileged container.

  • scc.AllowedCapabilities = []corev1.Capability{"NET_ADMIN", "NET_RAW", "SYS_NICE"}
    This provides the following additional Linux capabilities: NET_ADMIN, NET_RAW, and SYS_NICE.
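The settings listed above correspond to top-level fields of the SCC object, so they can be read individually instead of scanning the full YAML output. The following sketch assumes that the SCC is named kubevirt-controller, as in the rest of this document. The helper names scc_field and check_scc_flags are hypothetical and exist only for this example.

```shell
# Print a single top-level field of the kubevirt-controller SCC.
# Example field names: allowHostDirVolumePlugin, allowPrivilegedContainer,
# and 'allowedCapabilities[*]'.
scc_field() {
  oc get scc kubevirt-controller -o jsonpath="{.$1}"
}

# Succeed only if the two boolean settings match the values listed above:
# host directory volumes allowed, privileged containers not allowed.
check_scc_flags() {
  [ "$(scc_field allowHostDirVolumePlugin)" = "true" ] &&
    [ "$(scc_field allowPrivilegedContainer)" = "false" ]
}
```

For example, scc_field 'allowedCapabilities[*]' prints the capabilities that the SCC allows, and check_scc_flags returns a non-zero exit status if either boolean setting differs from the values in the list.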

Viewing the SCC and RBAC definitions for the kubevirt-controller

You can view the SecurityContextConstraints definition for the kubevirt-controller by using the oc tool:

$ oc get scc kubevirt-controller -o yaml

You can view the RBAC definition for the kubevirt-controller clusterrole by using the oc tool:

$ oc get clusterrole kubevirt-controller -o yaml