Virtual machine (VM) workloads run as unprivileged pods. So that VMs can use OKD Virtualization features, some pods are granted custom security policies that are not available to other pod owners:
An extended container_t SELinux policy applies to virt-launcher pods.
Security context constraints (SCCs) are defined for the kubevirt-controller service account.
By default, virtual machine (VM) workloads do not run with root privileges in OKD Virtualization.
For each VM, a virt-launcher pod runs an instance of libvirt in session mode to manage the VM process. In session mode, the libvirt daemon runs as a non-root user account and only permits connections from clients that are running under the same user identifier (UID). Therefore, VMs run as unprivileged pods, adhering to the security principle of least privilege.
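As a rough way to confirm the non-root behavior described above, the following sketch checks a list of UIDs and fails if any of them is root. The helper name all_non_root is hypothetical, and the usage shown in the comments assumes that the virt-launcher pod spec sets runAsUser on its containers, which this document does not state.

```shell
# Hypothetical helper: reads one UID per line on stdin and succeeds only if
# at least one UID was read and none of them is 0 (root).
# Blank lines are skipped; non-numeric values are treated as 0, which errs
# on the side of reporting a failure.
all_non_root() {
  awk 'NF { seen = 1; if ($1 + 0 == 0) root = 1 }
       END { exit (seen && !root) ? 0 : 1 }'
}

# Possible usage (assumption: runAsUser is set in the pod spec; replace
# VIRT_LAUNCHER_POD with the name of a real virt-launcher pod):
#   oc get pod "$VIRT_LAUNCHER_POD" \
#     -o jsonpath='{range .spec.containers[*]}{.securityContext.runAsUser}{"\n"}{end}' \
#     | all_non_root && echo "no container runs as root"
```

If no container declares a UID, the helper returns a failure rather than assuming the pod is unprivileged.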
There are no supported OKD Virtualization features that require root privileges. If a feature requires root, it might not be supported for use with OKD Virtualization.
The container_t SELinux policy for virt-launcher pods is extended to enable essential functions of OKD Virtualization.
The following policy is required for network multi-queue, which enables network performance to scale as the number of available vCPUs increases:
allow process self (tun_socket (relabelfrom relabelto attach_queue))
The following policy allows virt-launcher to read files under the /proc directory, including /proc/cpuinfo and /proc/uptime:
allow process proc_type (file (getattr open read))
The following policy allows libvirtd to relay network-related debug messages:
allow process self (netlink_audit_socket (nlmsg_relay))
Without this policy, any attempt to relay network debug messages is blocked. This might fill the node’s audit logs with SELinux denials.
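To gauge whether such denials are accumulating on a node, a filter along the following lines could count them in audit log text. This is a minimal sketch: the function name count_nlmsg_relay_denials is hypothetical, and it assumes the usual auditd record layout in which a denial contains "avc:  denied" and a tclass= field.

```shell
# Hypothetical helper: counts SELinux AVC denials for the
# netlink_audit_socket class in audit log text read from stdin.
# Assumes standard auditd AVC records ("avc:  denied ... tclass=<class>").
count_nlmsg_relay_denials() {
  # grep -c exits non-zero when the count is 0, so "|| true" keeps the
  # function usable in scripts that run with "set -e".
  grep -E 'avc:[[:space:]]+denied' \
    | grep -c 'tclass=netlink_audit_socket' || true
}

# Possible usage on a node (assumption: the audit log is at the common
# auditd location /var/log/audit/audit.log):
#   count_nlmsg_relay_denials < /var/log/audit/audit.log
```

A count that keeps growing would suggest that the extended policy is not in effect for the affected pods.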
The following policies allow libvirtd to access hugetlbfs, which is required to support huge pages:
allow process hugetlbfs_t (dir (add_name create write remove_name rmdir setattr))
allow process hugetlbfs_t (file (create unlink))
The following policies allow virtiofs to mount filesystems and access NFS:
allow process nfs_t (dir (mounton))
allow process proc_t (dir (mounton))
allow process proc_t (filesystem (mount unmount))
Security context constraints (SCCs) control permissions for pods. These permissions include the actions that a pod (a collection of containers) can perform and the resources that it can access. You can use SCCs to define a set of conditions that a pod must satisfy to be accepted into the system.
The virt-controller is a cluster controller that creates the virt-launcher pods for virtual machines in the cluster. These pods are granted permissions by the kubevirt-controller service account.
The kubevirt-controller service account is granted additional SCCs and Linux capabilities so that it can create virt-launcher pods with the appropriate permissions. These extended permissions allow virtual machines to use OKD Virtualization features that are beyond the scope of typical pods.
The kubevirt-controller service account is granted the following SCCs:
scc.AllowHostDirVolumePlugin = true
This allows virtual machines to use the hostpath volume plugin.
scc.AllowPrivilegedContainer = false
This ensures the virt-launcher pod is not run as a privileged container.
scc.AllowedCapabilities = []corev1.Capability{"SYS_NICE", "NET_BIND_SERVICE", "SYS_PTRACE"}
SYS_NICE allows setting the CPU affinity.
NET_BIND_SERVICE allows DHCP and Slirp operations.
SYS_PTRACE enables certain versions of libvirt to find the process ID (PID) of swtpm, a software Trusted Platform Module (TPM) emulator.
You can view the SecurityContextConstraints definition for the kubevirt-controller by using the oc tool:
$ oc get scc kubevirt-controller -o yaml
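The YAML returned by this command can be compared against the settings listed earlier. The following is a minimal sketch of such a check, not an official tool: the function name check_kubevirt_scc is hypothetical, and it assumes block-style YAML with top-level allowHostDirVolumePlugin, allowPrivilegedContainer, and allowedCapabilities keys. Flow-style lists such as [SYS_NICE] are not handled.

```shell
# Hypothetical helper: reads an SCC definition (block-style YAML) on stdin
# and reports any difference from the settings described in this section.
# Returns 0 if everything matches, 1 otherwise.
check_kubevirt_scc() {
  scc_yaml=$(cat)
  scc_status=0

  # Boolean settings must appear as exact top-level lines.
  for setting in 'allowHostDirVolumePlugin: true' \
                 'allowPrivilegedContainer: false'; do
    if ! printf '%s\n' "$scc_yaml" | grep -x "$setting" >/dev/null; then
      echo "unexpected or missing: $setting"
      scc_status=1
    fi
  done

  # Each capability must be an item of the allowedCapabilities list;
  # the list ends at the next top-level key.
  for cap in SYS_NICE NET_BIND_SERVICE SYS_PTRACE; do
    if ! printf '%s\n' "$scc_yaml" | awk -v cap="$cap" '
        /^allowedCapabilities:/ { in_list = 1; next }
        /^[^ -]/                { in_list = 0 }
        in_list && $1 == "-"    { name = $2; gsub(/"/, "", name)
                                  if (name == cap) found = 1 }
        END                     { exit found ? 0 : 1 }'; then
      echo "capability not listed: $cap"
      scc_status=1
    fi
  done

  return "$scc_status"
}

# Possible usage:
#   oc get scc kubevirt-controller -o yaml | check_kubevirt_scc
```

The check only confirms that the three capabilities are present; it does not flag additional capabilities that a cluster might also allow.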
You can view the RBAC definition for the kubevirt-controller clusterrole by using the oc tool:
$ oc get clusterrole kubevirt-controller -o yaml
Optimizing virtual machine network performance in the Fedora documentation
Configuring huge pages in the Fedora documentation