$ oc edit featuregate cluster
Linux user namespaces allow administrators to isolate the container user and group identifiers (UIDs and GIDs) so that a container can have a different set of permissions in the user namespace than on the host system where it is running. This allows containers to run processes with full privileges inside the user namespace, but the processes can be unprivileged for operations on the host machine.
By default, a container runs in the host system’s root user namespace. Running a container in the host user namespace can be useful when the container needs a feature that is available only in that user namespace. However, it introduces security concerns, such as the possibility of container breakouts, in which a process inside a container breaks out onto the host where the process can access or modify files on the host or in other containers.
Running containers in individual user namespaces can mitigate container breakouts and several other vulnerabilities that a compromised container can pose to other pods and the node itself.
You can configure Linux user namespace use by setting the hostUsers
parameter to false
in the pod spec, as shown in the following procedure.
Support for Linux user namespaces is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process. For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope. |
You enabled the required Technology Preview features for your cluster by editing the FeatureGate
CR named cluster
:
$ oc edit featuregate cluster
FeatureGate
CRapiVersion: config.openshift.io/v1
kind: FeatureGate
metadata:
name: cluster
spec:
featureSet: TechPreviewNoUpgrade (1)
1 | Enables the required UserNamespacesSupport and ProcMountType features. |
Enabling the |
After you save the changes, new machine configs are created, the machine config pools are updated, and scheduling on each node is disabled while the change is being applied.
The crun container runtime is present on the worker nodes. crun is currently the only OCI runtime packaged with OKD that supports user namespaces. crun is active by default.
apiVersion: machineconfiguration.openshift.io/v1
kind: ContainerRuntimeConfig
metadata:
name: enable-crun-worker
spec:
machineConfigPoolSelector:
matchLabels:
pools.operator.machineconfiguration.openshift.io/worker: "" (1)
containerRuntimeConfig:
defaultRuntime: crun (2)
1 | Specifies the machine config pool label. |
2 | Specifies the container runtime to deploy. |
Edit the default user ID (UID) and group ID (GID) range of the OKD namespace where your pod is deployed by running the following command:
$ oc edit ns/<namespace_name>
apiVersion: v1
kind: Namespace
metadata:
annotations:
openshift.io/description: ""
openshift.io/display-name: ""
openshift.io/requester: system:admin
openshift.io/sa.scc.mcs: s0:c27,c24
openshift.io/sa.scc.supplemental-groups: 1000/10000 (1)
openshift.io/sa.scc.uid-range: 1000/10000 (2)
# ...
name: userns
# ...
1 | Edit the default GID to match the value you specified in the pod spec. The range for a Linux user namespace must be lower than 65,535. The default is 1000000000/10000 . |
2 | Edit the default UID to match the value you specified in the pod spec. The range for a Linux user namespace must be lower than 65,535. The default is 1000000000/10000 . |
The range 1000/10000 means 10,000 values starting with ID 1000, so it specifies the range of IDs from 1000 to 10,999. |
Enable the use of Linux user namespaces by creating a pod configured to run with a restricted
profile and with the hostUsers
parameter set to false
.
Create a YAML file similar to the following:
apiVersion: v1
kind: Pod
metadata:
name: userns-pod
# ...
spec:
containers:
- name: userns-container
image: registry.access.redhat.com/ubi9
command: ["sleep", "1000"]
securityContext:
capabilities:
drop: ["ALL"]
allowPrivilegeEscalation: false (1)
runAsNonRoot: true (2)
seccompProfile:
type: RuntimeDefault
runAsUser: 1000 (3)
runAsGroup: 1000 (4)
hostUsers: false (5)
# ...
1 | Specifies that a pod cannot request privilege escalation. This is required for the restricted-v2 security context constraints (SCC). |
2 | Specifies that the container will run with a user with any UID other than 0. |
3 | Specifies the UID the container is run with. |
4 | Specifies which primary GID the containers is run with. |
5 | Requests that the pod is to be run in a user namespace. If true , the pod runs in the host user namespace. If false , the pod runs in a new user namespace that is created for the pod. The default is true . |
Create the pod by running the following command:
$ oc create -f <file_name>.yaml
Check the pod user and group IDs being used in the pod container you created. The pod is inside the Linux user namespace.
Start a shell session with the container in your pod:
$ oc rsh -c <container_name> pod/<pod_name>
$ oc rsh -c userns-container_name pod/userns-pod
Display the user and group IDs being used inside the container:
sh-5.1$ id
uid=1000(1000) gid=1000(1000) groups=1000(1000)
Display the user ID being used in the container user namespace:
sh-5.1$ lsns -t user
NS TYPE NPROCS PID USER COMMAND
4026532447 user 3 1 1000 /usr/bin/coreutils --coreutils-prog-shebang=sleep /usr/bin/sleep 1000 (1)
1 | The UID for the process is 1000 , the same as you set in the pod spec. |
Check the pod user ID being used on the node where the pod was created. The node is outside of the Linux user namespace. This user ID should be different from the UID being used in the container.
Start a debug session for that node:
$ oc debug node/ci-ln-z5vppzb-72292-8zp2b-worker-c-q8sh9
$ oc debug node/ci-ln-z5vppzb-72292-8zp2b-worker-c-q8sh9
Set /host
as the root directory within the debug shell:
sh-5.1# chroot /host
Display the user ID being used in the node user namespace:
sh-5.1# lsns -t user
NS TYPE NPROCS PID USER COMMAND
4026531837 user 233 1 root /usr/lib/systemd/systemd --switched-root --system --deserialize 28
4026532447 user 1 4767 2908816384 /usr/bin/coreutils --coreutils-prog-shebang=sleep /usr/bin/sleep 1000 (1)
1 | The UID for the process is 2908816384 , which is different from what you set in the pod spec. |