×

Linux user namespaces allow administrators to isolate the container user and group identifiers (UIDs and GIDs) so that a container can have a different set of permissions in the user namespace than on the host system where it is running. This allows containers to run processes with full privileges inside the user namespace, but the processes can be unprivileged for operations on the host machine.

By default, a container runs in the host system’s root user namespace. Running a container in the host user namespace can be useful when the container needs a feature that is available only in that user namespace. However, it introduces security concerns, such as the possibility of container breakouts, in which a process inside a container breaks out onto the host where the process can access or modify files on the host or in other containers.

Running containers in individual user namespaces can mitigate container breakouts and several other vulnerabilities that a compromised container can pose to other pods and the node itself.

You can configure Linux user namespace use by setting the hostUsers parameter to false in the pod spec, as shown in the following procedure.

Support for Linux user namespaces is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.

For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.

Configuring Linux user namespace support

Prerequisites
  • You enabled the required Technology Preview features for your cluster by editing the FeatureGate CR named cluster:

    $ oc edit featuregate cluster
    Example FeatureGate CR
    apiVersion: config.openshift.io/v1
    kind: FeatureGate
    metadata:
      name: cluster
    spec:
      featureSet: TechPreviewNoUpgrade (1)
    1 Enables the required UserNamespacesSupport and ProcMountType features.

    Enabling the TechPreviewNoUpgrade feature set on your cluster cannot be undone and prevents minor version updates. This feature set allows you to enable these Technology Preview features on test clusters, where you can fully test them. Do not enable this feature set on production clusters.

    After you save the changes, new machine configs are created, the machine config pools are updated, and scheduling on each node is disabled while the change is being applied.

  • You enabled the crun container runtime on the worker nodes. crun is currently the only released OCI runtime with support for user namespaces.

    apiVersion: machineconfiguration.openshift.io/v1
    kind: ContainerRuntimeConfig
    metadata:
     name: enable-crun-worker
    spec:
     machineConfigPoolSelector:
       matchLabels:
         pools.operator.machineconfiguration.openshift.io/worker: "" (1)
     containerRuntimeConfig:
       defaultRuntime: crun (2)
    1 Specifies the machine config pool label.
    2 Specifies the container runtime to deploy.
Procedure
  1. Edit the default user ID (UID) and group ID (GID) range of the OKD namespace where your pod is deployed by running the following command:

    $ oc edit ns/<namespace_name>
    Example namespace
    apiVersion: v1
    kind: Namespace
    metadata:
      annotations:
        openshift.io/description: ""
        openshift.io/display-name: ""
        openshift.io/requester: system:admin
        openshift.io/sa.scc.mcs: s0:c27,c24
        openshift.io/sa.scc.supplemental-groups: 1000/10000 (1)
        openshift.io/sa.scc.uid-range: 1000/10000 (2)
    # ...
    name: userns
    # ...
    1 Edit the default GID to match the value you specified in the pod spec. The range for a Linux user namespace must be lower than 65,535. The default is 1000000000/10000.
    2 Edit the default UID to match the value you specified in the pod spec. The range for a Linux user namespace must be lower than 65,535. The default is 1000000000/10000.

    The range 1000/10000 means 10,000 values starting with ID 1000, so it specifies the range of IDs from 1000 to 10,999.

  2. Enable the use of Linux user namespaces by creating a pod configured to run with a restricted profile and with the hostUsers parameter set to false.

    1. Create a YAML file similar to the following:

      Example pod specification
      apiVersion: v1
      kind: Pod
      metadata:
        name: userns-pod
      
      # ...
      
      spec:
        containers:
        - name: userns-container
          image: registry.access.redhat.com/ubi9
          command: ["sleep", "1000"]
          securityContext:
            capabilities:
              drop: ["ALL"]
            allowPrivilegeEscalation: false (1)
            runAsNonRoot: true (2)
            seccompProfile:
              type: RuntimeDefault
            runAsUser: 1000 (3)
            runAsGroup: 1000 (4)
        hostUsers: false (5)
      
      # ...
      1 Specifies that a pod cannot request privilege escalation. This is required for the restricted-v2 security context constraints (SCC).
      2 Specifies that the container will run with a user with any UID other than 0.
      3 Specifies the UID the container is run with.
      4 Specifies which primary GID the containers is run with.
      5 Requests that the pod is to be run in a user namespace. If true, the pod runs in the host user namespace. If false, the pod runs in a new user namespace that is created for the pod. The default is true.
    2. Create the pod by running the following command:

      $ oc create -f <file_name>.yaml
Verification
  1. Check the pod user and group IDs being used in the pod container you created. The pod is inside the Linux user namespace.

    1. Start a shell session with the container in your pod:

      $ oc rsh -c <container_name> pod/<pod_name>
      Example command
      $ oc rsh -c userns-container_name pod/userns-pod
    2. Display the user and group IDs being used inside the container:

      sh-5.1$ id
      Example output
      uid=1000(1000) gid=1000(1000) groups=1000(1000)
    3. Display the user ID being used in the container user namespace:

      sh-5.1$ lsns -t user
      Example output
              NS TYPE  NPROCS PID USER COMMAND
      4026532447 user       3   1 1000 /usr/bin/coreutils --coreutils-prog-shebang=sleep /usr/bin/sleep 1000 (1)
      
      1 The UID for the process is 1000, the same as you set in the pod spec.
  2. Check the pod user ID being used on the node where the pod was created. The node is outside of the Linux user namespace. This user ID should be different from the UID being used in the container.

    1. Start a debug session for that node:

      $ oc debug node/ci-ln-z5vppzb-72292-8zp2b-worker-c-q8sh9
      Example command
      $ oc debug node/ci-ln-z5vppzb-72292-8zp2b-worker-c-q8sh9
    2. Set /host as the root directory within the debug shell:

      sh-5.1# chroot /host
    3. Display the user ID being used in the node user namespace:

      sh-5.1#  lsns -t user
      Example command
              NS TYPE  NPROCS   PID USER       COMMAND
      4026531837 user     233     1 root       /usr/lib/systemd/systemd --switched-root --system --deserialize 28
      4026532447 user       1  4767 2908816384 /usr/bin/coreutils --coreutils-prog-shebang=sleep /usr/bin/sleep 1000 (1)
      
      1 The UID for the process is 2908816384, which is different from what you set in the pod spec.