×

About private clusters

By default, OKD is provisioned using publicly-accessible DNS and endpoints. You can set the DNS, Ingress Controller, and API server to private after you deploy your private cluster.

If the cluster has any public subnets, load balancer services created by administrators might be publicly accessible. To ensure cluster security, verify that these services are explicitly annotated as private.

DNS

If you install OKD on installer-provisioned infrastructure, the installation program creates records in a pre-existing public zone and, where possible, creates a private zone for the cluster’s own DNS resolution. In both the public zone and the private zone, the installation program or cluster creates DNS entries for *.apps, for the Ingress object, and api, for the API server.

The *.apps records in the public and private zone are identical, so when you delete the public zone, the private zone seamlessly provides all DNS resolution for the cluster.

Ingress Controller

Because the default Ingress object is created as public, the load balancer is internet-facing and in the public subnets.

The Ingress Operator generates a default certificate for an Ingress Controller to serve as a placeholder until you configure a custom default certificate. Do not use Operator-generated default certificates in production clusters. The Ingress Operator does not rotate its own signing certificate or the default certificates that it generates. Operator-generated default certificates are intended as placeholders for custom default certificates that you configure.

API server

By default, the installation program creates appropriate network load balancers for the API server to use for both internal and external traffic.

On Amazon Web Services (AWS), separate public and private load balancers are created. The load balancers are identical except that an additional port is available on the internal one for use within the cluster. Although the installation program automatically creates or destroys the load balancer based on API server requirements, the cluster does not manage or maintain them. As long as you preserve the cluster’s access to the API server, you can manually modify or move the load balancers. For the public load balancer, port 6443 is open and the health check is configured for HTTPS against the /readyz path.

On Google Cloud Platform, a single load balancer is created to manage both internal and external API traffic, so you do not need to modify the load balancer.

On Microsoft Azure, both public and private load balancers are created. However, because of limitations in current implementation, you just retain both load balancers in a private cluster.

Configuring DNS records to be published in a private zone

For all OKD clusters, whether public or private, DNS records are published in a public zone by default.

You can remove the public zone from the cluster DNS configuration to avoid exposing DNS records to the public. You might want to avoid exposing sensitive information, such as internal domain names, internal IP addresses, or the number of clusters at an organization, or you might simply have no need to publish records publicly. If all the clients that should be able to connect to services within the cluster use a private DNS service that has the DNS records from the private zone, then there is no need to have a public DNS record for the cluster.

After you deploy a cluster, you can modify its DNS to use only a private zone by modifying the DNS custom resource (CR). Modifying the DNS CR in this way means that any DNS records that are subsequently created are not published to public DNS servers, which keeps knowledge of the DNS records isolated to internal users. This can be done when you configure the cluster to be private, or if you never want DNS records to be publicly resolvable.

Alternatively, even in a private cluster, you might keep the public zone for DNS records because it allows clients to resolve DNS names for applications running on that cluster. For example, an organization can have machines that connect to the public internet and then establish VPN connections for certain private IP ranges in order to connect to private IP addresses. The DNS lookups from these machines use the public DNS to determine the private addresses of those services, and then connect to the private addresses over the VPN.

Procedure
  1. Review the DNS CR for your cluster by running the following command and observing the output:

    $ oc get dnses.config.openshift.io/cluster -o yaml
    Example output
    apiVersion: config.openshift.io/v1
    kind: DNS
    metadata:
      creationTimestamp: "2019-10-25T18:27:09Z"
      generation: 2
      name: cluster
      resourceVersion: "37966"
      selfLink: /apis/config.openshift.io/v1/dnses/cluster
      uid: 0e714746-f755-11f9-9cb1-02ff55d8f976
    spec:
      baseDomain: <base_domain>
      privateZone:
        tags:
          Name: <infrastructure_id>-int
          kubernetes.io/cluster/<infrastructure_id>: owned
      publicZone:
        id: Z2XXXXXXXXXXA4
    status: {}

    Note that the spec section contains both a private and a public zone.

  2. Patch the DNS CR to remove the public zone by running the following command:

    $ oc patch dnses.config.openshift.io/cluster --type=merge --patch='{"spec": {"publicZone": null}}'
    Example output
    dns.config.openshift.io/cluster patched

    The Ingress Operator consults the DNS CR definition when it creates DNS records for IngressController objects. If only private zones are specified, only private records are created.

    Existing DNS records are not modified when you remove the public zone. You must manually delete previously published public DNS records if you no longer want them to be published publicly.

Verification
  • Review the DNS CR for your cluster and confirm that the public zone was removed, by running the following command and observing the output:

    $ oc get dnses.config.openshift.io/cluster -o yaml
    Example output
    apiVersion: config.openshift.io/v1
    kind: DNS
    metadata:
      creationTimestamp: "2019-10-25T18:27:09Z"
      generation: 2
      name: cluster
      resourceVersion: "37966"
      selfLink: /apis/config.openshift.io/v1/dnses/cluster
      uid: 0e714746-f755-11f9-9cb1-02ff55d8f976
    spec:
      baseDomain: <base_domain>
      privateZone:
        tags:
          Name: <infrastructure_id>-int
          kubernetes.io/cluster/<infrastructure_id>-wfpg4: owned
    status: {}

Setting the Ingress Controller to private

After you deploy a cluster, you can modify its Ingress Controller to use only a private zone.

Procedure
  1. Modify the default Ingress Controller to use only an internal endpoint:

    $ oc replace --force --wait --filename - <<EOF
    apiVersion: operator.openshift.io/v1
    kind: IngressController
    metadata:
      namespace: openshift-ingress-operator
      name: default
    spec:
      endpointPublishingStrategy:
        type: LoadBalancerService
        loadBalancer:
          scope: Internal
    EOF
    Example output
    ingresscontroller.operator.openshift.io "default" deleted
    ingresscontroller.operator.openshift.io/default replaced

    The public DNS entry is removed, and the private zone entry is updated.

Restricting the API server to private

After you deploy a cluster to Amazon Web Services (AWS) or Microsoft Azure, you can reconfigure the API server to use only the private zone.

Prerequisites
  • Install the OpenShift CLI (oc).

  • Have access to the web console as a user with admin privileges.

Procedure
  1. In the web portal or console for your cloud provider, take the following actions:

    1. Locate and delete the appropriate load balancer component:

      • For AWS, delete the external load balancer. The API DNS entry in the private zone already points to the internal load balancer, which uses an identical configuration, so you do not need to modify the internal load balancer.

      • For Azure, delete the api-internal-v4 rule for the public load balancer.

    2. For Azure, configure the Ingress Controller endpoint publishing scope to Internal. For more information, see "Configuring the Ingress Controller endpoint publishing scope to Internal".

    3. For the Azure public load balancer, if you configure the Ingress Controller endpoint publishing scope to Internal and there are no existing inbound rules in the public load balancer, you must create an outbound rule explicitly to provide outbound traffic for the backend address pool. For more information, see the Microsoft Azure documentation about adding outbound rules.

    4. Delete the api.$clustername.$yourdomain or api.$clustername DNS entry in the public zone.

  2. AWS clusters: Remove the external load balancers:

    You can run the following steps only for an installer-provisioned infrastructure (IPI) cluster. For a user-provisioned infrastructure (UPI) cluster, you must manually remove or disable the external load balancers.

    • If your cluster uses a control plane machine set, delete the lines in the control plane machine set custom resource that configure your public or external load balancer:

      # ...
      providerSpec:
        value:
      # ...
          loadBalancers:
          - name: lk4pj-ext (1)
            type: network (2)
          - name: lk4pj-int
            type: network
      # ...
      1 Delete the name value for the external load balancer, which ends in -ext.
      2 Delete the type value for the external load balancer.
    • If your cluster does not use a control plane machine set, you must delete the external load balancers from each control plane machine.

      1. From your terminal, list the cluster machines by running the following command:

        $ oc get machine -n openshift-machine-api
        Example output
        NAME                            STATE     TYPE        REGION      ZONE         AGE
        lk4pj-master-0                  running   m4.xlarge   us-east-1   us-east-1a   17m
        lk4pj-master-1                  running   m4.xlarge   us-east-1   us-east-1b   17m
        lk4pj-master-2                  running   m4.xlarge   us-east-1   us-east-1a   17m
        lk4pj-worker-us-east-1a-5fzfj   running   m4.xlarge   us-east-1   us-east-1a   15m
        lk4pj-worker-us-east-1a-vbghs   running   m4.xlarge   us-east-1   us-east-1a   15m
        lk4pj-worker-us-east-1b-zgpzg   running   m4.xlarge   us-east-1   us-east-1b   15m

        The control plane machines contain master in the name.

      2. Remove the external load balancer from each control plane machine:

        1. Edit a control plane machine object to by running the following command:

          $ oc edit machines -n openshift-machine-api <control_plane_name> (1)
          1 Specify the name of the control plane machine object to modify.
        2. Remove the lines that describe the external load balancer, which are marked in the following example:

          # ...
          providerSpec:
            value:
          # ...
              loadBalancers:
              - name: lk4pj-ext (1)
                type: network (2)
              - name: lk4pj-int
                type: network
          # ...
          1 Delete the name value for the external load balancer, which ends in -ext.
          2 Delete the type value for the external load balancer.
        3. Save your changes and exit the object specification.

        4. Repeat this process for each of the control plane machines.

Configuring a private storage endpoint on Azure

You can leverage the Image Registry Operator to use private endpoints on Azure, which enables seamless configuration of private storage accounts when OKD is deployed on private Azure clusters. This allows you to deploy the image registry without exposing public-facing storage endpoints.

Do not configure a private storage endpoint on Microsoft Azure Red Hat OpenShift (ARO), because the endpoint can put your Microsoft Azure Red Hat OpenShift cluster in an unrecoverable state.

You can configure the Image Registry Operator to use private storage endpoints on Azure in one of two ways:

  • By configuring the Image Registry Operator to discover the VNet and subnet names

  • With user-provided Azure Virtual Network (VNet) and subnet names

Limitations for configuring a private storage endpoint on Azure

The following limitations apply when configuring a private storage endpoint on Azure:

  • When configuring the Image Registry Operator to use a private storage endpoint, public network access to the storage account is disabled. Consequently, pulling images from the registry outside of OKD only works by setting disableRedirect: true in the registry Operator configuration. With redirect enabled, the registry redirects the client to pull images directly from the storage account, which will no longer work due to disabled public network access. For more information, see "Disabling redirect when using a private storage endpoint on Azure".

  • Due to a change in storage account naming in OKD 4, the Azure File Container Storage Interface (CSI) driver now alphabetically matches storage account of the Image Registry Operator. With this change, there is a known issue where the Azure File CSI driver fails to mount all volumes when the image registry is configured as private. The mount failures occur because the CSI driver tries to use the storage account of the Image Registry Operator, which is not configured to allow connections from worker subnets.

    As a temporary workaround, the image registry should not be configured as private when using the Azure File CSI driver. This is a known issue and will be fixed in a future version of OKD. (OCPBUGS-42308)

  • This operation cannot be undone by the Image Registry Operator.

Configuring a private storage endpoint on Azure by enabling the Image Registry Operator to discover VNet and subnet names

The following procedure shows you how to set up a private storage endpoint on Azure by configuring the Image Registry Operator to discover VNet and subnet names.

Prerequisites
  • You have configured the image registry to run on Azure.

  • Your network has been set up using the Installer Provisioned Infrastructure installation method.

    For users with a custom network setup, see "Configuring a private storage endpoint on Azure with user-provided VNet and subnet names".

Procedure
  1. Edit the Image Registry Operator config object and set networkAccess.type to Internal:

    $ oc edit configs.imageregistry/cluster
    # ...
    spec:
      # ...
       storage:
          azure:
            # ...
            networkAccess:
              type: Internal
    # ...
  2. Optional: Enter the following command to confirm that the Operator has completed provisioning. This might take a few minutes.

    $ oc get configs.imageregistry/cluster -o=jsonpath="{.spec.storage.azure.privateEndpointName}" -w
  3. Optional: If the registry is exposed by a route, and you are configuring your storage account to be private, you must disable redirect if you want pulls external to the cluster to continue to work. Enter the following command to disable redirect on the Image Operator configuration:

    $ oc patch configs.imageregistry cluster --type=merge -p '{"spec":{"disableRedirect": true}}'

    When redirect is enabled, pulling images from outside of the cluster will not work.

Verification
  1. Fetch the registry service name by running the following command:

    $ oc get imagestream -n openshift
    Example output
    NAME   IMAGE REPOSITORY                                                 TAGS     UPDATED
    cli    image-registry.openshift-image-registry.svc:5000/openshift/cli   latest   8 hours ago
    ...
  2. Enter debug mode by running the following command:

    $ oc debug node/<node_name>
  3. Run the suggested chroot command. For example:

    $ chroot /host
  4. Enter the following command to log in to your container registry:

    $ podman login --tls-verify=false -u unused -p $(oc whoami -t) image-registry.openshift-image-registry.svc:5000
    Example output
    Login Succeeded!
  5. Enter the following command to verify that you can pull an image from the registry:

    $ podman pull --tls-verify=false image-registry.openshift-image-registry.svc:5000/openshift/tools
    Example output
    Trying to pull image-registry.openshift-image-registry.svc:5000/openshift/tools/openshift/tools...
    Getting image source signatures
    Copying blob 6b245f040973 done
    Copying config 22667f5368 done
    Writing manifest to image destination
    Storing signatures
    22667f53682a2920948d19c7133ab1c9c3f745805c14125859d20cede07f11f9

Configuring a private storage endpoint on Azure with user-provided VNet and subnet names

Use the following procedure to configure a storage account that has public network access disabled and is exposed behind a private storage endpoint on Azure.

Prerequisites
  • You have configured the image registry to run on Azure.

  • You must know the VNet and subnet names used for your Azure environment.

  • If your network was configured in a separate resource group in Azure, you must also know its name.

Procedure
  1. Edit the Image Registry Operator config object and configure the private endpoint using your VNet and subnet names:

    $ oc edit configs.imageregistry/cluster
    # ...
    spec:
      # ...
       storage:
          azure:
            # ...
            networkAccess:
              type: Internal
              internal:
                subnetName: <subnet_name>
                vnetName: <vnet_name>
                networkResourceGroupName: <network_resource_group_name>
    # ...
  2. Optional: Enter the following command to confirm that the Operator has completed provisioning. This might take a few minutes.

    $ oc get configs.imageregistry/cluster -o=jsonpath="{.spec.storage.azure.privateEndpointName}" -w

    When redirect is enabled, pulling images from outside of the cluster will not work.

Verification
  1. Fetch the registry service name by running the following command:

    $ oc get imagestream -n openshift
    Example output
    NAME   IMAGE REPOSITORY                                                 TAGS     UPDATED
    cli    image-registry.openshift-image-registry.svc:5000/openshift/cli   latest   8 hours ago
    ...
  2. Enter debug mode by running the following command:

    $ oc debug node/<node_name>
  3. Run the suggested chroot command. For example:

    $ chroot /host
  4. Enter the following command to log in to your container registry:

    $ podman login --tls-verify=false -u unused -p $(oc whoami -t) image-registry.openshift-image-registry.svc:5000
    Example output
    Login Succeeded!
  5. Enter the following command to verify that you can pull an image from the registry:

    $ podman pull --tls-verify=false image-registry.openshift-image-registry.svc:5000/openshift/tools
    Example output
    Trying to pull image-registry.openshift-image-registry.svc:5000/openshift/tools/openshift/tools...
    Getting image source signatures
    Copying blob 6b245f040973 done
    Copying config 22667f5368 done
    Writing manifest to image destination
    Storing signatures
    22667f53682a2920948d19c7133ab1c9c3f745805c14125859d20cede07f11f9

Optional: Disabling redirect when using a private storage endpoint on Azure

By default, redirect is enabled when using the image registry. Redirect allows off-loading of traffic from the registry pods into the object storage, which makes pull faster. When redirect is enabled and the storage account is private, users from outside of the cluster are unable to pull images from the registry.

In some cases, users might want to disable redirect so that users from outside of the cluster can pull images from the registry.

Use the following procedure to disable redirect.

Prerequisites
  • You have configured the image registry to run on Azure.

  • You have configured a route.

Procedure
  • Enter the following command to disable redirect on the image registry configuration:

    $ oc patch configs.imageregistry cluster --type=merge -p '{"spec":{"disableRedirect": true}}'
Verification
  1. Fetch the registry service name by running the following command:

    $ oc get imagestream -n openshift
    Example output
    NAME   IMAGE REPOSITORY                                           TAGS     UPDATED
    cli    default-route-openshift-image-registry.<cluster_dns>/cli   latest   8 hours ago
    ...
  2. Enter the following command to log in to your container registry:

    $ podman login --tls-verify=false -u unused -p $(oc whoami -t) default-route-openshift-image-registry.<cluster_dns>
    Example output
    Login Succeeded!
  3. Enter the following command to verify that you can pull an image from the registry:

    $ podman pull --tls-verify=false default-route-openshift-image-registry.<cluster_dns>
    /openshift/tools
    Example output
    Trying to pull default-route-openshift-image-registry.<cluster_dns>/openshift/tools...
    Getting image source signatures
    Copying blob 6b245f040973 done
    Copying config 22667f5368 done
    Writing manifest to image destination
    Storing signatures
    22667f53682a2920948d19c7133ab1c9c3f745805c14125859d20cede07f11f9