You can deploy hosted control planes by configuring a cluster to function as a hosting cluster. This configuration provides an efficient and scalable solution for managing many clusters. The hosting cluster is an OKD cluster that hosts control planes. The hosting cluster is also known as the management cluster.
The management cluster is not the same as the managed cluster. A managed cluster is a cluster that the hub cluster manages.
The multicluster engine Operator supports only the default local-cluster, which is a managed hub cluster, and the hub cluster as the hosting cluster.
To provision hosted control planes on bare-metal infrastructure, you can use the Agent platform. The Agent platform uses the central infrastructure management service to add compute nodes to a hosted cluster. For more information, see "Enabling the central infrastructure management service".
You must start each IBM Power host with a Discovery image that the central infrastructure management provides. After each host starts, it runs an Agent process to discover the details of the host and completes the installation. An Agent custom resource represents each host.
When you create a hosted cluster with the Agent platform, HyperShift installs the Agent Cluster API provider in the hosted control plane namespace.
The multicluster engine for Kubernetes Operator version 2.7 and later installed on an OKD cluster. The multicluster engine Operator is automatically installed when you install Red Hat Advanced Cluster Management (RHACM). You can also install the multicluster engine Operator without RHACM as an Operator from the OKD OperatorHub.
The multicluster engine Operator must have at least one managed OKD cluster. The local-cluster managed hub cluster is automatically imported in the multicluster engine Operator version 2.7 and later. For more information about local-cluster, see Advanced configuration in the RHACM documentation. You can check the status of your hub cluster by running the following command:
$ oc get managedclusters local-cluster
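In the following illustrative output, the URL and age values are environment-specific examples; confirm that the JOINED and AVAILABLE columns report True:
NAME            HUB ACCEPTED   MANAGED CLUSTER URLS           JOINED   AVAILABLE   AGE
local-cluster   true           https://api.example.com:6443   True     True        6h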
You need a hosting cluster with at least 3 compute nodes to run the HyperShift Operator.
You need to enable the central infrastructure management service. For more information, see "Enabling the central infrastructure management service".
You need to install the hosted control planes command-line interface. For more information, see "Installing the hosted control plane command-line interface".
The hosted control planes feature is enabled by default. If you disabled the feature and want to manually enable the feature, see "Manually enabling the hosted control planes feature". If you need to disable the feature, see "Disabling the hosted control planes feature".
The Agent platform does not create any infrastructure, but requires the following resources for infrastructure:
Agents: An Agent represents a host that boots with a Discovery image and that you can provision as an OKD node.
DNS: The API and Ingress endpoints must be routable.
Clients outside the network can access the API server for the hosted cluster. A DNS entry must exist for api.<hosted_cluster_name>.<basedomain> that points to the destination where the API server is reachable.
The DNS entry can be as simple as a record that points to one of the nodes in the management cluster that runs the hosted control plane.
The entry can also point to a deployed load balancer to redirect incoming traffic to the ingress pods.
See the following example of a DNS configuration:
$ cat /var/named/<example.krnl.es.zone>
$TTL 900
@ IN SOA bastion.example.krnl.es.com. hostmaster.example.krnl.es.com. (
2019062002
1D 1H 1W 3H )
IN NS bastion.example.krnl.es.com.
;
;
api IN A 1xx.2x.2xx.1xx (1)
api-int IN A 1xx.2x.2xx.1xx
;
;
*.apps.<hosted_cluster_name>.<basedomain> IN A 1xx.2x.2xx.1xx
;
;EOF
1 | The record refers to the IP address of the API load balancer that handles ingress and egress traffic for hosted control planes. |
For IBM Power, add records that correspond to the IP addresses of your agents:
compute-0 IN A 1xx.2x.2xx.1yy
compute-1 IN A 1xx.2x.2xx.1yy
As a cluster administrator, you can create a hosted cluster with an external API DNS name that differs from the internal endpoint that gets used for node bootstraps and control plane communication. You might want to define a different DNS name for the following reasons:
To replace the user-facing TLS certificate with one from a public CA without breaking the control plane functions that bind to the internal root CA.
To support split-horizon DNS and NAT scenarios.
To ensure a similar experience to standalone control planes, where you can use functions, such as the Show Login Command function, with the correct kubeconfig and DNS configuration.
You can define a DNS name either during your initial setup or during postinstallation operations by entering a domain name in the kubeAPIServerDNSName parameter of a HostedCluster object.
You have a valid TLS certificate that covers the DNS name that you set in the kubeAPIServerDNSName parameter.
You have a resolvable DNS name that points to the correct address.
In the specification for the HostedCluster object, add the kubeAPIServerDNSName parameter and the address for the domain, and specify which certificate to use, as shown in the following example:
#...
spec:
configuration:
apiServer:
servingCerts:
namedCertificates:
- names:
- xxx.example.com
- yyy.example.com
servingCertificate:
name: <my_serving_certificate>
kubeAPIServerDNSName: <custom_address> (1)
1 | The value for the kubeAPIServerDNSName parameter must be a valid and addressable domain. |
After you define the kubeAPIServerDNSName parameter and specify the certificate, the Control Plane Operator controllers create a kubeconfig file named custom-admin-kubeconfig, which is stored in the HostedControlPlane namespace. The certificates are generated from the root CA, and the HostedControlPlane namespace manages their expiration and renewal.
The Control Plane Operator reports a new kubeconfig file named CustomKubeconfig in the HostedControlPlane namespace. That file uses the new server that you defined in the kubeAPIServerDNSName parameter.
A reference to the custom kubeconfig file exists in the status parameter of the HostedCluster object as CustomKubeconfig. The CustomKubeconfig parameter is optional, and you can add the parameter only if the kubeAPIServerDNSName parameter is not empty. After you set the CustomKubeconfig parameter, it triggers the generation of a secret named <hosted_cluster_name>-custom-admin-kubeconfig in the HostedCluster namespace. You can use the secret to access the HostedCluster API server. If you remove the CustomKubeconfig parameter during postinstallation operations, all related secrets and status references are deleted.
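For example, after the secret is generated, you can extract it to a local file and use that file to reach the API server through the custom DNS name. The following sketch assumes that the secret stores the kubeconfig under the kubeconfig data key, in the same way as the admin kubeconfig secret that is extracted later in this document; the local file name is arbitrary:
$ oc get secret <hosted_cluster_name>-custom-admin-kubeconfig \
-n <hosted_cluster_namespace> \
-o jsonpath='{.data.kubeconfig}' | base64 -d > custom-admin-kubeconfig
$ oc --kubeconfig custom-admin-kubeconfig get nodes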
Defining a custom DNS name does not directly impact the data plane, so no rollouts are expected to occur.
If you remove the kubeAPIServerDNSName parameter from the specification for the HostedCluster object, all newly generated secrets and the CustomKubeconfig reference are removed from the cluster and from the status parameter.
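For example, you can remove the parameter by using a JSON patch. The following sketch assumes that the parameter is set at the .spec.kubeAPIServerDNSName path of the HostedCluster object:
$ oc patch hostedcluster <hosted_cluster_name> -n <hosted_cluster_namespace> \
--type=json -p '[{"op":"remove","path":"/spec/kubeAPIServerDNSName"}]'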
On bare-metal infrastructure, you can create or import a hosted cluster. After you enable the Assisted Installer as an add-on to multicluster engine Operator and you create a hosted cluster with the Agent platform, the HyperShift Operator installs the Agent Cluster API provider in the hosted control plane namespace. The Agent Cluster API provider connects a management cluster that hosts the control plane and a hosted cluster that consists of only the compute nodes.
Each hosted cluster must have a cluster-wide unique name. A hosted cluster name cannot be the same as any existing managed cluster. Otherwise, the multicluster engine Operator cannot manage the hosted cluster.
Do not use the word clusters as a hosted cluster name.
You cannot create a hosted cluster in the namespace of a multicluster engine Operator managed cluster.
For best security and management practices, create a hosted cluster separate from other hosted clusters.
Verify that you have a default storage class configured for your cluster. Otherwise, you might see pending persistent volume claims (PVCs).
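For example, you can check for a default storage class by listing the storage classes and looking for the (default) marker. The following output is illustrative; the storage class name and provisioner depend on your environment:
$ oc get storageclass
NAME                         PROVISIONER   RECLAIMPOLICY   VOLUMEBINDINGMODE      AGE
lvm-storageclass (default)   topolvm.io    Delete          WaitForFirstConsumer   10d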
By default, when you use the hcp create cluster agent command, the command creates a hosted cluster with configured node ports. The preferred publishing strategy for hosted clusters on bare metal is to expose services through a load balancer. If you create a hosted cluster by using the web console or by using Red Hat Advanced Cluster Management, to set a publishing strategy for a service other than the Kubernetes API server, you must manually specify the servicePublishingStrategy information in the HostedCluster custom resource.
Ensure that you meet the requirements described in "Requirements for hosted control planes on bare metal", which includes requirements related to infrastructure, firewalls, ports, and services. For example, those requirements describe how to add the appropriate zone labels to the bare-metal hosts in your management cluster, as shown in the following example commands:
$ oc label node [compute-node-1] topology.kubernetes.io/zone=zone1
$ oc label node [compute-node-2] topology.kubernetes.io/zone=zone2
$ oc label node [compute-node-3] topology.kubernetes.io/zone=zone3
Ensure that you have added bare-metal nodes to a hardware inventory.
Create a namespace by entering the following command:
$ oc create ns <hosted_cluster_namespace>
Replace <hosted_cluster_namespace> with an identifier for your hosted cluster namespace. Typically, the HyperShift Operator creates the namespace, but during the hosted cluster creation process on bare-metal infrastructure, a generated Cluster API provider role requires that the namespace already exists.
Create the configuration file for your hosted cluster by entering the following command:
$ hcp create cluster agent \
--name=<hosted_cluster_name> \(1)
--pull-secret=<path_to_pull_secret> \(2)
--agent-namespace=<hosted_control_plane_namespace> \(3)
--base-domain=<base_domain> \(4)
--api-server-address=api.<hosted_cluster_name>.<base_domain> \(5)
--etcd-storage-class=<etcd_storage_class> \(6)
--ssh-key=<path_to_ssh_key> \(7)
--namespace=<hosted_cluster_namespace> \(8)
--control-plane-availability-policy=HighlyAvailable \(9)
--release-image=quay.io/openshift-release-dev/ocp-release:<ocp_release_image>-multi \(10)
--node-pool-replicas=<node_pool_replica_count> \(11)
--render \
--render-sensitive \
--ssh-key <home_directory>/<path_to_ssh_key>/<ssh_key> > hosted-cluster-config.yaml (12)
1 | Specify the name of your hosted cluster, such as example . |
2 | Specify the path to your pull secret, such as /user/name/pullsecret . |
3 | Specify your hosted control plane namespace, such as clusters-example . Ensure that agents are available in this namespace by using the oc get agent -n <hosted_control_plane_namespace> command. |
4 | Specify your base domain, such as krnl.es . |
5 | The --api-server-address flag defines the IP address that gets used for the Kubernetes API communication in the hosted cluster. If you do not set the --api-server-address flag, you must log in to connect to the management cluster. |
6 | Specify the etcd storage class name, such as lvm-storageclass . |
7 | Specify the path to your SSH public key. The default file path is ~/.ssh/id_rsa.pub . |
8 | Specify your hosted cluster namespace. |
9 | Specify the availability policy for the hosted control plane components. Supported options are SingleReplica and HighlyAvailable . The default value is HighlyAvailable . |
10 | Specify the supported OKD version that you want to use, such as 4.19.0-multi . If you are using a disconnected environment, replace <ocp_release_image> with the digest image. To extract the OKD release image digest, see Extracting the OKD release image digest. |
11 | Specify the node pool replica count, such as 3 . You must specify the replica count as 0 or greater to create that number of replicas. Otherwise, no node pools are created. |
12 | After the --ssh-key flag, specify the path to the SSH key, such as user/.ssh/id_rsa . |
Configure the service publishing strategy. By default, hosted clusters use the NodePort service publishing strategy because node ports are always available without additional infrastructure. However, you can configure the service publishing strategy to use a load balancer.
If you are using the default NodePort strategy, configure the DNS to point to the hosted cluster compute nodes, not the management cluster nodes. For more information, see "DNS configurations on bare metal".
For production environments, use the LoadBalancer strategy because it provides certificate handling and automatic DNS resolution. The following example demonstrates changing the service publishing strategy to LoadBalancer in your hosted cluster configuration file:
# ...
spec:
services:
- service: APIServer
servicePublishingStrategy:
type: LoadBalancer (1)
- service: Ignition
servicePublishingStrategy:
type: Route
- service: Konnectivity
servicePublishingStrategy:
type: Route
- service: OAuthServer
servicePublishingStrategy:
type: Route
- service: OIDC
servicePublishingStrategy:
type: Route
sshKey:
name: <ssh_key>
# ...
1 | Specify LoadBalancer as the API Server type. For all other services, specify Route as the type. |
Apply the changes to the hosted cluster configuration file by entering the following command:
$ oc apply -f hosted-cluster-config.yaml
Check for the creation of the hosted cluster, node pools, and pods by entering the following commands:
$ oc get hostedcluster \
<hosted_cluster_name> \
-n <hosted_cluster_namespace> \
-o jsonpath='{.status.conditions[?(@.status=="False")]}' | jq .
$ oc get nodepool \
<nodepool_name> \
-n <hosted_cluster_namespace> \
-o jsonpath='{.status.conditions[?(@.status=="False")]}' | jq .
$ oc get pods -n <hosted_cluster_namespace>
Confirm that the hosted cluster is ready. A status of Available: True indicates the readiness of the cluster, and the node pool status shows AllMachinesReady: True. These statuses indicate that all cluster Operators are healthy.
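As an additional check, you can list the hosted cluster and confirm that the AVAILABLE column reports True. The following output is illustrative, and the exact columns can vary by version:
$ oc get hostedcluster <hosted_cluster_name> -n <hosted_cluster_namespace>
NAME      VERSION   KUBECONFIG                 PROGRESS    AVAILABLE   PROGRESSING   MESSAGE
example   4.19.0    example-admin-kubeconfig   Completed   True        False         The hosted control plane is available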
Install MetalLB in the hosted cluster:
Extract the kubeconfig file from the hosted cluster and set the environment variable for hosted cluster access by entering the following commands:
$ oc get secret \
<hosted_cluster_name>-admin-kubeconfig \
-n <hosted_cluster_namespace> \
-o jsonpath='{.data.kubeconfig}' \
| base64 -d > \
kubeconfig-<hosted_cluster_name>.yaml
$ export KUBECONFIG="/path/to/kubeconfig-<hosted_cluster_name>.yaml"
Install the MetalLB Operator by creating the install-metallb-operator.yaml file:
apiVersion: v1
kind: Namespace
metadata:
name: metallb-system
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
name: metallb-operator
namespace: metallb-system
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
name: metallb-operator
namespace: metallb-system
spec:
channel: "stable"
name: metallb-operator
source: redhat-operators
sourceNamespace: openshift-marketplace
installPlanApproval: Automatic
# ...
Apply the file by entering the following command:
$ oc apply -f install-metallb-operator.yaml
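The MetalLB Operator does not deploy the controller and speaker components until a MetalLB resource exists in the metallb-system namespace. If your environment does not create one automatically, you can apply a minimal instance similar to the following sketch; the create-metallb-instance.yaml file name is arbitrary:
apiVersion: metallb.io/v1beta1
kind: MetalLB
metadata:
  name: metallb
  namespace: metallb-system
Apply the file by entering the following command:
$ oc apply -f create-metallb-instance.yaml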
Configure the MetalLB IP address pool by creating the deploy-metallb-ipaddresspool.yaml file:
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
name: metallb
namespace: metallb-system
spec:
autoAssign: true
addresses:
- 10.11.176.71-10.11.176.75
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
name: l2advertisement
namespace: metallb-system
spec:
ipAddressPools:
- metallb
# ...
Apply the configuration by entering the following command:
$ oc apply -f deploy-metallb-ipaddresspool.yaml
Verify the installation of MetalLB by checking the Operator status, the IP address pool, and the L2Advertisement resource. Enter the following commands:
$ oc get pods -n metallb-system
$ oc get ipaddresspool -n metallb-system
$ oc get l2advertisement -n metallb-system
Configure the load balancer for ingress:
Create the ingress-loadbalancer.yaml file:
apiVersion: v1
kind: Service
metadata:
annotations:
metallb.universe.tf/address-pool: metallb
name: metallb-ingress
namespace: openshift-ingress
spec:
ports:
- name: http
protocol: TCP
port: 80
targetPort: 80
- name: https
protocol: TCP
port: 443
targetPort: 443
selector:
ingresscontroller.operator.openshift.io/deployment-ingresscontroller: default
type: LoadBalancer
# ...
Apply the configuration by entering the following command:
$ oc apply -f ingress-loadbalancer.yaml
Verify that the load balancer service works as expected by entering the following command:
$ oc get svc metallb-ingress -n openshift-ingress
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
metallb-ingress LoadBalancer 172.31.127.129 10.11.176.71 80:30961/TCP,443:32090/TCP 16h
Configure the DNS to work with the load balancer:
Configure the DNS for the apps domain by pointing the *.apps.<hosted_cluster_name>.<base_domain> wildcard DNS record to the load balancer IP address.
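For example, in a BIND zone file like the earlier DNS configuration example, the wildcard record might look like the following line, assuming the load balancer external IP address from the previous output:
*.apps.<hosted_cluster_name>.<base_domain> IN A 10.11.176.71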
Verify the DNS resolution by entering the following command:
$ nslookup console-openshift-console.apps.<hosted_cluster_name>.<base_domain> <load_balancer_ip_address>
Server: 10.11.176.1
Address: 10.11.176.1#53
Name: console-openshift-console.apps.my-hosted-cluster.sample-base-domain.com
Address: 10.11.176.71
Check the cluster Operators by entering the following command:
$ oc get clusteroperators
Ensure that all Operators show AVAILABLE: True, PROGRESSING: False, and DEGRADED: False.
Check the nodes by entering the following command:
$ oc get nodes
Ensure that each node has the READY status.
Test access to the console by entering the following URL in a web browser:
https://console-openshift-console.apps.<hosted_cluster_name>.<base_domain>
A node pool is a group of nodes within a cluster that share the same configuration. Heterogeneous node pools have different configurations, so that you can create pools and optimize them for various workloads.
You can create heterogeneous node pools on the Agent platform. The platform enables clusters to run diverse machine types, such as x86_64 or ppc64le, within a single hosted cluster.
Creating a heterogeneous node pool requires completion of the following general steps:
Create an AgentServiceConfig custom resource (CR) that informs the Operator how much storage it needs for components such as the database and the filesystem. The CR also defines which OKD versions to support.
Create an agent cluster.
Create the heterogeneous node pool.
Configure DNS for hosted control planes
Create an InfraEnv custom resource (CR) for each architecture.
Add agents to the heterogeneous cluster.
To create heterogeneous node pools on an Agent hosted cluster, you need to create the AgentServiceConfig CR with two heterogeneous architecture operating system (OS) images.
Run the following command:
$ envsubst <<"EOF" | oc apply -f -
apiVersion: agent-install.openshift.io/v1beta1
kind: AgentServiceConfig
metadata:
name: agent
spec:
databaseStorage:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: <db_volume_name> (1)
filesystemStorage:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: <fs_volume_name> (2)
osImages:
- openshiftVersion: <ocp_version> (3)
version: <ocp_release_version_x86> (4)
url: <iso_url_x86> (5)
rootFSUrl: <root_fs_url_x86> (6)
cpuArchitecture: <arch_x86> (7)
- openshiftVersion: <ocp_version> (8)
version: <ocp_release_version_ppc64le> (9)
url: <iso_url_ppc64le> (10)
rootFSUrl: <root_fs_url_ppc64le> (11)
cpuArchitecture: <arch_ppc64le> (12)
EOF
1 | Specify the database volume name for the multicluster engine Operator AgentServiceConfig CR. |
2 | Specify the filesystem volume name for the multicluster engine Operator AgentServiceConfig CR. |
3 | Specify the current version of OKD. |
4 | Specify the current OKD release version for x86. |
5 | Specify the ISO URL for x86. |
6 | Specify the root filesystem URL for x86. |
7 | Specify the CPU architecture for x86. |
8 | Specify the current OKD version. |
9 | Specify the OKD release version for ppc64le . |
10 | Specify the ISO URL for ppc64le . |
11 | Specify the root filesystem URL for ppc64le . |
12 | Specify the CPU architecture for ppc64le . |
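To confirm that the central infrastructure management service accepted the configuration, you can inspect the resource and the assisted-service pods. The resource name agent matches the metadata in the previous example; the multicluster-engine namespace is an assumption and depends on how you enabled the central infrastructure management service:
$ oc get agentserviceconfig agent -o yaml
$ oc get pods -n multicluster-engine | grep assisted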
The agent-based approach manages and provisions an agent cluster. An agent cluster can use heterogeneous node pools, allowing the use of different types of compute nodes within the same cluster.
You used a multi-architecture release image to enable support for heterogeneous node pools when creating a hosted cluster. Find the latest multi-architecture images on the Multi-arch release images page.
Create an environment variable for the cluster namespace by running the following command:
$ export CLUSTERS_NAMESPACE=<hosted_cluster_namespace>
Create an environment variable for the machine classless inter-domain routing (CIDR) notation by running the following command:
$ export MACHINE_CIDR=192.168.122.0/24
Create the hosted control plane namespace by running the following command:
$ oc create ns <hosted_control_plane_namespace>
Create the cluster by running the following command:
$ hcp create cluster agent \
--name=<hosted_cluster_name> \(1)
--pull-secret=<pull_secret_file> \(2)
--agent-namespace=<hosted_control_plane_namespace> \(3)
--base-domain=<basedomain> \(4)
--api-server-address=api.<hosted_cluster_name>.<basedomain> \
--release-image=quay.io/openshift-release-dev/ocp-release:<ocp_release> (5)
1 | Specify the hosted cluster name. |
2 | Specify the pull secret file path. |
3 | Specify the namespace for the hosted control plane. |
4 | Specify the base domain for the hosted cluster. |
5 | Specify the current OKD release version. |
You create heterogeneous node pools by using the NodePool custom resource (CR), so that you can optimize costs and performance by associating different workloads with specific hardware.
To define a NodePool CR, create a YAML file similar to the following example:
$ envsubst <<"EOF" | oc apply -f -
apiVersion: hypershift.openshift.io/v1beta1
kind: NodePool
metadata:
name: <hosted_cluster_name>
namespace: <clusters_namespace>
spec:
arch: <arch_ppc64le>
clusterName: <hosted_cluster_name>
management:
autoRepair: false
upgradeType: InPlace
nodeDrainTimeout: 0s
nodeVolumeDetachTimeout: 0s
platform:
agent:
agentLabelSelector:
matchLabels:
inventory.agent-install.openshift.io/cpu-architecture: <arch_ppc64le> (1)
type: Agent
release:
image: quay.io/openshift-release-dev/ocp-release:<ocp_release>
replicas: 0
EOF
1 | The selector block selects the agents that match the specified label. To create a node pool of the ppc64le architecture with zero replicas, specify ppc64le. This ensures that the selector block selects only ppc64le agents during a scaling operation. |
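A heterogeneous setup also requires a node pool for the other architecture. The following sketch mirrors the previous example for x86_64 agents. The -<arch_x86> name suffix is an assumption, so adjust the name to match your naming convention, and note that the NodePool arch field typically expects the value amd64 for x86_64 hosts:
$ envsubst <<"EOF" | oc apply -f -
apiVersion: hypershift.openshift.io/v1beta1
kind: NodePool
metadata:
  name: <hosted_cluster_name>-<arch_x86>
  namespace: <clusters_namespace>
spec:
  arch: amd64 # NodePool architecture value for x86_64 hosts
  clusterName: <hosted_cluster_name>
  management:
    autoRepair: false
    upgradeType: InPlace
  nodeDrainTimeout: 0s
  nodeVolumeDetachTimeout: 0s
  platform:
    agent:
      agentLabelSelector:
        matchLabels:
          # Select only agents that report the x86_64 CPU architecture
          inventory.agent-install.openshift.io/cpu-architecture: <arch_x86>
    type: Agent
  release:
    image: quay.io/openshift-release-dev/ocp-release:<ocp_release>
  replicas: 0
EOF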
A Domain Name System (DNS) configuration for hosted control planes means that external clients can reach the ingress controllers, so that the clients can route traffic to internal components. Configuring this setting ensures that traffic gets routed to either a ppc64le or an x86_64 compute node.
You can point an *.apps.<cluster_name> record to either of the compute nodes that host the ingress application. Or, if you set up a load balancer on top of the compute nodes, point the record to this load balancer. When you create a heterogeneous node pool, make sure that the compute nodes can reach each other or keep them in the same network.
For heterogeneous node pools, you must create an InfraEnv custom resource (CR) for each architecture. This configuration ensures that the correct architecture-specific operating system and boot artifacts get used during the node provisioning process. For example, for node pools with the x86_64 and ppc64le architectures, create an InfraEnv CR for x86_64 and another for ppc64le.
Before starting the procedure, ensure that you added the operating system images for both architectures to the AgentServiceConfig CR.
Create the InfraEnv resource with the x86_64 architecture for heterogeneous node pools by running the following command:
$ envsubst <<"EOF" | oc apply -f -
apiVersion: agent-install.openshift.io/v1beta1
kind: InfraEnv
metadata:
name: <hosted_cluster_name>-<arch_x86> (1) (2)
namespace: <hosted_control_plane_namespace> (3)
spec:
cpuArchitecture: <arch_x86>
pullSecretRef:
name: pull-secret
sshAuthorizedKey: <ssh_pub_key> (4)
EOF
1 | The hosted cluster name. |
2 | The x86_64 architecture. |
3 | The hosted control plane namespace. |
4 | The SSH public key. |
Create the InfraEnv resource with the ppc64le architecture for heterogeneous node pools by running the following command:
$ envsubst <<"EOF" | oc apply -f -
apiVersion: agent-install.openshift.io/v1beta1
kind: InfraEnv
metadata:
name: <hosted_cluster_name>-<arch_ppc64le> (1) (2)
namespace: <hosted_control_plane_namespace> (3)
spec:
cpuArchitecture: <arch_ppc64le>
pullSecretRef:
name: pull-secret
sshAuthorizedKey: <ssh_pub_key> (4)
EOF
1 | The hosted cluster name. |
2 | The ppc64le architecture. |
3 | The hosted control plane namespace. |
4 | The SSH public key. |
Verify the successful creation of the InfraEnv resources by running the following commands:
Verify the successful creation of the x86_64 InfraEnv resource:
$ oc describe InfraEnv <hosted_cluster_name>-<arch_x86>
Verify the successful creation of the ppc64le InfraEnv resource:
$ oc describe InfraEnv <hosted_cluster_name>-<arch_ppc64le>
Generate a live ISO that allows a virtual machine or a bare-metal machine to join as an agent. The live ISO is generated automatically for each InfraEnv resource; retrieve its download URL by running the following commands:
Retrieve the live ISO download URL for x86_64:
$ oc -n <hosted_control_plane_namespace> get InfraEnv <hosted_cluster_name>-<arch_x86> -ojsonpath="{.status.isoDownloadURL}"
Retrieve the live ISO download URL for ppc64le:
$ oc -n <hosted_control_plane_namespace> get InfraEnv <hosted_cluster_name>-<arch_ppc64le> -ojsonpath="{.status.isoDownloadURL}"
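For example, you can download the ISO from the returned URL so that you can attach it to a virtual machine or write it to boot media for a bare-metal host. The <iso_download_url> placeholder and the output file name are illustrative:
$ curl -L -o discovery.iso "<iso_download_url>"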
You add agents by manually configuring the machine to boot with the live ISO. You can download the live ISO and use it to boot a bare-metal node or a virtual machine. On boot, the node communicates with the assisted-service and registers as an agent in the same namespace as the InfraEnv resource. After each agent is created, you can optionally set its installation_disk_id and hostname parameters in the specification. You can then approve the agent to indicate that the agent is ready for use.
Obtain a list of agents by running the following command:
$ oc -n <hosted_control_plane_namespace> get agents
NAME                                   CLUSTER   APPROVED   ROLE          STAGE
86f7ac75-4fc4-4b36-8130-40fa12602218                        auto-assign
e57a637f-745b-496e-971d-1abbf03341ba                        auto-assign
Patch an agent by running the following command:
$ oc -n <hosted_control_plane_namespace> patch agent 86f7ac75-4fc4-4b36-8130-40fa12602218 -p '{"spec":{"installation_disk_id":"/dev/sda","approved":true,"hostname":"worker-0.example.krnl.es"}}' --type merge
Patch the second agent by running the following command:
$ oc -n <hosted_control_plane_namespace> patch agent e57a637f-745b-496e-971d-1abbf03341ba -p '{"spec":{"installation_disk_id":"/dev/sda","approved":true,"hostname":"worker-1.example.krnl.es"}}' --type merge
Check the agent approval status by running the following command:
$ oc -n <hosted_control_plane_namespace> get agents
NAME                                   CLUSTER   APPROVED   ROLE          STAGE
86f7ac75-4fc4-4b36-8130-40fa12602218             true       auto-assign
e57a637f-745b-496e-971d-1abbf03341ba             true       auto-assign
After you approve your agents, you can scale the node pools. The agentLabelSelector value that you configured in the node pool ensures that only matching agents are added to the cluster. This also helps you scale down the node pool: to remove nodes of a specific architecture from the cluster, scale down the corresponding node pool.
Scale the node pool by running the following command:
$ oc -n <clusters_namespace> scale nodepool <nodepool_name> --replicas 2
The Cluster API agent provider picks two agents randomly to assign to the hosted cluster. These agents pass through different states and then join the hosted cluster as OKD nodes. The agent states include binding, discovering, insufficient, installing, installing-in-progress, and added-to-existing-cluster.
List the agents by running the following command:
$ oc -n <hosted_control_plane_namespace> get agent
NAME                                   CLUSTER         APPROVED   ROLE          STAGE
4dac1ab2-7dd5-4894-a220-6a3473b67ee6   hypercluster1   true       auto-assign
d9198891-39f4-4930-a679-65fb142b108b                   true       auto-assign
da503cf1-a347-44f2-875c-4960ddb04091   hypercluster1   true       auto-assign
Check the status of a specific scaled agent by running the following command:
$ oc -n <hosted_control_plane_namespace> get agent -o jsonpath='{range .items[*]}BMH: {@.metadata.labels.agent-install\.openshift\.io/bmh} Agent: {@.metadata.name} State: {@.status.debugInfo.state}{"\n"}{end}'
BMH: ocp-worker-2   Agent: 4dac1ab2-7dd5-4894-a220-6a3473b67ee6   State: binding
BMH: ocp-worker-0   Agent: d9198891-39f4-4930-a679-65fb142b108b   State: known-unbound
BMH: ocp-worker-1   Agent: da503cf1-a347-44f2-875c-4960ddb04091   State: insufficient
After the agents reach the added-to-existing-cluster state, verify that the OKD nodes are ready by running the following command:
$ oc --kubeconfig <hosted_cluster_name>.kubeconfig get nodes
NAME           STATUS   ROLES    AGE     VERSION
ocp-worker-1   Ready    worker   5m41s   v1.24.0+3882f8f
ocp-worker-2   Ready    worker   6m3s    v1.24.0+3882f8f
Some cluster Operators reconcile by adding workloads to the new nodes. The following command displays the two machines that were created after you scaled up the node pool:
$ oc -n <hosted_control_plane_namespace> get machines
NAME                            CLUSTER               NODENAME       PROVIDERID                                     PHASE     AGE   VERSION
hypercluster1-c96b6f675-m5vch   hypercluster1-b2qhl   ocp-worker-1   agent://da503cf1-a347-44f2-875c-4960ddb04091   Running   15m   4.11.5
hypercluster1-c96b6f675-tl42p   hypercluster1-b2qhl   ocp-worker-2   agent://4dac1ab2-7dd5-4894-a220-6a3473b67ee6   Running   15m   4.11.5