Creating a compute machine set on IBM Cloud

You can create compute machine sets in your OKD cluster on IBM Cloud® to perform specific tasks. For example, you might create infrastructure machine sets and related machines so that you can move supporting workloads to the new machines. Moving supporting workloads to dedicated machines helps ensure that your cluster resources are allocated efficiently.

You can use the advanced machine management and scaling capabilities only in clusters where the Machine API is operational. Clusters with user-provisioned infrastructure require additional validation and configuration to use the Machine API.

Clusters with the infrastructure platform type none cannot use the Machine API. This limitation applies even if the compute machines that are attached to the cluster are installed on a platform that supports the feature. This parameter cannot be changed after installation.

To view the platform type for your cluster, run the following command:

$ oc get infrastructure cluster -o jsonpath='{.status.platform}'
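
As a rough illustration, the check described above could be scripted as follows. This is a minimal sketch, not part of the product: the helper names are hypothetical, and it assumes that the command prints `None` for clusters with the platform type none. A result of `False` only means that the platform type does not rule out the Machine API; clusters with user-provisioned infrastructure still need the additional validation mentioned earlier.

```python
import subprocess


def platform_rules_out_machine_api(platform_type: str) -> bool:
    """Return True when the platform type is `none`, which cannot use the Machine API."""
    # Assumption: the command prints "None"; compare case-insensitively to be safe.
    return platform_type.strip().lower() == "none"


def get_platform_type() -> str:
    """Run the documented oc command and return its output (needs a logged-in oc)."""
    result = subprocess.run(
        ["oc", "get", "infrastructure", "cluster",
         "-o", "jsonpath={.status.platform}"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


# Example with literal values, so that no cluster access is needed:
print(platform_rules_out_machine_api("None"))      # True
print(platform_rules_out_machine_api("IBMCloud"))  # False
```

On a workstation that is logged in to the cluster, `platform_rules_out_machine_api(get_platform_type())` would apply the same check to the live value.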

Sample YAML for a compute machine set custom resource on IBM Cloud

You can use the sample YAML file to automate the provisioning of compute or infrastructure nodes within a specific Virtual Private Cloud (VPC). The sample YAML defines a compute machine set that runs in a specified IBM Cloud® zone in a region and creates nodes that are labeled with node-role.kubernetes.io/<role>: "".

In the sample, <infrastructure_id> is the infrastructure ID label that is based on the cluster ID that you set when you provisioned the cluster, and <role> is the node label to add.

apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  labels:
    machine.openshift.io/cluster-api-cluster: <infrastructure_id>
    machine.openshift.io/cluster-api-machine-role: <role>
    machine.openshift.io/cluster-api-machine-type: <role>
  name: <infrastructure_id>-<role>-<region>
  namespace: openshift-machine-api
spec:
  replicas: 1
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-cluster: <infrastructure_id>
      machine.openshift.io/cluster-api-machineset: <infrastructure_id>-<role>-<region>
  template:
    metadata:
      labels:
        machine.openshift.io/cluster-api-cluster: <infrastructure_id>
        machine.openshift.io/cluster-api-machine-role: <role>
        machine.openshift.io/cluster-api-machine-type: <role>
        machine.openshift.io/cluster-api-machineset: <infrastructure_id>-<role>-<region>
    spec:
      metadata:
        labels:
          node-role.kubernetes.io/<role>: ""
      providerSpec:
        value:
          apiVersion: ibmcloudproviderconfig.openshift.io/v1beta1
          credentialsSecret:
            name: ibmcloud-credentials
          image: <infrastructure_id>-rhcos
          kind: IBMCloudMachineProviderSpec
          primaryNetworkInterface:
            securityGroups:
            - <infrastructure_id>-sg-cluster-wide
            - <infrastructure_id>-sg-openshift-net
            subnet: <infrastructure_id>-subnet-compute-<zone>
          profile: <instance_profile>
          region: <region>
          resourceGroup: <resource_group>
          userDataSecret:
            name: <role>-user-data
          vpc: <vpc_name>
          zone: <zone>

where:

<infrastructure_id>

Specifies the infrastructure ID that is based on the cluster ID that you set when you provisioned the cluster. If you have the OpenShift CLI installed, you can obtain the infrastructure ID by running the following command:

$ oc get -o jsonpath='{.status.infrastructureName}{"\n"}' infrastructure cluster

<role>

Specifies the node label to add.

<infrastructure_id>-<role>-<region>

Specifies the infrastructure ID, node label, and region.

<infrastructure_id>-rhcos

Specifies the custom Fedora CoreOS (FCOS) image to use as a boot image for your nodes. Use the latest image when you add a new machine set.

<infrastructure_id>-subnet-compute-<zone>

Specifies the infrastructure ID and zone within your region to place machines on. Be sure that your region supports the zone that you specify.

<instance_profile>

Specifies the IBM Cloud® instance profile.

<region>

Specifies the region to place machines on.

<resource_group>

Specifies the resource group that machine resources are placed in. This is either an existing resource group specified at installation time, or an installer-created resource group named based on the infrastructure ID.

<vpc_name>

Specifies the VPC name.

<zone>

Specifies the zone within your region to place machines on. Be sure that your region supports the zone that you specify.
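
To show how the placeholders in the sample fit together, the following sketch fills them in from a dictionary and refuses to continue if any placeholder has no value. The `render_sample` helper is hypothetical, the template is a shortened excerpt of the sample rather than the full file, and every value shown is a made-up example rather than a default.

```python
import re
from typing import Mapping

# Matches placeholders such as <infrastructure_id> or <zone>.
PLACEHOLDER = re.compile(r"<([a-z_]+)>")


def render_sample(template: str, values: Mapping[str, str]) -> str:
    """Replace every <name> placeholder in the template with values[name]."""
    missing = sorted(
        {name for name in PLACEHOLDER.findall(template) if name not in values}
    )
    if missing:
        raise KeyError("no value supplied for: " + ", ".join(missing))
    return PLACEHOLDER.sub(lambda match: values[match.group(1)], template)


# Shortened excerpt of the sample; a real run would use the whole file.
excerpt = """\
metadata:
  name: <infrastructure_id>-<role>-<region>
spec:
  template:
    spec:
      providerSpec:
        value:
          image: <infrastructure_id>-rhcos
          zone: <zone>
"""

# Made-up example values, not defaults.
values = {
    "infrastructure_id": "mycluster-x7k2p",
    "role": "infra",
    "region": "us-east",
    "zone": "us-east-1",
}

print(render_sample(excerpt, values))
```

Failing on a missing value, rather than leaving the placeholder in place, keeps a half-filled file from being submitted to the cluster.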

Additional resources

Manually updating the boot image

Creating a compute machine set

To dynamically manage machine compute resources, you can create your own compute machine sets in addition to the compute machine sets created by the installation program. Use the OKD CLI to automate node provisioning.

Prerequisites

Deploy an OKD cluster.
Install the OpenShift CLI (oc).
Log in to oc as a user with cluster-admin permission.

Procedure

Create a new YAML file that contains the compute machine set custom resource (CR) sample and is named <file_name>.yaml.

Ensure that you set the <infrastructure_id> and <role> parameter values.

Optional: If you are not sure which value to set for a specific field, you can check an existing compute machine set from your cluster.

To list the compute machine sets in your cluster, run the following command:

$ oc get machinesets -n openshift-machine-api

The following is example output:

NAME                                DESIRED   CURRENT   READY   AVAILABLE   AGE
agl030519-vplxk-worker-us-east-1a   1         1         1       1           55m
agl030519-vplxk-worker-us-east-1b   1         1         1       1           55m
agl030519-vplxk-worker-us-east-1c   1         1         1       1           55m
agl030519-vplxk-worker-us-east-1d   0         0                             55m
agl030519-vplxk-worker-us-east-1e   0         0                             55m
agl030519-vplxk-worker-us-east-1f   0         0                             55m

To view values of a specific compute machine set custom resource (CR), run the following command:

$ oc get machineset <machineset_name> \
  -n openshift-machine-api -o yaml

The following is example output:

apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  labels:
    machine.openshift.io/cluster-api-cluster: <infrastructure_id>
  name: <infrastructure_id>-<role>
  namespace: openshift-machine-api
spec:
  replicas: 1
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-cluster: <infrastructure_id>
      machine.openshift.io/cluster-api-machineset: <infrastructure_id>-<role>
  template:
    metadata:
      labels:
        machine.openshift.io/cluster-api-cluster: <infrastructure_id>
        machine.openshift.io/cluster-api-machine-role: <role>
        machine.openshift.io/cluster-api-machine-type: <role>
        machine.openshift.io/cluster-api-machineset: <infrastructure_id>-<role>
    spec:
      providerSpec:
        ...

where:

metadata.labels.machine.openshift.io/cluster-api-cluster

Specifies the cluster infrastructure ID.

metadata.name

Specifies the machine set name, which includes <role>, a default node label.

For clusters that have user-provisioned infrastructure, a compute machine set can create only worker and infra type machines.

spec.template.spec.providerSpec

Specifies the values of the compute machine set CR. The values are platform-specific. For more information about <providerSpec> parameters in the CR, see the sample compute machine set CR configuration for your provider.

Create a MachineSet CR by running the following command:

$ oc create -f <file_name>.yaml

Verification

View the list of compute machine sets by running the following command:

$ oc get machineset -n openshift-machine-api

The following is example output:

NAME                                DESIRED   CURRENT   READY   AVAILABLE   AGE
agl030519-vplxk-infra-us-east-1a    1         1         1       1           11m
agl030519-vplxk-worker-us-east-1a   1         1         1       1           55m
agl030519-vplxk-worker-us-east-1b   1         1         1       1           55m
agl030519-vplxk-worker-us-east-1c   1         1         1       1           55m
agl030519-vplxk-worker-us-east-1d   0         0                             55m
agl030519-vplxk-worker-us-east-1e   0         0                             55m
agl030519-vplxk-worker-us-east-1f   0         0                             55m

When the new compute machine set is available, the DESIRED and CURRENT values match. If the compute machine set is not available, wait a few minutes and run the command again.
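
The comparison of DESIRED and CURRENT can also be done on saved command output. The following sketch assumes the default column layout shown in the example output above (name, then DESIRED, then CURRENT); the `unsettled_machine_sets` helper is hypothetical and only parses text, it does not contact the cluster.

```python
def unsettled_machine_sets(table: str) -> list:
    """Return the names of machine sets whose DESIRED and CURRENT values differ."""
    rows = table.strip().splitlines()[1:]  # skip the header row
    unsettled = []
    for row in rows:
        fields = row.split()
        if len(fields) < 3:
            continue  # not a data row in the expected layout
        name, desired, current = fields[0], fields[1], fields[2]
        if desired != current:
            unsettled.append(name)
    return unsettled


# Two rows taken from the example output, plus one invented row that is
# still scaling up (DESIRED 2, CURRENT 1).
sample = """\
NAME                                DESIRED   CURRENT   READY   AVAILABLE   AGE
agl030519-vplxk-infra-us-east-1a    1         1         1       1           11m
agl030519-vplxk-worker-us-east-1d   0         0                             55m
example-still-scaling               2         1         1       1           2m
"""

print(unsettled_machine_sets(sample))  # ['example-still-scaling']
```

An empty list means that every listed machine set reports matching DESIRED and CURRENT values.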

Labeling GPU machine sets for the cluster autoscaler

Label your machine sets to indicate which machines the cluster autoscaler can use for GPU-enabled nodes. Applying the accelerator label helps ensure that the autoscaler deploys the correct resources for your GPU workloads.

Prerequisites

Your cluster uses a cluster autoscaler.

Procedure

Add a cluster-api/accelerator label to the machine set whose machines the cluster autoscaler should use to deploy GPU-enabled nodes:

apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  name: machine-set-name
spec:
  template:
    spec:
      metadata:
        labels:
          cluster-api/accelerator: <accelerator_name>

where:

<accelerator_name>

Specifies a label of your choice that consists of alphanumeric characters, -, _, or . and starts and ends with an alphanumeric character. For example, you might use nvidia-t4 to represent Nvidia T4 GPUs, or nvidia-a10g for A10G GPUs.

You must specify the value of this label for the spec.resourceLimits.gpus.type parameter in your ClusterAutoscaler CR. For more information, see "Cluster autoscaler resource definition".
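
The naming rule for <accelerator_name> can be checked before you edit the machine set. The following is a minimal sketch with a hypothetical helper name. The character rules come from the description above; the 63-character limit is an added assumption taken from the general Kubernetes rules for label values and is not stated in this section.

```python
import re

# Alphanumeric at both ends; '-', '_' and '.' are allowed in between.
LABEL_VALUE = re.compile(r"[A-Za-z0-9]([A-Za-z0-9._-]*[A-Za-z0-9])?")

# Assumption: general Kubernetes limit for label values.
MAX_LABEL_VALUE_LENGTH = 63


def is_valid_accelerator_name(name: str) -> bool:
    """Return True if name satisfies the label rules described above."""
    if len(name) > MAX_LABEL_VALUE_LENGTH:
        return False
    return LABEL_VALUE.fullmatch(name) is not None


print(is_valid_accelerator_name("nvidia-t4"))   # True
print(is_valid_accelerator_name("-nvidia-t4"))  # False: starts with '-'
print(is_valid_accelerator_name("nvidia t4"))   # False: contains a space
```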

Additional resources

Cluster autoscaler resource definition