Creating a compute machine set on Nutanix

You can create a different compute machine set to serve a specific purpose in your OKD cluster on Nutanix. For example, you might create infrastructure machine sets and related machines so that you can move supporting workloads to the new machines, which helps ensure efficient resource allocation.

You can use the advanced machine management and scaling capabilities only in clusters where the Machine API is operational. Clusters with user-provisioned infrastructure require additional validation and configuration to use the Machine API.

Clusters with the infrastructure platform type none cannot use the Machine API. This limitation applies even if the compute machines that are attached to the cluster are installed on a platform that supports the feature. This parameter cannot be changed after installation.

To view the platform type for your cluster, run the following command:

$ oc get infrastructure cluster -o jsonpath='{.status.platform}'
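
As a minimal sketch, you can wrap the result of the preceding command in a small shell check. The function name machine_api_supported is hypothetical, and the comparison assumes that the command prints None for the platform type none. Passing the check only rules out the none platform type; it does not confirm that the Machine API is operational.

```shell
# Hypothetical helper: fails when the platform type is "None".
# An empty value is treated as unknown and also fails.
# Assumes the argument is the output of the command shown above.
machine_api_supported() {
  case "$1" in
    None|"") return 1 ;;
    *) return 0 ;;
  esac
}

# Usage against a live cluster (requires oc and a logged-in session):
#   platform="$(oc get infrastructure cluster -o jsonpath='{.status.platform}')"
#   machine_api_supported "$platform" || echo "Machine API is not available"
```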

Sample YAML for a compute machine set custom resource on Nutanix

You can use a YAML file to automate node provisioning and ensure workloads are scheduled correctly based on role and infrastructure requirements.

The sample YAML shows how to define a Nutanix compute MachineSet for your cluster. It explains how to configure roles, labels, sizing, networking, and boot settings so new nodes are created consistently.

The sample YAML defines a Nutanix compute machine set that creates nodes that are labeled with node-role.kubernetes.io/<role>: "".

In the sample, <infrastructure_id> is the infrastructure ID label that is based on the cluster ID that you set when you provisioned the cluster, and <role> is the node label to add.

Values obtained by using the OpenShift CLI

In the following example, you can obtain some of the values for your cluster by using the OpenShift CLI (oc).

Infrastructure ID

The <infrastructure_id> string is the infrastructure ID that is based on the cluster ID that you set when you provisioned the cluster. If you have the OpenShift CLI installed, you can obtain the infrastructure ID by running the following command:

$ oc get -o jsonpath='{.status.infrastructureName}{"\n"}' infrastructure cluster

apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  labels:
    machine.openshift.io/cluster-api-cluster: <infrastructure_id>
    machine.openshift.io/cluster-api-machine-role: <role>
    machine.openshift.io/cluster-api-machine-type: <role>
  name: <infrastructure_id>-<role>-<zone>
  namespace: openshift-machine-api
  annotations:
    machine.openshift.io/memoryMb: "16384"
    machine.openshift.io/vCPU: "4"
spec:
  replicas: 3
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-cluster: <infrastructure_id>
      machine.openshift.io/cluster-api-machineset: <infrastructure_id>-<role>-<zone>
  template:
    metadata:
      labels:
        machine.openshift.io/cluster-api-cluster: <infrastructure_id>
        machine.openshift.io/cluster-api-machine-role: <role>
        machine.openshift.io/cluster-api-machine-type: <role>
        machine.openshift.io/cluster-api-machineset: <infrastructure_id>-<role>-<zone>
    spec:
      metadata:
        labels:
          node-role.kubernetes.io/<role>: ""
      providerSpec:
        value:
          apiVersion: machine.openshift.io/v1
          bootType: ""
          categories:
          - key: <category_name>
            value: <category_value>
          cluster:
            type: uuid
            uuid: <cluster_uuid>
          credentialsSecret:
            name: nutanix-credentials
          image:
            name: <infrastructure_id>-rhcos
            type: name
          kind: NutanixMachineProviderConfig
          memorySize: 16Gi
          project:
            type: name
            name: <project_name>
          subnets:
          - type: uuid
            uuid: <subnet_uuid>
          systemDiskSize: 120Gi
          userDataSecret:
            name: <user_data_secret>
          vcpuSockets: 4
          vcpusPerSocket: 1

where:

<infrastructure_id>

Specifies the infrastructure ID that is based on the cluster ID that you set when you provisioned the cluster.

<role>

Specifies the node label to add.

<infrastructure_id>-<role>-<zone>

Specifies the infrastructure ID, node label, and zone.

annotations

Specifies annotations for the cluster autoscaler.

bootType

Specifies the boot type that the compute machines use. For more information about boot types, see Understanding UEFI, Secure Boot, and TPM in the Virtualized Environment. Valid values are Legacy, SecureBoot, or UEFI. The default is Legacy.

You must use the Legacy boot type in OKD 4.16.

categories

Specifies one or more Nutanix Prism categories to apply to compute machines. This stanza requires key and value parameters for a category key-value pair that exists in Prism Central. For more information about categories, see Category management.

cluster

Specifies a Nutanix Prism Element cluster configuration. In this example, the cluster type is uuid, so there is a uuid stanza.

<infrastructure_id>-rhcos

Specifies the image to use. Use an image from an existing default compute machine set for the cluster.

16Gi

Specifies the amount of memory for each compute machine in Gi.

project

Specifies the Nutanix project that you use for your cluster. In this example, the project type is name, so there is a name stanza.

subnets

Specifies one or more UUIDs for Prism Element subnet objects. The CIDR IP address prefix for one of the specified subnets must contain the virtual IP addresses that the OKD cluster uses. A maximum of 32 subnets for each Prism Element failure domain in the cluster is supported. All subnet UUID values must be unique.

120Gi

Specifies the size of the system disk in Gi.

<user_data_secret>

Specifies the name of the secret in the user data YAML file that is in the openshift-machine-api namespace. Use the value that the installation program populates in the default compute machine set.

4

Specifies the number of vCPU sockets.

1

Specifies the number of vCPUs per socket.
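
The subnet limits described for the subnets parameter can be checked before you create the machine set. The following is a minimal sketch, not part of the product: the function name check_subnet_uuids is hypothetical, and it assumes that all of the UUIDs you pass belong to a single Prism Element failure domain. It does not verify that the subnets exist or that they contain the cluster virtual IP addresses.

```shell
# Hypothetical helper: checks a list of subnet UUIDs against the documented
# limits (at least 1 entry, at most 32 entries, no duplicates).
# The comparison is case-sensitive.
check_subnet_uuids() {
  if [ "$#" -lt 1 ] || [ "$#" -gt 32 ]; then
    echo "expected between 1 and 32 subnet UUIDs, got $#" >&2
    return 1
  fi
  # Any line printed by `uniq -d` is a UUID that appears more than once.
  dupes="$(printf '%s\n' "$@" | sort | uniq -d)"
  if [ -n "$dupes" ]; then
    echo "duplicate subnet UUIDs: $dupes" >&2
    return 1
  fi
  return 0
}

# Usage:
#   check_subnet_uuids <subnet_uuid_1> <subnet_uuid_2>
```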

Creating a compute machine set

In addition to the compute machine sets created by the installation program, you can create your own to dynamically manage the machine compute resources for specific workloads of your choice.

Prerequisites

Deploy an OKD cluster.
Install the OpenShift CLI (oc).
Log in to oc as a user with cluster-admin permission.

Procedure

Create a new YAML file that contains the compute machine set custom resource (CR) sample and is named <file_name>.yaml.

Ensure that you set the <infrastructure_id> and <role> parameter values.

Optional: If you are not sure which value to set for a specific field, you can check an existing compute machine set from your cluster.

To list the compute machine sets in your cluster, run the following command:

$ oc get machinesets -n openshift-machine-api

Example output

NAME                                DESIRED   CURRENT   READY   AVAILABLE   AGE
agl030519-vplxk-worker-us-east-1a   1         1         1       1           55m
agl030519-vplxk-worker-us-east-1b   1         1         1       1           55m
agl030519-vplxk-worker-us-east-1c   1         1         1       1           55m
agl030519-vplxk-worker-us-east-1d   0         0                             55m
agl030519-vplxk-worker-us-east-1e   0         0                             55m
agl030519-vplxk-worker-us-east-1f   0         0                             55m

To view values of a specific compute machine set custom resource (CR), run the following command:

$ oc get machineset <machineset_name> \
  -n openshift-machine-api -o yaml

Example output

apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  labels:
    machine.openshift.io/cluster-api-cluster: <infrastructure_id> (1)
  name: <infrastructure_id>-<role> (2)
  namespace: openshift-machine-api
spec:
  replicas: 1
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-cluster: <infrastructure_id>
      machine.openshift.io/cluster-api-machineset: <infrastructure_id>-<role>
  template:
    metadata:
      labels:
        machine.openshift.io/cluster-api-cluster: <infrastructure_id>
        machine.openshift.io/cluster-api-machine-role: <role>
        machine.openshift.io/cluster-api-machine-type: <role>
        machine.openshift.io/cluster-api-machineset: <infrastructure_id>-<role>
    spec:
      providerSpec: (3)
        ...

1 The cluster infrastructure ID.

2 A default node label.

For clusters that have user-provisioned infrastructure, a compute machine set can only create worker and infra type machines.

3 The values in the providerSpec section of the compute machine set CR are platform-specific. For more information about providerSpec parameters in the CR, see the sample compute machine set CR configuration for your provider.

Create a MachineSet CR by running the following command:
```
$ oc create -f <file_name>.yaml
```
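
If you keep the sample CR as a template, the placeholder substitution in the preceding procedure can be scripted. The following is a minimal sketch under stated assumptions: the function name render_machineset and the file names template.yaml and machineset.yaml are hypothetical, and the substituted values must not contain the characters |, &, or \ because they are passed to sed unescaped.

```shell
# Hypothetical helper: reads a machine set template on stdin and replaces the
# <infrastructure_id>, <role>, and <zone> placeholders with the given values.
# Other placeholders, such as <cluster_uuid>, are left untouched.
render_machineset() {
  infra_id="$1"
  role="$2"
  zone="$3"
  sed -e "s|<infrastructure_id>|${infra_id}|g" \
      -e "s|<role>|${role}|g" \
      -e "s|<zone>|${zone}|g"
}

# Usage against a live cluster (requires oc and a logged-in session):
#   infra_id="$(oc get -o jsonpath='{.status.infrastructureName}' infrastructure cluster)"
#   render_machineset "$infra_id" infra zone1 < template.yaml > machineset.yaml
#   oc create -f machineset.yaml
```

You must still set the remaining provider-specific values, such as the cluster and subnet UUIDs, before you create the CR.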

Verification

View the list of compute machine sets by running the following command:

$ oc get machineset -n openshift-machine-api

Example output

NAME                                DESIRED   CURRENT   READY   AVAILABLE   AGE
agl030519-vplxk-infra-us-east-1a    1         1         1       1           11m
agl030519-vplxk-worker-us-east-1a   1         1         1       1           55m
agl030519-vplxk-worker-us-east-1b   1         1         1       1           55m
agl030519-vplxk-worker-us-east-1c   1         1         1       1           55m
agl030519-vplxk-worker-us-east-1d   0         0                             55m
agl030519-vplxk-worker-us-east-1e   0         0                             55m
agl030519-vplxk-worker-us-east-1f   0         0                             55m

When the new compute machine set is available, the DESIRED and CURRENT values match. If the compute machine set is not available, wait a few minutes and run the command again.
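
The comparison of the DESIRED and CURRENT columns can also be scripted. The following is a minimal sketch: the function name machineset_counts_match is hypothetical, and it assumes the default table output of the preceding command, with the machine set name in the first column and the DESIRED and CURRENT values in the second and third columns.

```shell
# Hypothetical helper: reads `oc get machineset` table output on stdin and
# succeeds only if the named machine set is listed and its DESIRED and
# CURRENT values match.
machineset_counts_match() {
  awk -v name="$1" '
    $1 == name { found = 1; ok = ($2 == $3) }
    END { if (found && ok) exit 0; exit 1 }
  '
}

# Usage against a live cluster (requires oc and a logged-in session):
#   oc get machineset -n openshift-machine-api | machineset_counts_match <machineset_name>
```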

Labeling GPU machine sets for the cluster autoscaler

You can use a machine set label to indicate which machines the cluster autoscaler can use to deploy GPU-enabled nodes.

Prerequisites

Your cluster uses a cluster autoscaler.

Procedure

On the machine set that creates the machines that you want the cluster autoscaler to use for deploying GPU-enabled nodes, add a cluster-api/accelerator label:

apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  name: machine-set-name
spec:
  template:
    spec:
      metadata:
        labels:
          cluster-api/accelerator: <accelerator_name>

where:

<accelerator_name>

Specifies a label of your choice that consists of alphanumeric characters, -, _, or . and starts and ends with an alphanumeric character. For example, you might use nvidia-t4 to represent Nvidia T4 GPUs, or nvidia-a10g for A10G GPUs.

You must specify the value of this label for the spec.resourceLimits.gpus.type parameter in your ClusterAutoscaler CR. For more information, see "Cluster autoscaler resource definition".
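
You can check a candidate <accelerator_name> value against the character rules described above before you apply the label. The following is a minimal sketch: the function name valid_accelerator_label is hypothetical, and it checks only the rules stated in this section, not any additional limits that Kubernetes might impose on label values.

```shell
# Hypothetical helper: succeeds if the value uses only alphanumeric characters,
# "-", "_", or ".", and starts and ends with an alphanumeric character.
valid_accelerator_label() {
  printf '%s\n' "$1" | grep -Eq '^[A-Za-z0-9]([A-Za-z0-9._-]*[A-Za-z0-9])?$'
}

# Usage:
#   valid_accelerator_label nvidia-t4 && echo "label value is acceptable"
```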

Failure domains for Nutanix clusters

Update failure domain configurations on a Nutanix cluster by coordinating changes to specific resources. You must modify the cluster infrastructure, control plane machine set, and compute machine set custom resources (CRs) to apply the new configuration.

To add or update the failure domain configuration on a Nutanix cluster, you must make coordinated changes to several resources. The following actions are required:

Modify the cluster infrastructure custom resource (CR).
Modify the cluster control plane machine set CR.
Modify or replace the compute machine set CRs.

For more information, see "Adding failure domains to an existing Nutanix cluster" in the Post-installation configuration content.
