Creating a compute machine set on OpenStack - Managing compute machines with the Machine API | Machine management

Sample YAML for a compute machine set custom resource on OpenStack
Sample YAML for a compute machine set custom resource that uses SR-IOV on OpenStack
Sample YAML for SR-IOV deployments where port security is disabled
Creating a compute machine set
Labeling GPU machine sets for the cluster autoscaler

To automate the provisioning and scaling of node virtual machines (VMs) on OpenStack for compute workloads, create a MachineSet YAML file that defines details, for example image and network, that are specific to OpenStack.

You can create a different compute machine set to serve a specific purpose in your OKD cluster on OpenStack. For example, you might create infrastructure machine sets and related machines so that you can move supporting workloads to the new machines.

You can use the advanced machine management and scaling capabilities only in clusters where the Machine API is operational. Clusters with user-provisioned infrastructure require additional validation and configuration to use the Machine API.

Clusters with the infrastructure platform type none cannot use the Machine API. This limitation applies even if the compute machines that are attached to the cluster are installed on a platform that supports the feature. This parameter cannot be changed after installation.

To view the platform type for your cluster, run the following command:

$ oc get infrastructure cluster -o jsonpath='{.status.platform}'

Sample YAML for a compute machine set custom resource on OpenStack

To enable the Machine API to automate the scaling and management of compute nodes, define a MachineSet resource with OpenStack parameters, for example, image and network IDs.

The sample YAML defines a compute machine set that runs on OpenStack and creates nodes that are labeled with node-role.kubernetes.io/<role>: "".

In the sample, <infrastructure_id> is the infrastructure ID label that is based on the cluster ID that you set when you provisioned the cluster, and <role> is the node label to add.

apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  labels:
    machine.openshift.io/cluster-api-cluster: <infrastructure_id>
    machine.openshift.io/cluster-api-machine-role: <role>
    machine.openshift.io/cluster-api-machine-type: <role>
  name: <infrastructure_id>-<role>
  namespace: openshift-machine-api
spec:
  replicas: <number_of_replicas>
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-cluster: <infrastructure_id>
      machine.openshift.io/cluster-api-machineset: <infrastructure_id>-<role>
  template:
    metadata:
      labels:
        machine.openshift.io/cluster-api-cluster: <infrastructure_id>
        machine.openshift.io/cluster-api-machine-role: <role>
        machine.openshift.io/cluster-api-machine-type: <role>
        machine.openshift.io/cluster-api-machineset: <infrastructure_id>-<role>
    spec:
      providerSpec:
        value:
          apiVersion: machine.openshift.io/v1alpha1
          cloudName: openstack
          cloudsSecret:
            name: openstack-cloud-credentials
            namespace: openshift-machine-api
          flavor: <nova_flavor>
          image: <glance_image_name_or_location>
          serverGroupID: <optional_UUID_of_server_group>
          kind: OpenstackProviderSpec
          networks:
          - filter: {}
            subnets:
            - filter:
                name: <subnet_name>
                tags: openshiftClusterID=<infrastructure_id>
          primarySubnet: <rhosp_subnet_UUID>
          securityGroups:
          - filter: {}
            name: <infrastructure_id>-worker
          serverMetadata:
            Name: <infrastructure_id>-worker
            openshiftClusterID: <infrastructure_id>
          tags:
          - openshiftClusterID=<infrastructure_id>
          trunk: true
          userDataSecret:
            name: worker-user-data
          availabilityZone: <optional_openstack_availability_zone>

where:

<infrastructure_id>

Specifies the infrastructure ID that is based on the cluster ID that you set when you provisioned the cluster. If you have the OKD CLI installed, you can obtain the infrastructure ID by running the following command:

$ oc get -o jsonpath='{.status.infrastructureName}{"\n"}' infrastructure cluster

<role>

Specifies the node label to add.

<infrastructure_id>-<role>

Specifies the infrastructure ID and node label.

<optional_UUID_of_server_group>

Sets a server group policy for the MachineSet YAML by entering the value that is returned from creating a server group. For most deployments, anti-affinity or soft-anti-affinity policies are recommended.

<subnet_name>

Specifies a subnet to use.

The spec.template.spec.providerSpec.value.networks stanza is required for deployments to multiple networks. If deploying to multiple networks, this list must include the network that is used as the primarySubnet value.

<rhosp_subnet_UUID>

Specifies the OpenStack subnet that you want the endpoints of nodes to be published on. Usually, this is the same subnet that is used as the value of machinesSubnet in the install-config.yaml file.

Sample YAML for a compute machine set custom resource that uses SR-IOV on OpenStack

To provision compute virtual machines (VMs) with single root I/O virtualization (SR-IOV) for high-performance networking, define the SR-IOV ports directly in the providerSpec and set portSecurity to False.

If you configured your cluster for single-root SR-IOV, you can create compute machine sets that use that technology.

This sample YAML defines a compute machine set that uses SR-IOV networks. The nodes that it creates are labeled with node-role.openshift.io/<node_role>: ""

In this sample, infrastructure_id is the infrastructure ID label that is based on the cluster ID that you set when you provisioned the cluster, and node_role is the node label to add.

The sample assumes two SR-IOV networks that are named "radio" and "uplink". The networks are used in port definitions in the spec.template.spec.providerSpec.value.ports list.

Only parameters that are specific to SR-IOV deployments are described in this sample. To review a more general sample, see "Sample YAML for a compute machine set custom resource on OpenStack".

An example compute machine set that uses SR-IOV networks

apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  labels:
    machine.openshift.io/cluster-api-cluster: <infrastructure_id>
    machine.openshift.io/cluster-api-machine-role: <node_role>
    machine.openshift.io/cluster-api-machine-type: <node_role>
  name: <infrastructure_id>-<node_role>
  namespace: openshift-machine-api
spec:
  replicas: <number_of_replicas>
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-cluster: <infrastructure_id>
      machine.openshift.io/cluster-api-machineset: <infrastructure_id>-<node_role>
  template:
    metadata:
      labels:
        machine.openshift.io/cluster-api-cluster: <infrastructure_id>
        machine.openshift.io/cluster-api-machine-role: <node_role>
        machine.openshift.io/cluster-api-machine-type: <node_role>
        machine.openshift.io/cluster-api-machineset: <infrastructure_id>-<node_role>
    spec:
      metadata:
      providerSpec:
        value:
          apiVersion: machine.openshift.io/v1alpha1
          cloudName: openstack
          cloudsSecret:
            name: openstack-cloud-credentials
            namespace: openshift-machine-api
          flavor: <nova_flavor>
          image: <glance_image_name_or_location>
          serverGroupID: <optional_UUID_of_server_group>
          kind: OpenstackProviderSpec
          networks:
            - subnets:
              - UUID: <machines_subnet_UUID>
          ports:
            - networkID: <radio_network_UUID>
              nameSuffix: radio
              fixedIPs:
                - subnetID: <radio_subnet_UUID>
              tags:
                - sriov
                - radio
              vnicType: direct
              portSecurity: false
            - networkID: <uplink_network_UUID>
              nameSuffix: uplink
              fixedIPs:
                - subnetID: <uplink_subnet_UUID>
              tags:
                - sriov
                - uplink
              vnicType: direct
              portSecurity: false
          primarySubnet: <machines_subnet_UUID>
          securityGroups:
          - filter: {}
            name: <infrastructure_id>-<node_role>
          serverMetadata:
            Name: <infrastructure_id>-<node_role>
            openshiftClusterID: <infrastructure_id>
          tags:
          - openshiftClusterID=<infrastructure_id>
          trunk: true
          userDataSecret:
            name: <node_role>-user-data
          availabilityZone: <optional_openstack_availability_zone>
          configDrive: true

where:

<radio_network_UUID>: Specifies a network UUID for each port.
<radio_subnet_UUID>: Specifies a subnet UUID for each port.

The value of the spec.template.spec.providerSpec.value.ports.vnicType parameter must be direct for each port.

The value of the spec.template.spec.providerSpec.value.ports.portSecurity parameter must be false for each port. You cannot set security groups and allowed address pairs for ports when port security is disabled. Setting security groups on the instance applies the groups to all ports that are attached to it.

<uplink_network_UUID>: Specifies a network UUID for each port.
<uplink_subnet_UUID>: Specifies a subnet UUID for each port.

The value of the spec.template.spec.providerSpec.value.configDrive parameter must be true.

After you deploy compute machines that are SR-IOV-capable, you must label them as such. For example, from a command line, enter:

$ oc label node <NODE_NAME> feature.node.kubernetes.io/network-sriov.capable="true"

Trunking is enabled for ports that are created by entries in the networks and subnets lists. The names of ports that are created from these lists follow the pattern <machine_name>-<nameSuffix>. The nameSuffix field is required in port definitions.

You can enable trunking for each port.

Optionally, you can add tags to ports as part of their tags lists.

Additional resources

Preparing to install a cluster that uses SR-IOV or OVS-DPDK on OpenStack

Sample YAML for SR-IOV deployments where port security is disabled

To create single-root I/O virtualization (SR-IOV) ports on a network that has port security disabled, define a compute machine set that includes the ports as items in the spec.template.spec.providerSpec.value.ports list.

This difference from the standard SR-IOV compute machine set is due to the automatic security group and allowed address pair configuration that occurs for ports that are created by using the network and subnet interfaces.

Ports that you define for machines subnets require:

Allowed address pairs for the API and ingress virtual IP ports
The compute security group
Attachment to the machines network and subnet

Only parameters that are specific to SR-IOV deployments where port security is disabled are described in this sample. To review a more general sample, see "Sample YAML for a compute machine set custom resource that uses SR-IOV on OpenStack".

An example compute machine set that uses SR-IOV networks and has port security disabled

apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  labels:
    machine.openshift.io/cluster-api-cluster: <infrastructure_id>
    machine.openshift.io/cluster-api-machine-role: <node_role>
    machine.openshift.io/cluster-api-machine-type: <node_role>
  name: <infrastructure_id>-<node_role>
  namespace: openshift-machine-api
spec:
  replicas: <number_of_replicas>
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-cluster: <infrastructure_id>
      machine.openshift.io/cluster-api-machineset: <infrastructure_id>-<node_role>
  template:
    metadata:
      labels:
        machine.openshift.io/cluster-api-cluster: <infrastructure_id>
        machine.openshift.io/cluster-api-machine-role: <node_role>
        machine.openshift.io/cluster-api-machine-type: <node_role>
        machine.openshift.io/cluster-api-machineset: <infrastructure_id>-<node_role>
    spec:
      metadata: {}
      providerSpec:
        value:
          apiVersion: machine.openshift.io/v1alpha1
          cloudName: openstack
          cloudsSecret:
            name: openstack-cloud-credentials
            namespace: openshift-machine-api
          flavor: <nova_flavor>
          image: <glance_image_name_or_location>
          kind: OpenstackProviderSpec
          ports:
            - allowedAddressPairs:
              - ipAddress: <API_VIP_port_IP>
              - ipAddress: <ingress_VIP_port_IP>
              fixedIPs:
                - subnetID: <machines_subnet_UUID>
              nameSuffix: nodes
              networkID: <machines_network_UUID>
              securityGroups:
                  - <compute_security_group_UUID>
            - networkID: <SRIOV_network_UUID>
              nameSuffix: sriov
              fixedIPs:
                - subnetID: <SRIOV_subnet_UUID>
              tags:
                - sriov
              vnicType: direct
              portSecurity: False
          primarySubnet: <machines_subnet_UUID>
          serverMetadata:
            Name: <infrastructure_ID>-<node_role>
            openshiftClusterID: <infrastructure_id>
          tags:
          - openshiftClusterID=<infrastructure_id>
          trunk: false
          userDataSecret:
            name: worker-user-data
          configDrive: true

where:

<API_VIP_port_IP>: Specifies the allowed address for the API port. This value is paired with the allowed address for the ingress port.
<ingress_VIP_port_IP>: Specifies the allowed address for the ingress port. This value is paired with the allowed address for the API port.
<machines_subnet_UUID>: Specifies the machines subnet.
<machines_network_UUID>: Specifies the machines network.
<compute_security_group_UUID>: Specifies the compute machines security group.

The value of the spec.template.spec.providerSpec.value.configDrive parameter must be true.

You can enable trunking for each port.

Optionally, you can add tags to ports as part of their tags lists.

Creating a compute machine set

In addition to the compute machine sets created by the installation program, you can create your own compute machine sets to dynamically manage the machine compute resources for specific workloads of your choice. Use the OKD CLI to automate node provisioning.

Prerequisites

Deploy an OKD cluster.
Install the OpenShift CLI (oc).
Log in to oc as a user with cluster-admin permission.

Procedure

Create a new YAML file that contains the compute machine set custom resource (CR) sample and is named <file_name>.yaml.

Ensure that you set the <clusterID> and <role> parameter values.

Optional: If you are not sure which value to set for a specific field, you can check an existing compute machine set from your cluster.

To list the compute machine sets in your cluster, run the following command:

$ oc get machinesets -n openshift-machine-api

Example output

NAME                                DESIRED   CURRENT   READY   AVAILABLE   AGE
agl030519-vplxk-worker-us-east-1a   1         1         1       1           55m
agl030519-vplxk-worker-us-east-1b   1         1         1       1           55m
agl030519-vplxk-worker-us-east-1c   1         1         1       1           55m
agl030519-vplxk-worker-us-east-1d   0         0                             55m
agl030519-vplxk-worker-us-east-1e   0         0                             55m
agl030519-vplxk-worker-us-east-1f   0         0                             55m

To view values of a specific compute machine set custom resource (CR), run the following command:

$ oc get machineset <machineset_name> \
  -n openshift-machine-api -o yaml

Example output

apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  labels:
    machine.openshift.io/cluster-api-cluster: <infrastructure_id>
  name: <infrastructure_id>-<role>
  namespace: openshift-machine-api
spec:
  replicas: 1
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-cluster: <infrastructure_id>
      machine.openshift.io/cluster-api-machineset: <infrastructure_id>-<role>
  template:
    metadata:
      labels:
        machine.openshift.io/cluster-api-cluster: <infrastructure_id>
        machine.openshift.io/cluster-api-machine-role: <role>
        machine.openshift.io/cluster-api-machine-type: <role>
        machine.openshift.io/cluster-api-machineset: <infrastructure_id>-<role>
    spec:
      providerSpec:
        ...

where:

metadata.labels.machine.openshift.io/cluster-api-cluster

Specifies the cluster infrastructure ID.

metadata.labels.name

Specifies a default node label.

For clusters that have user-provisioned infrastructure, a compute machine set can only create worker and infra type machines.

spec.template.metadata.spec.providerSpec

Specifies the values of the compute machine set CR. The values are platform-specific. For more information about <providerSpec> parameters in the CR, see the sample compute machine set CR configuration for your provider.

Create a MachineSet CR by running the following command:
```
$ oc create -f <file_name>.yaml
```

Verification

View the list of compute machine sets by running the following command:

$ oc get machineset -n openshift-machine-api

Example output

NAME                                DESIRED   CURRENT   READY   AVAILABLE   AGE
agl030519-vplxk-infra-us-east-1a    1         1         1       1           11m
agl030519-vplxk-worker-us-east-1a   1         1         1       1           55m
agl030519-vplxk-worker-us-east-1b   1         1         1       1           55m
agl030519-vplxk-worker-us-east-1c   1         1         1       1           55m
agl030519-vplxk-worker-us-east-1d   0         0                             55m
agl030519-vplxk-worker-us-east-1e   0         0                             55m
agl030519-vplxk-worker-us-east-1f   0         0                             55m

When the new compute machine set is available, the DESIRED and CURRENT values match. If the compute machine set is not available, wait a few minutes and run the command again.

Labeling GPU machine sets for the cluster autoscaler

Label your machine sets to indicate which machines the cluster autoscaler can use for GPU-enabled nodes. Applying the accelerator label helps ensure that the autoscaler deploys the correct resources for your GPU workloads.

Prerequisites

Your cluster uses a cluster autoscaler.

Procedure

On the machine set that you want to create machines for the cluster autoscaler to use to deploy GPU-enabled nodes, add a cluster-api/accelerator label:
apiVersion: machine.openshift.io/v1beta1 kind: MachineSet metadata: name: machine-set-name spec: template: spec: metadata: labels: cluster-api/accelerator: <accelerator_name>
where:

<accelerator_name>

Specifies a label of your choice that consists of alphanumeric characters, -, _, or . and starts and ends with an alphanumeric character. For example, you might use nvidia-t4 to represent Nvidia T4 GPUs, or nvidia-a10g for A10G GPUs.

You must specify the value of this label for the spec.resourceLimits.gpus.type parameter in your ClusterAutoscaler CR. For more information, see "Cluster autoscaler resource definition".

Additional resources

Cluster autoscaler resource definition