Working with nodes

Understanding how to update labels on nodes
Understanding how to mark nodes as unschedulable or schedulable
Handling errors in single-node OpenShift clusters when the node reboots without draining application pods
Deleting nodes from a cluster
Deleting nodes from a bare metal cluster

As an administrator, you can perform several tasks to make your clusters more efficient.

Understanding how to update labels on nodes

You can update any label on a node in order to adapt your cluster to evolving needs.

Node labels are not persisted after a node is deleted even if the node is backed up by a Machine.

Any change to a MachineSet object is not applied to existing machines owned by the compute machine set. For example, labels edited or added to an existing MachineSet object are not propagated to existing machines and nodes associated with the compute machine set.

The following command adds or updates labels on a node:

$ oc label node <node> <key_1>=<value_1> ... <key_n>=<value_n>

For example:

$ oc label nodes webconsole-7f7f6 unhealthy=true

Alternatively, you can apply the following YAML to add the label:

kind: Node
apiVersion: v1
metadata:
  name: webconsole-7f7f6
  labels:
    unhealthy: 'true'
#...
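
To confirm that a label was applied, or to remove it later, the following sketch wraps two `oc` invocations in small shell functions. The function names `show_node_labels` and `remove_node_label` are hypothetical and exist only for illustration; the sketch assumes that `oc` is logged in to the cluster.

```shell
# Hypothetical helpers; the function names are illustrative and not part of oc.

# Print a node together with all of its labels.
show_node_labels() {
  oc get node "$1" --show-labels
}

# Remove a label from a node. A trailing "-" after the key deletes the label.
remove_node_label() {
  oc label node "$1" "$2-"
}
```

For example, `show_node_labels webconsole-7f7f6` lists the labels on the node, and `remove_node_label webconsole-7f7f6 unhealthy` removes the `unhealthy` label again.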

The following command adds or updates a label on all pods in the current namespace:

$ oc label pods --all <key_1>=<value_1>

For example:

$ oc label pods --all status=unhealthy

In OKD 4.12 and later, newly installed clusters include both the node-role.kubernetes.io/control-plane and node-role.kubernetes.io/master labels on control plane nodes by default.

In OKD versions earlier than 4.12, the node-role.kubernetes.io/control-plane label is not added by default. Therefore, you must manually add the node-role.kubernetes.io/control-plane label to control plane nodes in clusters upgraded from earlier versions.
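
One possible way to add the missing label in an upgraded cluster is to select every node that already carries the `node-role.kubernetes.io/master` label and label it in a single command. The following is a minimal sketch under that assumption; the function name `label_control_plane_nodes` is hypothetical, and the empty label value follows the usual convention for node-role labels.

```shell
# Hypothetical helper: add the control-plane role label to every node that
# already has the master role label. The value after "=" is left empty,
# which is the usual convention for node-role labels.
label_control_plane_nodes() {
  oc label nodes \
    --selector=node-role.kubernetes.io/master \
    node-role.kubernetes.io/control-plane= \
    --overwrite
}
```

The `--overwrite` flag allows the command to succeed on nodes where the label is already set.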

Understanding how to mark nodes as unschedulable or schedulable

You can mark a node as unschedulable in order to block any new pods from being scheduled on the node.

When you mark a node as unschedulable, existing pods on the node are not affected.

By default, healthy nodes with a Ready status are marked as schedulable, which means that you can place new pods on the node.

The following command marks a node or nodes as unschedulable:

$ oc adm cordon <node>

For example:

$ oc adm cordon node1.example.com

Example output

node/node1.example.com cordoned

NAME                 LABELS                                        STATUS
node1.example.com    kubernetes.io/hostname=node1.example.com      Ready,SchedulingDisabled

The following command marks a currently unschedulable node or nodes as schedulable:
```
$ oc adm uncordon <node>
```
Instead of specifying specific node names (for example, <node>), you can use the --selector=<node_selector> option to mark selected nodes as schedulable or unschedulable.
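
As an illustration of the `--selector` option, the following sketch cordons or uncordons every node that matches a label selector. The function names are hypothetical, and the selector in the usage note reuses the `unhealthy=true` label from the earlier labeling example.

```shell
# Hypothetical helpers; "$1" is a label selector such as unhealthy=true.

# Mark every node that matches the selector as unschedulable.
cordon_by_selector() {
  oc adm cordon --selector="$1"
}

# Mark every node that matches the selector as schedulable again.
uncordon_by_selector() {
  oc adm uncordon --selector="$1"
}
```

For example, `cordon_by_selector unhealthy=true` blocks new pods from every node labeled `unhealthy=true`, and `uncordon_by_selector unhealthy=true` reverses the change.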

Handling errors in single-node OpenShift clusters when the node reboots without draining application pods

You can remove failed pods from a node by using the --field-selector status.phase=Failed flag with the oc delete pods command.

In single-node OpenShift clusters and in OKD clusters in general, a node can reboot without first being drained. When this happens, an application pod that requests devices can fail with the UnexpectedAdmissionError error. Deployment, ReplicaSet, or DaemonSet errors are reported because the application pods that require those devices start before the pod that serves those devices. You cannot control the order of pod restarts.

While this behavior is to be expected, it can cause a pod to remain on the cluster even though it has failed to deploy successfully. The pod continues to report UnexpectedAdmissionError. This issue is mitigated by the fact that application pods are typically included in a Deployment, ReplicaSet, or DaemonSet. If a pod is in this error state, it is of little concern because another instance should be running. Belonging to a Deployment, ReplicaSet, or DaemonSet guarantees the successful creation and execution of subsequent pods and ensures the successful deployment of the application.

There is ongoing work upstream to ensure that such pods are gracefully terminated. Until that work is complete, run the following command for a single-node OpenShift cluster to remove the failed pods:

$ oc delete pods --field-selector status.phase=Failed -n <POD_NAMESPACE>

The option to drain the node is unavailable for single-node OpenShift clusters.
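
Before deleting anything, it can help to see which pods the field selector matches. The following sketch lists the failed pods in a namespace and deletes them only if the listing succeeds. The function name `delete_failed_pods` is hypothetical, and the namespace is passed as the first argument.

```shell
# Hypothetical helper: "$1" is the namespace to clean up.
delete_failed_pods() {
  # Show the pods that match the selector so that they can be reviewed.
  oc get pods --field-selector status.phase=Failed -n "$1" || return 1
  # Delete the same set of pods.
  oc delete pods --field-selector status.phase=Failed -n "$1"
}
```

Both commands use the same `status.phase=Failed` selector, so the listing shows the pods that the delete step targets.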

Deleting nodes from a cluster

You can delete a node from an OKD cluster by scaling down the appropriate MachineSet object.

When a cluster is integrated with a cloud provider, you must delete the corresponding machine to delete a node. Do not try to use the oc delete node command for this task.

When you delete a node by using the CLI, the node object is deleted in Kubernetes, but the pods that exist on the node are not deleted. Any bare pods that are not backed by a replication controller become inaccessible to OKD. Pods backed by replication controllers are rescheduled to other available nodes. You must delete local manifest pods.

If you are running a cluster on bare metal, you cannot delete a node by editing MachineSet objects. Compute machine sets are only available when a cluster is integrated with a cloud provider. Instead, you must unschedule and drain the node before manually deleting it.

Procedure

View the compute machine sets that are in the cluster by running the following command:
```
$ oc get machinesets -n openshift-machine-api
```
The compute machine sets are listed in the form of <cluster-id>-worker-<aws-region-az>.

Scale down the compute machine set by using one of the following methods:

Specify the number of replicas to scale down to by running the following command:

$ oc scale --replicas=2 machineset <machine-set-name> -n openshift-machine-api

Edit the compute machine set custom resource by running the following command:

$ oc edit machineset <machine-set-name> -n openshift-machine-api

Example output

apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  # ...
  name: <machine-set-name>
  namespace: openshift-machine-api
  # ...
spec:
  replicas: 2
  # ...

where:

spec.replicas: Specifies the number of replicas to scale down to.
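
After scaling down, one way to check the result is to list the remaining machines and nodes. This is a minimal sketch, not part of the documented procedure; the function name `check_scale_down` is hypothetical.

```shell
# Hypothetical helper: show machines and nodes after a scale-down.
check_scale_down() {
  # Machines in the Machine API namespace; removed machines drop out of this list.
  oc get machines -n openshift-machine-api
  # Nodes that are currently registered with the cluster.
  oc get nodes
}
```

Machine deletion is not instantaneous, so the lists might need to be checked more than once before they reflect the new replica count.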

Deleting nodes from a bare metal cluster

You can delete a node from an OKD cluster that does not use machine sets by using the oc delete node command and decommissioning the node.

When you delete a node using the CLI, the node object is deleted in Kubernetes, but the pods that exist on the node are not deleted. Any bare pods not backed by a replication controller become inaccessible to OKD. Pods backed by replication controllers are rescheduled to other available nodes. You must delete local manifest pods.

The following procedure deletes a node from an OKD cluster running on bare metal.

Procedure

Mark the node as unschedulable:
```
$ oc adm cordon <node_name>
```
Drain all pods on the node:
```
$ oc adm drain <node_name> --force=true
```
This step might fail if the node is offline or unresponsive. Even if the node does not respond, the node might still be running a workload that writes to shared storage. To avoid data corruption, power down the physical hardware before you proceed.
Delete the node from the cluster:
```
$ oc delete node <node_name>
```
Although the node object is now deleted from the cluster, it can still rejoin the cluster after reboot or if the kubelet service is restarted. To permanently delete the node and all its data, you must decommission the node.
If you powered down the physical hardware, turn it back on so that the node can rejoin the cluster.
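
The cordon, drain, and delete steps above can be sketched as one shell function that stops at the first failure, so that a failed drain never leads to a delete. The function name `remove_bare_metal_node` is hypothetical. The `--ignore-daemonsets` and `--delete-emptydir-data` flags are assumptions that are not part of the procedure; they are commonly needed when a node runs daemon set pods or pods that use emptyDir volumes, and you can omit them if they do not apply.

```shell
# Hypothetical helper: "$1" is the name of the node to remove.
remove_bare_metal_node() {
  # Block new pods from being scheduled on the node.
  oc adm cordon "$1" || return 1
  # Evict the pods. The last two flags are assumptions, not part of the
  # documented procedure.
  oc adm drain "$1" --force=true --ignore-daemonsets --delete-emptydir-data || return 1
  # Delete the node object only after the drain succeeded.
  oc delete node "$1"
}
```

If the drain step fails because the node is offline, the function returns without deleting the node, which leaves room to power down the hardware before proceeding manually.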
