You can update, or upgrade, an OKD cluster within a minor version by using the OpenShift CLI (oc). You can also update a cluster between minor versions by following the same instructions.
Have access to the cluster as a user with admin privileges.
See Using RBAC to define and apply permissions.
Have a recent etcd backup in case your update fails and you must restore your cluster to a previous state.
Ensure all Operators previously installed through Operator Lifecycle Manager (OLM) are updated to their latest version in their latest channel. Updating the Operators ensures they have a valid update path when the default OperatorHub catalogs switch from the current minor version to the next during a cluster update. See Updating installed Operators for more information.
If your cluster uses manually maintained credentials, ensure that the Cloud Credential Operator (CCO) is in an upgradeable state. For more information, see Upgrading clusters with manually maintained credentials for AWS, Azure, or GCP.
If your cluster uses manually maintained credentials with the AWS Security Token Service (STS), obtain a copy of the ccoctl utility from the release image being updated to and use it to process any updated credentials. For more information, see Upgrading an OpenShift Container Platform cluster configured for manual mode with STS.
Ensure that you address all Upgradeable=False conditions so the cluster allows an update to the next minor version. Run the oc adm upgrade command to list all Upgradeable=False conditions and the reasoning for each condition, which helps you prepare for a minor version update.
If you run an Operator or have configured any application with a pod disruption budget, you might experience an interruption during the upgrade process. If minAvailable is set to 1 in PodDisruptionBudget, the nodes are drained to apply pending machine configs, which might block the eviction process. If several nodes are rebooted, all the pods might run on only one node, and the PodDisruptionBudget field can prevent the node drain.
During the upgrade process, nodes in the cluster might become temporarily unavailable. In the case of worker nodes, the machine health check might identify such nodes as unhealthy and reboot them. To avoid rebooting such nodes, pause all the MachineHealthCheck resources before updating the cluster.
Install the OpenShift CLI (oc).
To list all the available MachineHealthCheck resources that you want to pause, run the following command:
$ oc get machinehealthcheck -n openshift-machine-api
To pause the machine health checks, add the cluster.x-k8s.io/paused="" annotation to the MachineHealthCheck resource. Run the following command:
$ oc -n openshift-machine-api annotate mhc <mhc-name> cluster.x-k8s.io/paused=""
The annotated MachineHealthCheck resource resembles the following YAML file:
apiVersion: machine.openshift.io/v1beta1
kind: MachineHealthCheck
metadata:
  name: example
  namespace: openshift-machine-api
  annotations:
    cluster.x-k8s.io/paused: ""
spec:
  selector:
    matchLabels:
      role: worker
  unhealthyConditions:
  - type: "Ready"
    status: "Unknown"
    timeout: "300s"
  - type: "Ready"
    status: "False"
    timeout: "300s"
  maxUnhealthy: "40%"
status:
  currentHealthy: 5
  expectedMachines: 5
Resume the machine health checks after updating the cluster. To resume the check, remove the pause annotation from the MachineHealthCheck resource.
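The pause and resume steps above can be applied to every MachineHealthCheck resource at once. The following is a minimal sketch, not an official procedure: the function names list_mhc, pause_all_mhc, and resume_all_mhc are hypothetical helpers, and the sketch assumes that all health checks live in the openshift-machine-api namespace. It relies on the standard oc annotate behavior that a key followed by a trailing "-" removes that annotation.

```shell
# Sketch: pause or resume every MachineHealthCheck in one namespace.
# The function names below are hypothetical helpers, not part of oc.

MHC_NAMESPACE="openshift-machine-api"

# Print every MachineHealthCheck as TYPE/NAME, one per line.
list_mhc() {
    oc get machinehealthcheck -n "$MHC_NAMESPACE" -o name
}

# Add the pause annotation to each MachineHealthCheck.
# --overwrite keeps the call from failing if the annotation already exists.
pause_all_mhc() {
    list_mhc | while read -r mhc; do
        oc -n "$MHC_NAMESPACE" annotate "$mhc" cluster.x-k8s.io/paused="" --overwrite
    done
}

# Remove the pause annotation; the trailing "-" after the key deletes it.
resume_all_mhc() {
    list_mhc | while read -r mhc; do
        oc -n "$MHC_NAMESPACE" annotate "$mhc" cluster.x-k8s.io/paused-
    done
}
```

After sourcing the functions, run pause_all_mhc before the update and resume_all_mhc after the update completes.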
You can update, or upgrade, a single-node OKD cluster by using either the console or CLI.
However, note the following limitations:
The prerequisite to pause the MachineHealthCheck resources is not required because there is no other node to perform the health check.
Restoring a single-node OKD cluster using an etcd backup is not officially supported. However, it is good practice to perform the etcd backup in case your upgrade fails. If your control plane is healthy, you might be able to restore your cluster to a previous state by using the backup.
Updating a single-node OKD cluster requires downtime and can include an automatic reboot. The amount of downtime depends on the update payload, as described in the following scenarios:
If the update payload contains an operating system update, which requires a reboot, the downtime is significant and impacts cluster management and user workloads.
If the update contains machine configuration changes that do not require a reboot, the downtime is shorter, and the impact on cluster management and user workloads is reduced. In this case, the node draining step is skipped with single-node OKD because there is no other node in the cluster to reschedule the workloads to.
If the update payload does not contain an operating system update or machine configuration changes, a short API outage occurs and resolves quickly.
There are conditions, such as bugs in an updated package, that can cause the single node to not restart after a reboot. In this case, the update does not roll back automatically.
For information on which machine configuration changes require a reboot, see the note in Understanding the Machine Config Operator.
If updates are available, you can update your cluster by using the OpenShift CLI (oc).
You can find information about available OKD advisories and updates in the errata section of the Customer Portal.
Install the version of the OpenShift CLI (oc) that matches the version that you are updating to.
Log in to the cluster as a user with cluster-admin privileges.
Install the jq package.
Pause all MachineHealthCheck resources.
Ensure that your cluster is available:
$ oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.8.13 True False 158m Cluster version is 4.8.13
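The availability check above can be scripted by reading the AVAILABLE column of that output. This is a minimal sketch under the assumption that the output has the column order shown in the example (NAME, VERSION, AVAILABLE, ...); the function name cluster_is_available is hypothetical.

```shell
# Sketch: succeed only if the AVAILABLE column of `oc get clusterversion`
# output (read from stdin) is "True". Assumes the column order shown above.
cluster_is_available() {
    awk 'NR == 2 { ok = ($3 == "True") } END { exit (ok ? 0 : 1) }'
}
```

For example, `oc get clusterversion | cluster_is_available && echo "cluster is available"` prints the message only when the check passes.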
Review the current update channel information and confirm that your channel is set to stable-4.9:
$ oc get clusterversion -o json|jq ".items[0].spec"
{
  "channel": "stable-4.9",
  "clusterID": "990f7ab8-109b-4c95-8480-2bd1deec55ff"
}
For production clusters, you must subscribe to a
View the available updates and note the version number of the update that you want to apply:
$ oc adm upgrade
Cluster version is 4.8.13
Updates:
VERSION IMAGE
4.9.0 quay.io/openshift-release-dev/ocp-release@sha256:9c5f0df8b192a0d7b46cd5f6a4da2289c155fd5302dec7954f8f06c878160b8b
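When scripting this step, the version numbers can be pulled out of the command output. The following sketch assumes the output layout shown above, where a "VERSION IMAGE" header row is followed by one row per update; other releases of oc may format this output differently. The function name available_versions is hypothetical.

```shell
# Sketch: print the VERSION column of the rows that follow the
# "VERSION IMAGE" header in `oc adm upgrade` output read from stdin.
available_versions() {
    awk 'seen && NF >= 2 { print $1 }
         $1 == "VERSION" && $2 == "IMAGE" { seen = 1 }'
}
```

For example, `oc adm upgrade | available_versions` lists one candidate version per line, which you can then pass to the --to option.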
Apply an update:
To update to the latest version:
$ oc adm upgrade --to-latest=true
To update to a specific version:
$ oc adm upgrade --to=<version>
<version> is the update version that you obtained from the output of the previous command.
Review the status of the Cluster Version Operator:
$ oc get clusterversion -o json|jq ".items[0].spec"
{
  "channel": "stable-4.9",
  "clusterID": "990f7ab8-109b-4c95-8480-2bd1deec55ff",
  "desiredUpdate": {
    "force": false,
    "image": "quay.io/openshift-release-dev/ocp-release@sha256:9c5f0df8b192a0d7b46cd5f6a4da2289c155fd5302dec7954f8f06c878160b8b",
    "version": "4.9.0"
  }
}
If the version number in the desiredUpdate stanza matches the value that you specified, the update is in progress.
Review the cluster version status history to monitor the status of the update. It might take some time for all the objects to finish updating.
$ oc get clusterversion -o json|jq ".items[0].status.history"
[
  {
    "completionTime": null,
    "image": "quay.io/openshift-release-dev/ocp-release@sha256:b8fa13e09d869089fc5957c32b02b7d3792a0b6f36693432acc0409615ab23b7",
    "startedTime": "2021-01-28T20:30:50Z",
    "state": "Partial",
    "verified": true,
    "version": "4.9.0"
  },
  {
    "completionTime": "2021-01-28T20:30:50Z",
    "image": "quay.io/openshift-release-dev/ocp-release@sha256:b8fa13e09d869089fc5957c32b02b7d3792a0b6f36693432acc0409615ab23b7",
    "startedTime": "2021-01-28T17:38:10Z",
    "state": "Completed",
    "verified": false,
    "version": "4.8.13"
  }
]
The history contains a list of the most recent versions applied to the cluster. This value is updated when the CVO applies an update. The list is ordered by date, where the newest update is first in the list. Updates in the history have state Completed if the rollout completed and Partial if the update failed or did not complete.
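Because the newest entry is first in the history, its state can be polled until it reaches Completed. The following is a minimal sketch rather than a supported tool: the function names latest_update_state and wait_for_update, and the default of 120 checks at 30-second intervals, are assumptions made for illustration. It uses the JSONPath output of oc instead of jq, and it reads the ClusterVersion object named version directly.

```shell
# Sketch: poll the newest update-history entry until it is Completed.
# Function names and default timings are hypothetical.

# Print the state of the newest entry in the update history.
latest_update_state() {
    oc get clusterversion version -o jsonpath='{.status.history[0].state}'
}

# $1: maximum number of checks (default 120)
# $2: seconds to wait between checks (default 30)
# Returns 0 once the state is Completed, 1 if the checks run out.
wait_for_update() {
    attempts=${1:-120}
    interval=${2:-30}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        state=$(latest_update_state)
        echo "update state: ${state:-unknown}"
        if [ "$state" = "Completed" ]; then
            return 0
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    return 1
}
```

A Partial state is expected while the update is still running, so the loop treats it as "keep waiting" and only reports failure when the checks are exhausted.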
After the update completes, you can confirm that the cluster version has updated to the new version:
$ oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.9.0 True False 2m Cluster version is 4.9.0
If you are updating your cluster to the next minor version, such as from version 4.y to 4.(y+1), confirm that your nodes are updated before you deploy workloads that rely on a new feature:
$ oc get nodes
NAME STATUS ROLES AGE VERSION
ip-10-0-168-251.ec2.internal Ready master 82m v1.22.1
ip-10-0-170-223.ec2.internal Ready master 82m v1.22.1
ip-10-0-179-95.ec2.internal Ready worker 70m v1.22.1
ip-10-0-182-134.ec2.internal Ready worker 70m v1.22.1
ip-10-0-211-16.ec2.internal Ready master 82m v1.22.1
ip-10-0-250-100.ec2.internal Ready worker 69m v1.22.1
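The node check above can be summarized with a short script. This sketch assumes the five-column layout shown in the example output (NAME, STATUS, ROLES, AGE, VERSION); the function names node_version_summary and all_nodes_ready are hypothetical.

```shell
# Sketch: summarize `oc get nodes` output read from stdin.
# Assumes the column order NAME STATUS ROLES AGE VERSION.

# Print each distinct kubelet version and how many nodes report it.
node_version_summary() {
    awk 'NR > 1 { count[$5]++ } END { for (v in count) print v, count[v] }' | sort
}

# Succeed only if the STATUS column of every node is exactly "Ready".
all_nodes_ready() {
    awk 'NR > 1 && $2 != "Ready" { bad = 1 } END { exit (bad ? 1 : 0) }'
}
```

For example, `oc get nodes | node_version_summary` prints a single line when every node runs the same version. More than one line suggests that some nodes have not yet been updated.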
Changing the update server is optional. If you have an OpenShift Update Service (OSUS) installed and configured locally, you must set the URL for the server as the upstream to use the local server during updates. The default value for upstream is https://api.openshift.com/api/upgrades_info/v1/graph.
Change the upstream parameter value in the cluster version:
$ oc patch clusterversion/version --patch '{"spec":{"upstream":"<update-server-url>"}}' --type=merge
The <update-server-url> variable specifies the URL for the update server.
clusterversion.config.openshift.io/version patched
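To confirm that the patch took effect, the upstream value can be read back after it is set. The following sketch wraps both steps. The function name set_update_server is hypothetical, and the sketch assumes that the URL contains no characters that need JSON escaping.

```shell
# Sketch: set the upstream update server, then print the stored value.
# $1: URL of the update server (assumed to need no JSON escaping).
set_update_server() {
    url=$1
    patch=$(printf '{"spec":{"upstream":"%s"}}' "$url")
    oc patch clusterversion/version --patch "$patch" --type=merge &&
        oc get clusterversion version -o jsonpath='{.spec.upstream}'
}
```

Call it as `set_update_server <update-server-url>`. The last line of output should match the URL that you passed in.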