The ImageBasedGroupUpgrade CR combines the ImageBasedUpgrade and ClusterGroupUpgrade APIs.
For example, you can define the cluster selection and rollout strategy with the ImageBasedGroupUpgrade API in the same way as the ClusterGroupUpgrade API.
The stage transitions are different from the ImageBasedUpgrade API.
You can use the ImageBasedGroupUpgrade API to combine several stage transitions, also called actions, into one step that shares one rollout strategy.
Example 1. Example ImageBasedGroupUpgrade.yaml
apiVersion: lcm.openshift.io/v1alpha1
kind: ImageBasedGroupUpgrade
metadata:
  name: <filename>
  namespace: default
spec:
  clusterLabelSelectors:
    - matchExpressions:
        - key: name
          operator: In
          values:
            - spoke1
            - spoke4
            - spoke6
  ibuSpec:
    seedImageRef:
      image: quay.io/seed/image:4.21.0
      version: 4.21.0
      pullSecretRef:
        name: "<seed_pull_secret>"
    extraManifests:
      - name: example-extra-manifests
        namespace: openshift-lifecycle-agent
    oadpContent:
      - name: oadp-cm
        namespace: openshift-adp
  plan:
    - actions: ["Prep", "Upgrade", "FinalizeUpgrade"]
      rolloutStrategy:
        maxConcurrency: 200
        timeout: 2400
- clusterLabelSelectors: Clusters to upgrade.
- seedImageRef: Target platform version, the seed image, and the secret required to access the image.
  If you add the seed image pull secret in the hub cluster, in the same namespace as the ImageBasedGroupUpgrade resource, the secret is added to the manifest list for the Prep stage. The secret is recreated in each spoke cluster in the openshift-lifecycle-agent namespace.
- extraManifests: Optional. Applies additional manifests, which are not in the seed image, to the target cluster. Also applies ConfigMap objects for custom catalog sources.
- oadpContent: ConfigMap resources that contain the OADP Backup and Restore CRs.
- plan: Upgrade plan details.
- maxConcurrency: Number of clusters to update in a batch.
- timeout: Timeout limit, in minutes, to complete the action.
Supported action combinations
Actions are the list of stage transitions that TALM completes in the steps of an upgrade plan for the selected group of clusters.
Each action entry in the ImageBasedGroupUpgrade CR is a separate step, and each step has one or more actions that share the same rollout strategy.
You can achieve more control over the rollout strategy for each action by separating actions into steps.
You can combine these actions differently in your upgrade plan, and you can add further steps later.
Wait until the earlier steps either complete or fail before adding a step to your plan.
For clusters that failed an earlier step, the first action of an added step must be either Abort or Rollback.
You cannot remove actions or steps from an ongoing plan.
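As a minimal sketch of adding a step later, the following plan assumes that an initial Prep step has already finished and that some clusters failed it. The second step is appended afterward and starts with Abort, as required for clusters that failed an earlier step. The rolloutStrategy values in the added step are illustrative assumptions, not recommended settings.

```yaml
plan:
  # Original step. It stays in place because steps cannot be removed from an ongoing plan.
  - actions: ["Prep"]
    rolloutStrategy:
      maxConcurrency: 200
      timeout: 60
  # Step appended only after the Prep step completed or failed.
  - actions: ["Abort"]
    rolloutStrategy:
      maxConcurrency: 200   # illustrative value
      timeout: 10           # illustrative value, in minutes
```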
The following table shows example plans for different levels of control over the rollout strategy:
Table 1. Example upgrade plans
All actions share the same strategy:

plan:
  - actions: ["Prep", "Upgrade", "FinalizeUpgrade"]
    rolloutStrategy:
      maxConcurrency: 200
      timeout: 60

Some actions share the same strategy:

plan:
  - actions: ["Prep", "Upgrade"]
    rolloutStrategy:
      maxConcurrency: 200
      timeout: 60
  - actions: ["FinalizeUpgrade"]
    rolloutStrategy:
      maxConcurrency: 500
      timeout: 10

All actions have different strategies:

plan:
  - actions: ["Prep"]
    rolloutStrategy:
      maxConcurrency: 200
      timeout: 60
  - actions: ["Upgrade"]
    rolloutStrategy:
      maxConcurrency: 200
      timeout: 20
  - actions: ["FinalizeUpgrade"]
    rolloutStrategy:
      maxConcurrency: 500
      timeout: 10
Clusters that fail one of the actions skip the remaining actions in the same step.
The ImageBasedGroupUpgrade API accepts the following actions:
- Prep: Start preparing the upgrade resources by moving to the Prep stage.
- Upgrade: Start the upgrade by moving to the Upgrade stage.
- FinalizeUpgrade: Complete the upgrade on selected clusters that completed the Upgrade action by moving to the Idle stage.
- Rollback: Start a rollback only on successfully upgraded clusters by moving to the Rollback stage.
- FinalizeRollback: Complete the rollback by moving to the Idle stage.
- AbortOnFailure: Cancel the upgrade on selected clusters that failed the Prep or Upgrade actions by moving to the Idle stage.
- Abort: Cancel an ongoing upgrade only on clusters that are not yet upgraded by moving to the Idle stage.
The following action combinations are supported. A pair of brackets signifies one step in the plan section:
- ["Prep"], ["Abort"]
- ["Prep", "Upgrade", "FinalizeUpgrade"]
- ["Prep"], ["AbortOnFailure"], ["Upgrade"], ["AbortOnFailure"], ["FinalizeUpgrade"]
- ["Rollback", "FinalizeRollback"]
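For illustration, the third combination in the list above can be written out as a plan with five steps, one per pair of brackets. This is only a sketch: every rolloutStrategy value is an assumption chosen for the example.

```yaml
plan:
  - actions: ["Prep"]
    rolloutStrategy:
      maxConcurrency: 200   # illustrative value
      timeout: 60           # illustrative value, in minutes
  # Cancels the upgrade on clusters that failed the Prep action.
  - actions: ["AbortOnFailure"]
    rolloutStrategy:
      maxConcurrency: 200
      timeout: 10
  - actions: ["Upgrade"]
    rolloutStrategy:
      maxConcurrency: 200
      timeout: 20
  # Cancels the upgrade on clusters that failed the Upgrade action.
  - actions: ["AbortOnFailure"]
    rolloutStrategy:
      maxConcurrency: 200
      timeout: 10
  - actions: ["FinalizeUpgrade"]
    rolloutStrategy:
      maxConcurrency: 500
      timeout: 10
```

Keeping each action in its own step means that a cluster that fails one action is still picked up by the AbortOnFailure step that follows, instead of skipping the rest of a combined step.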
Use one of the following combinations when you need to resume or cancel an ongoing upgrade from a completely new ImageBasedGroupUpgrade CR:
Labeling for cluster selection
Use the spec.clusterLabelSelectors field for initial cluster selection.
In addition, TALM labels the managed clusters according to the results of their last stage transition.
When a stage completes or fails, TALM marks the relevant clusters with the following labels:
- lcm.openshift.io/ibgu-<stage>-completed
- lcm.openshift.io/ibgu-<stage>-failed
Use these cluster labels to cancel or roll back an upgrade on a group of clusters after troubleshooting the issues.
If you are using the ImageBasedGroupUpgrade CR to upgrade your clusters, ensure that you update the lcm.openshift.io/ibgu-<stage>-completed or lcm.openshift.io/ibgu-<stage>-failed cluster labels properly after performing troubleshooting or recovery steps on the managed clusters.
This ensures that TALM continues to manage the image-based upgrade for the cluster.
For example, if you want to cancel the upgrade for all managed clusters except for clusters that successfully completed the upgrade, you can add an Abort action to your plan.
The Abort action moves the ImageBasedUpgrade CR back to the Idle stage, which cancels the upgrade on clusters that are not yet upgraded.
Adding a separate Abort action ensures that TALM does not perform the Abort action on clusters that have the lcm.openshift.io/ibgu-upgrade-completed label.
TALM removes the cluster labels after successfully canceling or finalizing the upgrade.
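The following fragment sketches how an ImageBasedGroupUpgrade CR might select clusters by a TALM result label instead of by name. It rests on two assumptions that you should verify against your own clusters: that clusterLabelSelectors accepts the standard Kubernetes Exists operator, and that <stage> is replaced with the lowercase stage name, following the pattern of the lcm.openshift.io/ibgu-upgrade-completed label.

```yaml
spec:
  clusterLabelSelectors:
    - matchExpressions:
        # Assumed label name: <stage> replaced with "prep".
        # Matches clusters that carry the label, whatever its value.
        - key: lcm.openshift.io/ibgu-prep-failed
          operator: Exists
```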
Status monitoring
The ImageBasedGroupUpgrade CR aggregates status reporting for all selected clusters in one place.
You can monitor the following status fields:
- status.clusters.completedActions: Shows all completed actions defined in the plan section.
- status.clusters.currentAction: Shows all actions that are currently in progress.
- status.clusters.failedActions: Shows all failed actions along with a detailed error message.