Two-node OpenShift cluster with fencing is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process. For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.
Use the following sections to help you recover from issues in a two-node OpenShift cluster with fencing.
You might need to perform manual recovery steps if a disruption event prevents fencing from functioning correctly. In this case, you can run commands directly on the control plane nodes to recover the cluster. There are four main recovery scenarios, which should be attempted in the following order:
Update fencing secrets: Refresh the Baseboard Management Console (BMC) credentials if they are incorrect or outdated.
Recover from a single-node failure: Restore functionality when only one control plane node is down.
Recover from a complete node failure: Restore functionality when both control plane nodes are down.
Replace a control plane node that cannot be recovered: Replace the node to restore cluster functionality.
You have administrative access to the control plane nodes.
You can connect to the nodes by using SSH.
Back up etcd before proceeding to ensure that you can restore the cluster if any issues occur.
Update the fencing secrets:
If the Cluster API is unavailable, update the fencing secret by running the following command on one of the cluster nodes:
$ sudo pcs stonith update <node_name>_redfish username=<user_name> password=<password>
After the Cluster API recovers, or if the Cluster API is already available, update the fencing secret in the cluster to ensure that it stays in sync, as described in the following step.
Edit the username and password for the existing fencing secret for the control plane node by running the following commands:
$ oc project openshift-etcd
$ oc edit secret <node_name>-fencing
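If you prefer not to edit the secret interactively, you can update the same username and password keys non-interactively. This is a hedged alternative sketch that assumes the oc set data subcommand is available in your oc client:
$ oc set data secret/<node_name>-fencing -n openshift-etcd username=<user_name> password=<password>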
If the cluster recovers after updating the fencing secrets, no further action is required. If the issue persists, proceed to the next step.
Recover from a single-node failure:
Gather initial diagnostics by running the following command:
$ sudo pcs status --full
This command provides a detailed view of the current cluster and resource states. You can use the output to identify issues with fencing or etcd startup.
Run the following additional diagnostic commands, if necessary:
Reset the resources on your cluster and instruct Pacemaker to attempt to start them fresh by running the following command:
$ sudo pcs resource cleanup
Review all Pacemaker activity on the node by running the following command:
$ sudo journalctl -u pacemaker
Diagnose etcd resource startup issues by running the following command:
$ sudo journalctl -u pacemaker | grep podman-etcd
View the fencing configuration for the node by running the following command:
$ sudo pcs stonith config <node_name>_redfish
If fencing is required but is not functioning, ensure that the Redfish fencing endpoint is accessible and verify that the credentials are correct.
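One way to check that the Redfish endpoint is reachable from the node is to query its service root with curl. This is an illustrative sketch; the BMC address and credentials are placeholders for your environment:
$ curl -k -u <user_name>:<password> https://<bmc_address>/redfish/v1/Systems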
If etcd is not starting despite fencing being operational, restore etcd from a backup by running the following commands:
$ sudo cp -r /var/lib/etcd-backup/* /var/lib/etcd/
$ sudo chown -R etcd:etcd /var/lib/etcd
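After you restore the data directory, you can prompt Pacemaker to retry starting the etcd resource, for example:
$ sudo pcs resource cleanup etcd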
If the recovery is successful, no further action is required. If the issue persists, proceed to the next step.
Recover from a complete node failure:
Power on both control plane nodes.
Pacemaker starts automatically and begins the recovery operation when it detects both nodes are online. If the recovery does not start as expected, use the diagnostic commands described in the previous step to investigate the issue.
Reset the resources on your cluster and instruct Pacemaker to attempt to start them fresh by running the following command:
$ sudo pcs resource cleanup
Check resource start order by running the following command:
$ sudo pcs status --full
If the kubelet fails, inspect the Pacemaker and kubelet service journals by running the following commands:
$ sudo journalctl -u pacemaker
$ sudo journalctl -u kubelet
Handle out-of-sync etcd:
If one node has a more up-to-date etcd, Pacemaker attempts to fence the lagging node and start it as a learner. If this process stalls, verify the Redfish fencing endpoint and credentials by running the following command:
$ sudo pcs stonith config
If the recovery is successful, no further action is required. If the issue persists, perform manual recovery as described in the next step.
If you need to manually recover from an event when one of the nodes is not recoverable, follow the procedure in "Replacing control plane nodes in a two-node OpenShift cluster".
When a cluster loses a single node, it enters degraded mode. In this state, Pacemaker automatically unblocks quorum and allows the cluster to temporarily operate on the remaining node.
If both nodes fail, you must restart both nodes to reestablish quorum so that Pacemaker can resume normal cluster operations.
If only one of the two nodes can be restarted, follow the node replacement procedure to manually reestablish quorum on the surviving node.
If manual recovery is still required and it fails, collect a must-gather and SOS report, and file a bug.
For information about verifying that both control plane nodes and etcd are operating correctly, see "Verifying etcd health in a two-node OpenShift cluster with fencing".
You can replace a failed control plane node in a two-node OpenShift cluster. The replacement node must use the same host name and IP address as the failed node.
You have a functioning survivor control plane node.
You have verified that either the machine is not running or the node is not ready.
You have access to the cluster as a user with the cluster-admin role.
You know the host name and IP address of the failed node.
Back up etcd before proceeding to ensure that you can restore the cluster if any issues occur.
Check the quorum state by running the following command:
$ sudo pcs quorum status
Quorum information
------------------
Date: Fri Oct 3 14:15:31 2025
Quorum provider: corosync_votequorum
Nodes: 2
Node ID: 1
Ring ID: 1.16
Quorate: Yes
Votequorum information
----------------------
Expected votes: 2
Highest expected: 2
Total votes: 2
Quorum: 1
Flags: 2Node Quorate WaitForAll
Membership information
----------------------
Nodeid Votes Qdevice Name
1 1 NR master-0 (local)
2 1 NR master-1
If quorum is lost and one control plane node is still running, restore quorum manually on the survivor node by running the following command:
$ sudo pcs quorum unblock
If only one node failed, verify that etcd is running on the survivor node by running the following command:
$ sudo pcs resource status etcd
If etcd is not running, restart etcd by running the following command:
$ sudo pcs resource cleanup etcd
If etcd still does not start, force it manually on the survivor node, skipping fencing:
Before running these commands, ensure that the node being replaced is inaccessible. Otherwise, you risk etcd corruption.
$ sudo pcs resource debug-stop etcd
$ sudo OCF_RESKEY_CRM_meta_notify_start_resource='etcd' pcs resource debug-start etcd
After recovery, etcd must be running successfully on the survivor node.
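One way to confirm that the etcd container itself is up is to check podman directly on the survivor node. This is a hedged check; the container name filter is an assumption about how the podman-etcd resource agent names the container:
$ sudo podman ps --filter name=etcd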
Delete etcd secrets for the failed node by running the following commands:
$ oc project openshift-etcd
$ oc delete secret etcd-peer-<node_name>
$ oc delete secret etcd-serving-<node_name>
$ oc delete secret etcd-serving-metrics-<node_name>
To replace the failed node, you must delete its etcd secrets first. When etcd is running, it might take some time for the API server to respond to these commands.
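To confirm that the secrets for the failed node were removed, you can list the remaining etcd secrets and filter by the node name, for example:
$ oc get secrets -n openshift-etcd | grep <node_name>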
Delete resources for the failed node:
If you have the BareMetalHost (BMH) objects, list them to identify the host that you are replacing by running the following command:
$ oc get bmh -n openshift-machine-api
Delete the BMH object for the failed node by running the following command:
$ oc delete bmh/<bmh_name> -n openshift-machine-api
List the Machine objects to identify the object that maps to the node that you are replacing by running the following command:
$ oc get machines.machine.openshift.io -n openshift-machine-api
Get the label with the machine hash value from the Machine object by running the following command:
$ oc get machines.machine.openshift.io/<machine_name> -n openshift-machine-api \
-o jsonpath='Machine hash label: {.metadata.labels.machine\.openshift\.io/cluster-api-cluster}{"\n"}'
Replace <machine_name> with the name of a Machine object in your cluster. For example, ostest-bfs7w-ctrlplane-0. You need this label to provision a new Machine object.
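If you want to capture the label value for reuse when you create the replacement Machine object, you can store it in a shell variable. This is a hypothetical helper, not part of the documented procedure:
$ MACHINE_HASH=$(oc get machines.machine.openshift.io/<machine_name> -n openshift-machine-api \
    -o jsonpath='{.metadata.labels.machine\.openshift\.io/cluster-api-cluster}')
$ echo "${MACHINE_HASH}"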
Delete the Machine object for the failed node by running the following command:
$ oc delete machines.machine.openshift.io/<machine_name>-<failed_node_name> -n openshift-machine-api
The node object is deleted automatically after you delete the Machine object.
Recreate the failed host by using the same name and IP address:
You must perform this step only if you are using installer-provisioned infrastructure or the Machine API to create the original node. For information about replacing a failed bare-metal control plane node, see "Replacing an unhealthy etcd member on bare metal".
Remove the BMH and Machine objects. The machine controller automatically deletes the node object.
Provision a new machine by using the following sample configuration:
Machine object configuration:
apiVersion: machine.openshift.io/v1beta1
kind: Machine
metadata:
annotations:
metal3.io/BareMetalHost: openshift-machine-api/{bmh_name}
finalizers:
- machine.machine.openshift.io
labels:
machine.openshift.io/cluster-api-cluster: {machine_hash_label}
machine.openshift.io/cluster-api-machine-role: master
machine.openshift.io/cluster-api-machine-type: master
name: {machine_name}
namespace: openshift-machine-api
spec:
authoritativeAPI: MachineAPI
metadata: {}
providerSpec:
value:
apiVersion: baremetal.cluster.k8s.io/v1alpha1
customDeploy:
method: install_coreos
hostSelector: {}
image:
checksum: ""
url: ""
kind: BareMetalMachineProviderSpec
metadata:
creationTimestamp: null
userData:
name: master-user-data-managed
metadata.annotations.metal3.io/BareMetalHost: Replace {bmh_name} with the name of the BMH object that is associated with the host that you are replacing.
labels.machine.openshift.io/cluster-api-cluster: Replace {machine_hash_label} with the label that you fetched from the machine you deleted.
metadata.name: Replace {machine_name} with the name of the machine you deleted.
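After you substitute the placeholders, apply the manifest as you would any other resource. For example, assuming that you saved the configuration as new-master-machine.yaml (a hypothetical file name):
$ oc apply -f new-master-machine.yaml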
Create the new BMH object and the secret to store the BMC credentials by running the following command:
$ cat <<EOF | oc apply -f -
apiVersion: v1
kind: Secret
metadata:
name: <secret_name>
namespace: openshift-machine-api
data:
password: <password>
username: <username>
type: Opaque
---
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
name: {bmh_name}
namespace: openshift-machine-api
spec:
automatedCleaningMode: disabled
bmc:
address: <redfish_url>/{uuid}
credentialsName: <name>
disableCertificateVerification: true
bootMACAddress: {boot_mac_address}
bootMode: UEFI
externallyProvisioned: false
online: true
rootDeviceHints:
deviceName: /dev/disk/by-id/scsi-<serial_number>
userData:
name: master-user-data-managed
namespace: openshift-machine-api
EOF
metadata.name: Specify the name of the secret.
metadata.name: Replace {bmh_name} with the name of the BMH object that you deleted.
bmc.address: Replace {uuid} with the UUID of the node that you created.
bmc.credentialsName: Replace <name> with the name of the secret that you created.
bootMACAddress: Specify the MAC address of the provisioning network interface. This is the MAC address the node uses to identify itself when communicating with Ironic during provisioning.
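The data fields of the Secret object must contain base64-encoded values. A minimal sketch for producing the encoded values, assuming example credentials:
$ echo -n '<username>' | base64
$ echo -n '<password>' | base64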
Verify that the new node has reached the Provisioned state by running the following command:
$ oc get bmh -o wide
The value of the STATUS column in the output of this command must be Provisioned.
The provisioning process can take 10 to 20 minutes to complete.
Verify that both control plane nodes are in the Ready state by running the following command:
$ oc get nodes
The value of the STATUS column in the output of this command must be Ready for both nodes.
Apply the detached annotation to the BMH object to prevent the Machine API from managing it by running the following command:
$ oc annotate bmh <bmh_name> -n openshift-machine-api baremetalhost.metal3.io/detached='' --overwrite
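You can confirm that the annotation was applied by inspecting the annotations on the object, for example:
$ oc get bmh <bmh_name> -n openshift-machine-api -o jsonpath='{.metadata.annotations}'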
Rejoin the replacement node to the Pacemaker cluster by running the following commands:
Run the following commands on the survivor control plane node, not on the node that is being replaced.
$ sudo pcs cluster node remove <node_name>
$ sudo pcs cluster node add <node_name> addr=<node_ip> --start --enable
Delete stale jobs for the failed node by running the following commands:
$ oc project openshift-etcd
$ oc delete job tnf-auth-job-<node_name>
$ oc delete job tnf-after-setup-job-<node_name>
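If you are unsure which jobs remain for the failed node, you can list them first. The grep filter is an illustrative assumption about the job naming:
$ oc get jobs -n openshift-etcd | grep <node_name>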
For information about verifying that both control plane nodes and etcd are operating correctly, see "Verifying etcd health in a two-node OpenShift cluster with fencing".
After completing node recovery or maintenance procedures, verify that both control plane nodes and etcd are operating correctly.
You have access to the cluster as a user with cluster-admin privileges.
You can access at least one control plane node through SSH.
Check the overall node status by running the following command:
$ oc get nodes
This command verifies that both control plane nodes are in the Ready state, indicating that they can receive workloads for scheduling.
Verify the status of the cluster-etcd-operator by running the following command:
$ oc describe co/etcd
The cluster-etcd-operator manages and reports on the health of your etcd setup. Reviewing its status helps you identify any ongoing issues or degraded conditions.
Review the etcd member list by running the following command:
$ oc rsh -n openshift-etcd <etcd_pod> etcdctl member list -w table
This command shows the current etcd members and their roles. Look for any nodes marked as learner, which indicates that they are in the process of becoming voting members.
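If you need to identify an etcd pod name to use with this command, you can list the pods in the openshift-etcd namespace. The app=etcd label selector is an assumption about the pod labels in your cluster:
$ oc get pods -n openshift-etcd -l app=etcd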
Review the Pacemaker resource status by running the following command on either control plane node:
$ sudo pcs status --full
This command provides a detailed overview of all resources managed by Pacemaker. You must ensure that the following conditions are met:
Both nodes are online.
The kubelet and etcd resources are running.
Fencing is correctly configured for both nodes.