In a highly available control plane, three etcd pods run as part of a stateful set in an etcd cluster. To recover an etcd cluster, first identify unhealthy etcd pods by checking the etcd cluster health.
You can check the health of the etcd cluster by logging in to any etcd pod.
Log in to an etcd pod by entering the following command:
$ oc rsh -n <hosted_control_plane_namespace> -c etcd <etcd_pod_name>
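If you are not sure of the etcd pod name, you can list the etcd pods first by using the same label selector that is used later in this procedure, for example:
$ oc get pods -l app=etcd -n <hosted_control_plane_namespace>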
Print the health status of an etcd cluster by entering the following command:
sh-4.4$ etcdctl endpoint health --cluster -w table
ENDPOINT                                                 HEALTH   TOOK         ERROR
https://etcd-0.etcd-discovery.clusters-hosted.svc:2379   true     9.117698ms
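If a member reports an error or is missing from the output, you can gather more detail from inside the same pod. As a minimal sketch, assuming the etcdctl environment that the pod already provides, the following commands list the cluster members and the status of each endpoint:
sh-4.4$ etcdctl member list -w table
sh-4.4$ etcdctl endpoint status --cluster -w table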
Each etcd pod of a 3-node cluster has its own persistent volume claim (PVC) to store its data. An etcd pod might fail because of corrupted or missing data. You can recover a failing etcd pod and its PVC.
To confirm that the etcd pod is failing, enter the following command:
$ oc get pods -l app=etcd -n <hosted_control_plane_namespace>
NAME     READY   STATUS             RESTARTS     AGE
etcd-0   2/2     Running            0            64m
etcd-1   2/2     Running            0            45m
etcd-2   1/2     CrashLoopBackOff   1 (5s ago)   64m
The failing etcd pod might have the CrashLoopBackOff or Error status.
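Before you delete the failing pod, identify the PVC that belongs to it. As a sketch, the following command lists the PVCs in the namespace; this assumes the usual stateful set naming pattern of <claim_template_name>-<pod_name>, so the claim for etcd-2 typically has a name such as data-etcd-2, but confirm the actual name in your environment:
$ oc get pvc -n <hosted_control_plane_namespace>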
Delete the failing pod and its PVC by entering the following command:
$ oc delete pvc/<etcd_pvc_name> pod/<etcd_pod_name> --wait=false
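For example, if etcd-2 is the failing pod and its claim is named data-etcd-2 (a hypothetical name; use the PVC name that you identified in your namespace), the command might look like this:
$ oc delete pvc/data-etcd-2 pod/etcd-2 --wait=false -n <hosted_control_plane_namespace>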
Verify that a new etcd pod is up and running by entering the following command:
$ oc get pods -l app=etcd -n <hosted_control_plane_namespace>
NAME     READY   STATUS    RESTARTS   AGE
etcd-0   2/2     Running   0          67m
etcd-1   2/2     Running   0          48m
etcd-2   2/2     Running   0          2m2s
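Optionally, you can confirm from inside one of the running etcd pods that the recovered member has rejoined the cluster and is healthy, for example by repeating the health check from the beginning of this procedure:
sh-4.4$ etcdctl endpoint health --cluster -w table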