After successfully deploying an installer-provisioned cluster, consider the following postinstallation procedures.
OKD installs the chrony Network Time Protocol (NTP) service on the cluster nodes.
Use the following procedure to configure NTP servers on the control plane nodes and configure compute nodes as NTP clients of the control plane nodes after a successful deployment.
OKD nodes must agree on a date and time to run properly. Having compute nodes retrieve the date and time from the NTP servers on the control plane nodes lets you install and operate clusters that are not connected to a routable network and therefore cannot reach a higher-stratum NTP server.
Install Butane on your installation host by using the following command:
$ sudo dnf -y install butane
Create a Butane config, 99-master-chrony-conf-override.bu, including the contents of the chrony.conf file for the control plane nodes.
See "Creating machine configs with Butane" for information about Butane.
variant: openshift
version: 4.17.0
metadata:
  name: 99-master-chrony-conf-override
  labels:
    machineconfiguration.openshift.io/role: master
storage:
  files:
    - path: /etc/chrony.conf
      mode: 0644
      overwrite: true
      contents:
        inline: |
          # Use public servers from the pool.ntp.org project.
          # Please consider joining the pool (https://www.pool.ntp.org/join.html).
          # The Machine Config Operator manages this file
          server openshift-master-0.<cluster-name>.<domain> iburst (1)
          server openshift-master-1.<cluster-name>.<domain> iburst
          server openshift-master-2.<cluster-name>.<domain> iburst
          stratumweight 0
          driftfile /var/lib/chrony/drift
          rtcsync
          makestep 10 3
          bindcmdaddress 127.0.0.1
          bindcmdaddress ::1
          keyfile /etc/chrony.keys
          commandkey 1
          generatecommandkey
          noclientlog
          logchange 0.5
          logdir /var/log/chrony
          # Configure the control plane nodes to serve as local NTP servers
          # for all compute nodes, even if they are not in sync with an
          # upstream NTP server.
          # Allow NTP client access from the local network.
          allow all
          # Serve time even if not synchronized to a time source.
          local stratum 3 orphan
(1) You must replace <cluster-name> with the name of the cluster and replace <domain> with the fully qualified domain name.
Use Butane to generate a MachineConfig object file, 99-master-chrony-conf-override.yaml, containing the configuration to be delivered to the control plane nodes:
$ butane 99-master-chrony-conf-override.bu -o 99-master-chrony-conf-override.yaml
Create a Butane config, 99-worker-chrony-conf-override.bu, including the contents of the chrony.conf file for the compute nodes that references the NTP servers on the control plane nodes.
variant: openshift
version: 4.17.0
metadata:
  name: 99-worker-chrony-conf-override
  labels:
    machineconfiguration.openshift.io/role: worker
storage:
  files:
    - path: /etc/chrony.conf
      mode: 0644
      overwrite: true
      contents:
        inline: |
          # The Machine Config Operator manages this file.
          server openshift-master-0.<cluster-name>.<domain> iburst (1)
          server openshift-master-1.<cluster-name>.<domain> iburst
          server openshift-master-2.<cluster-name>.<domain> iburst
          stratumweight 0
          driftfile /var/lib/chrony/drift
          rtcsync
          makestep 10 3
          bindcmdaddress 127.0.0.1
          bindcmdaddress ::1
          keyfile /etc/chrony.keys
          commandkey 1
          generatecommandkey
          noclientlog
          logchange 0.5
          logdir /var/log/chrony
(1) You must replace <cluster-name> with the name of the cluster and replace <domain> with the fully qualified domain name.
Use Butane to generate a MachineConfig object file, 99-worker-chrony-conf-override.yaml, containing the configuration to be delivered to the compute nodes:
$ butane 99-worker-chrony-conf-override.bu -o 99-worker-chrony-conf-override.yaml
Apply the 99-master-chrony-conf-override.yaml policy to the control plane nodes.
$ oc apply -f 99-master-chrony-conf-override.yaml
machineconfig.machineconfiguration.openshift.io/99-master-chrony-conf-override created
Apply the 99-worker-chrony-conf-override.yaml policy to the compute nodes.
$ oc apply -f 99-worker-chrony-conf-override.yaml
machineconfig.machineconfiguration.openshift.io/99-worker-chrony-conf-override created
Check the status of the applied NTP settings.
$ oc describe machineconfigpool
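The server directives in both Butane configs follow one naming pattern, so they can be rendered from the cluster name and domain instead of being edited by hand. The following Python sketch illustrates the idea. The function name, the default of three control plane nodes, and the sample values are assumptions for illustration only.

```python
def chrony_server_lines(cluster_name: str, domain: str, count: int = 3) -> list[str]:
    """Render one chrony `server` directive per control plane node.

    Assumes the openshift-master-<n>.<cluster-name>.<domain> host naming
    used in the example Butane configs.
    """
    return [
        f"server openshift-master-{index}.{cluster_name}.{domain} iburst"
        for index in range(count)
    ]


if __name__ == "__main__":
    # "ostest" and "example.com" are placeholder values, not defaults.
    for line in chrony_server_lines("ostest", "example.com"):
        print(line)
```

Paste the rendered lines into the inline chrony.conf contents of each Butane config before running butane.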
The assisted installer and installer-provisioned installation for bare metal clusters provide the ability to deploy a cluster without a provisioning network. This capability is for scenarios such as proof-of-concept clusters or deploying exclusively with Redfish virtual media when each node’s baseboard management controller is routable via the baremetal network.
You can enable a provisioning network after installation using the Cluster Baremetal Operator (CBO).
A dedicated physical network must exist, connected to all worker and control plane nodes.
You must isolate the native, untagged physical network.
The network cannot have a DHCP server when the provisioningNetwork configuration setting is set to Managed.
You can omit the provisioningInterface setting in OKD 4.10 to use the bootMACAddress configuration setting.
When setting the provisioningInterface setting, first identify the provisioning interface name for the cluster nodes. For example, eth0 or eno1.
Enable the Preboot eXecution Environment (PXE) on the provisioning network interface of the cluster nodes.
Retrieve the current state of the provisioning network and save it to a provisioning custom resource (CR) file:
$ oc get provisioning -o yaml > enable-provisioning-nw.yaml
Modify the provisioning CR file:
$ vim enable-provisioning-nw.yaml
Scroll down to the provisioningNetwork configuration setting and change it from Disabled to Managed. Then, add the provisioningIP, provisioningNetworkCIDR, provisioningDHCPRange, provisioningInterface, and watchAllNameSpaces configuration settings after the provisioningNetwork setting. Provide appropriate values for each setting.
apiVersion: v1
items:
- apiVersion: metal3.io/v1alpha1
  kind: Provisioning
  metadata:
    name: provisioning-configuration
  spec:
    provisioningNetwork: (1)
    provisioningIP: (2)
    provisioningNetworkCIDR: (3)
    provisioningDHCPRange: (4)
    provisioningInterface: (5)
    watchAllNameSpaces: (6)
(1) The provisioningNetwork is one of Managed, Unmanaged, or Disabled. When set to Managed, Metal3 manages the provisioning network and the CBO deploys the Metal3 pod with a configured DHCP server. When set to Unmanaged, the system administrator configures the DHCP server manually.
(2) The provisioningIP is the static IP address that the DHCP server and ironic use to provision the network. This static IP address must be within the provisioning subnet, and outside of the DHCP range. If you configure this setting, it must have a valid IP address even if the provisioning network is Disabled. The static IP address is bound to the metal3 pod. If the metal3 pod fails and moves to another server, the static IP address also moves to the new server.
(3) The Classless Inter-Domain Routing (CIDR) address. If you configure this setting, it must have a valid CIDR address even if the provisioning network is Disabled. For example: 192.168.0.1/24.
(4) The DHCP range. This setting is only applicable to a Managed provisioning network. Omit this configuration setting if the provisioning network is Disabled. For example: 192.168.0.64, 192.168.0.253.
(5) The NIC name for the provisioning interface on cluster nodes. The provisioningInterface setting is only applicable to Managed and Unmanaged provisioning networks. Omit the provisioningInterface configuration setting if the provisioning network is Disabled. Omit the provisioningInterface configuration setting to use the bootMACAddress configuration setting instead.
(6) Set this setting to true if you want metal3 to watch namespaces other than the default openshift-machine-api namespace. The default value is false.
Save the changes to the provisioning CR file.
Apply the provisioning CR file to the cluster:
$ oc apply -f enable-provisioning-nw.yaml
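The provisioningIP must sit inside the provisioning subnet and outside the DHCP range, which is easy to get wrong when editing the CR by hand. The following Python sketch checks those two rules before you apply the file. The function name is hypothetical, and the comma-separated range format is an assumption taken from the example value in the callouts.

```python
import ipaddress


def check_provisioning(ip: str, cidr: str, dhcp_range: str) -> list[str]:
    """Return a list of problems; an empty list means the values are consistent."""
    problems = []
    # strict=False accepts a CIDR with host bits set, such as 192.168.0.1/24.
    network = ipaddress.ip_network(cidr, strict=False)
    address = ipaddress.ip_address(ip)
    # The range is written as "<first>, <last>"; surrounding spaces are ignored.
    start, end = (ipaddress.ip_address(part.strip()) for part in dhcp_range.split(","))

    if address not in network:
        problems.append(f"{ip} is outside the provisioning subnet {network}")
    if start <= address <= end:
        problems.append(f"{ip} is inside the DHCP range {start} to {end}")
    return problems


if __name__ == "__main__":
    # 192.168.0.3 is a placeholder provisioningIP chosen for this example.
    print(check_provisioning("192.168.0.3", "192.168.0.1/24", "192.168.0.64, 192.168.0.253"))
```

The sketch covers only the two constraints stated for provisioningIP and does not validate the other settings.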
As an alternative to using the configure-ovs.sh shell script to set a customized br-ex bridge on a bare-metal platform, you can create a NodeNetworkConfigurationPolicy custom resource (CR) that includes a customized br-ex bridge network configuration.
For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.
This feature supports the following tasks:
Modifying the maximum transmission unit (MTU) for your cluster.
Modifying attributes of a different bond interface, such as MIImon (Media Independent Interface Monitor), bonding mode, or Quality of Service (QoS).
Updating DNS values.
Consider the following use cases for creating a manifest object that includes a customized br-ex bridge:
You want to make postinstallation changes to the bridge, such as changing the Open vSwitch (OVS) or OVN-Kubernetes br-ex bridge network. The configure-ovs.sh shell script does not support making postinstallation changes to the bridge.
You want to deploy the bridge on a different interface than the interface available on a host or server IP address.
You want to make advanced configurations to the bridge that are not possible with the configure-ovs.sh shell script. Using the script for these configurations might result in the bridge failing to connect multiple network interfaces and to forward data between them.
You set a customized br-ex by using the alternative method to configure-ovs.
You installed the Kubernetes NMState Operator.
Create a NodeNetworkConfigurationPolicy (NNCP) CR and define a customized br-ex bridge network configuration. Depending on your needs, ensure that you set a masquerade IP for either the ipv4.address.ip, ipv6.address.ip, or both parameters. A masquerade IP address must match an in-use IP address block.
As a post-installation task, you can configure most parameters for a customized |
apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: worker-0-br-ex (1)
spec:
  nodeSelector:
    kubernetes.io/hostname: worker-0
  desiredState:
    interfaces:
    - name: enp2s0 (2)
      type: ethernet (3)
      state: up (4)
      ipv4:
        enabled: false (5)
      ipv6:
        enabled: false
    - name: br-ex
      type: ovs-bridge
      state: up
      ipv4:
        enabled: false
        dhcp: false
      ipv6:
        enabled: false
        dhcp: false
      bridge:
        port:
        - name: enp2s0 (6)
        - name: br-ex
    - name: br-ex
      type: ovs-interface
      state: up
      copy-mac-from: enp2s0
      ipv4:
        enabled: true
        dhcp: true
        address:
        - ip: "169.254.169.2"
          prefix-length: 29
      ipv6:
        enabled: false
        dhcp: false
        address:
        - ip: "fd69::2"
          prefix-length: 125
(1) Name of the policy.
(2) Name of the interface.
(3) The type of the interface, which is ethernet in this example.
(4) The requested state for the interface after creation.
(5) Disables IPv4 and IPv6 in this example.
(6) The node NIC to which the bridge is attached.
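To see which address block a masquerade IP belongs to, you can combine the ip and prefix-length values from the policy. The following Python sketch does this for the two example addresses. The function name is hypothetical, and the sketch only computes the block. It does not query the cluster for the block that is actually in use.

```python
import ipaddress


def address_block(ip: str, prefix_length: int) -> str:
    """Return the network block that contains the given address."""
    interface = ipaddress.ip_interface(f"{ip}/{prefix_length}")
    return str(interface.network)


if __name__ == "__main__":
    # Values copied from the example policy.
    print(address_block("169.254.169.2", 29))
    print(address_block("fd69::2", 125))
```

Compare the result with the masquerade subnet configured for your cluster network before you apply the policy.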
You can configure an OKD cluster to use a user-managed load balancer in place of the default load balancer.
Configuring a user-managed load balancer depends on your vendor’s load balancer. The information and examples in this section are for guideline purposes only. Consult the vendor documentation for more specific information about the vendor’s load balancer.
Red Hat supports the following services for a user-managed load balancer:
Ingress Controller
OpenShift API
OpenShift MachineConfig API
You can choose whether you want to configure one or all of these services for a user-managed load balancer. Configuring only the Ingress Controller service is a common configuration option. To better understand each service, view the following diagrams:
The following configuration options are supported for user-managed load balancers:
Use a node selector to map the Ingress Controller to a specific set of nodes. You must assign a static IP address to each node in this set, or configure each node to receive the same IP address from the Dynamic Host Configuration Protocol (DHCP). Infrastructure nodes commonly receive this type of configuration.
Target all IP addresses on a subnet. This configuration can reduce maintenance overhead, because you can create and destroy nodes within those networks without reconfiguring the load balancer targets. If you deploy your ingress pods by using a machine set on a smaller network, such as a /27 or /28, you can simplify your load balancer targets.
You can list all IP addresses that exist in a network by checking the machine config pool’s resources.
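When you target a whole subnet, it helps to know exactly which addresses the load balancer must cover. The following Python sketch enumerates the usable host addresses of a small network. The function name and the 192.168.1.96 subnet are assumptions chosen for illustration.

```python
import ipaddress


def usable_targets(subnet: str) -> list[str]:
    """List the usable host addresses in a subnet, excluding network and broadcast."""
    return [str(host) for host in ipaddress.ip_network(subnet).hosts()]


if __name__ == "__main__":
    for subnet in ("192.168.1.96/27", "192.168.1.96/28"):
        targets = usable_targets(subnet)
        print(f"{subnet}: {len(targets)} targets, {targets[0]} to {targets[-1]}")
```

A /27 yields 30 usable addresses and a /28 yields 14, which keeps the back-end target list short.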
Before you configure a user-managed load balancer for your OKD cluster, consider the following information:
For a front-end IP address, you can use the same IP address for the front-end IP address, the Ingress Controller’s load balancer, and API load balancer. Check the vendor’s documentation for this capability.
For a back-end IP address, ensure that an IP address for an OKD control plane node does not change during the lifetime of the user-managed load balancer. You can achieve this by completing one of the following actions:
Assign a static IP address to each control plane node.
Configure each node to receive the same IP address from the DHCP every time the node requests a DHCP lease. Depending on the vendor, the DHCP lease might be in the form of an IP reservation or a static DHCP assignment.
Manually define each node that runs the Ingress Controller in the user-managed load balancer for the Ingress Controller back-end service. For example, if the Ingress Controller moves to an undefined node, a connection outage can occur.
You can configure an OKD cluster to use a user-managed load balancer in place of the default load balancer.
Before you configure a user-managed load balancer, ensure that you read the "Services for a user-managed load balancer" section.
Read the following prerequisites that apply to the service that you want to configure for your user-managed load balancer.
MetalLB, which runs on a cluster, functions as a user-managed load balancer.
You defined a front-end IP address.
TCP ports 6443 and 22623 are exposed on the front-end IP address of your load balancer. Check the following items:
Port 6443 provides access to the OpenShift API service.
Port 22623 can provide ignition startup configurations to nodes.
The front-end IP address and port 6443 are reachable by all users of your system with a location external to your OKD cluster.
The front-end IP address and port 22623 are reachable only by OKD nodes.
The load balancer backend can communicate with OKD control plane nodes on ports 6443 and 22623.
You defined a front-end IP address.
TCP ports 443 and 80 are exposed on the front-end IP address of your load balancer.
The front-end IP address, port 80, and port 443 are reachable by all users of your system with a location external to your OKD cluster.
The front-end IP address, port 80 and port 443 are reachable to all nodes that operate in your OKD cluster.
The load balancer backend can communicate with OKD nodes that run the Ingress Controller on ports 80, 443, and 1936.
You can configure most load balancers by setting health check URLs that determine if a service is available or unavailable. OKD provides these health checks for the OpenShift API, Machine Configuration API, and Ingress Controller backend services.
The following examples show health check specifications for the previously listed backend services:
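Example of an OpenShift API health check specification
Path: HTTPS:6443/readyz
Healthy threshold: 2
Unhealthy threshold: 2
Timeout: 10
Interval: 10
Example of a Machine Configuration API health check specification
Path: HTTPS:22623/healthz
Healthy threshold: 2
Unhealthy threshold: 2
Timeout: 10
Interval: 10
Example of an Ingress Controller health check specification
Path: HTTP:1936/healthz/ready
Healthy threshold: 2
Unhealthy threshold: 2
Timeout: 5
Interval: 10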
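The healthy and unhealthy thresholds describe how many consecutive probe results are needed before a back end changes state. The following Python sketch models that behavior in a generic way. The class name is hypothetical, and vendors might count probes differently, so treat it as an illustration rather than a description of any specific load balancer.

```python
class HealthTracker:
    """Track back-end health by using consecutive-probe thresholds."""

    def __init__(self, healthy_threshold: int = 2, unhealthy_threshold: int = 2,
                 healthy: bool = False) -> None:
        self.healthy_threshold = healthy_threshold
        self.unhealthy_threshold = unhealthy_threshold
        self.healthy = healthy
        self._streak = 0  # consecutive results that disagree with the current state

    def record(self, probe_ok: bool) -> bool:
        """Record one probe result and return the resulting health state."""
        if probe_ok == self.healthy:
            # The probe confirms the current state, so any pending change is dropped.
            self._streak = 0
        else:
            self._streak += 1
            needed = self.healthy_threshold if probe_ok else self.unhealthy_threshold
            if self._streak >= needed:
                self.healthy = probe_ok
                self._streak = 0
        return self.healthy


if __name__ == "__main__":
    tracker = HealthTracker()
    print([tracker.record(result) for result in (True, True, False, True, False, False)])
```

With both thresholds set to 2, a single failed probe between successful probes does not remove a back end from rotation.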
Configure the HAProxy Ingress Controller, so that you can enable access to the cluster from your load balancer on ports 6443, 22623, 443, and 80. Depending on your needs, you can specify the IP address of a single subnet or IP addresses from multiple subnets in your HAProxy configuration.
# ...
listen my-cluster-api-6443
bind 192.168.1.100:6443
mode tcp
balance roundrobin
option httpchk
http-check connect
http-check send meth GET uri /readyz
http-check expect status 200
server my-cluster-master-2 192.168.1.101:6443 check inter 10s rise 2 fall 2
server my-cluster-master-0 192.168.1.102:6443 check inter 10s rise 2 fall 2
server my-cluster-master-1 192.168.1.103:6443 check inter 10s rise 2 fall 2
listen my-cluster-machine-config-api-22623
bind 192.168.1.100:22623
mode tcp
balance roundrobin
option httpchk
http-check connect
http-check send meth GET uri /healthz
http-check expect status 200
server my-cluster-master-2 192.168.1.101:22623 check inter 10s rise 2 fall 2
server my-cluster-master-0 192.168.1.102:22623 check inter 10s rise 2 fall 2
server my-cluster-master-1 192.168.1.103:22623 check inter 10s rise 2 fall 2
listen my-cluster-apps-443
bind 192.168.1.100:443
mode tcp
balance roundrobin
option httpchk
http-check connect
http-check send meth GET uri /healthz/ready
http-check expect status 200
server my-cluster-worker-0 192.168.1.111:443 check port 1936 inter 10s rise 2 fall 2
server my-cluster-worker-1 192.168.1.112:443 check port 1936 inter 10s rise 2 fall 2
server my-cluster-worker-2 192.168.1.113:443 check port 1936 inter 10s rise 2 fall 2
listen my-cluster-apps-80
bind 192.168.1.100:80
mode tcp
balance roundrobin
option httpchk
http-check connect
http-check send meth GET uri /healthz/ready
http-check expect status 200
server my-cluster-worker-0 192.168.1.111:80 check port 1936 inter 10s rise 2 fall 2
server my-cluster-worker-1 192.168.1.112:80 check port 1936 inter 10s rise 2 fall 2
server my-cluster-worker-2 192.168.1.113:80 check port 1936 inter 10s rise 2 fall 2
# ...
# ...
listen api-server-6443
bind *:6443
mode tcp
server master-00 192.168.83.89:6443 check inter 1s
server master-01 192.168.84.90:6443 check inter 1s
server master-02 192.168.85.99:6443 check inter 1s
server bootstrap 192.168.80.89:6443 check inter 1s
listen machine-config-server-22623
bind *:22623
mode tcp
server master-00 192.168.83.89:22623 check inter 1s
server master-01 192.168.84.90:22623 check inter 1s
server master-02 192.168.85.99:22623 check inter 1s
server bootstrap 192.168.80.89:22623 check inter 1s
listen ingress-router-80
bind *:80
mode tcp
balance source
server worker-00 192.168.83.100:80 check inter 1s
server worker-01 192.168.83.101:80 check inter 1s
listen ingress-router-443
bind *:443
mode tcp
balance source
server worker-00 192.168.83.100:443 check inter 1s
server worker-01 192.168.83.101:443 check inter 1s
listen ironic-api-6385
bind *:6385
mode tcp
balance source
server master-00 192.168.83.89:6385 check inter 1s
server master-01 192.168.84.90:6385 check inter 1s
server master-02 192.168.85.99:6385 check inter 1s
server bootstrap 192.168.80.89:6385 check inter 1s
listen inspector-api-5050
bind *:5050
mode tcp
balance source
server master-00 192.168.83.89:5050 check inter 1s
server master-01 192.168.84.90:5050 check inter 1s
server master-02 192.168.85.99:5050 check inter 1s
server bootstrap 192.168.80.89:5050 check inter 1s
# ...
Use the curl CLI command to verify that the user-managed load balancer and its resources are operational:
Verify that the Kubernetes API server is accessible through the load balancer, by running the following command and observing the response:
$ curl https://<loadbalancer_ip_address>:6443/version --insecure
If the configuration is correct, you receive a JSON object in response:
{
"major": "1",
"minor": "11+",
"gitVersion": "v1.11.0+ad103ed",
"gitCommit": "ad103ed",
"gitTreeState": "clean",
"buildDate": "2019-01-09T06:44:10Z",
"goVersion": "go1.10.3",
"compiler": "gc",
"platform": "linux/amd64"
}
Verify that the machine config server is accessible through the load balancer, by running the following command and observing the output:
$ curl -v https://<loadbalancer_ip_address>:22623/healthz --insecure
If the configuration is correct, the output from the command shows the following response:
HTTP/1.1 200 OK
Content-Length: 0
Verify that the Ingress Controller is accessible through the load balancer on port 80, by running the following command and observing the output:
$ curl -I -L -H "Host: console-openshift-console.apps.<cluster_name>.<base_domain>" http://<load_balancer_front_end_IP_address>
If the configuration is correct, the output from the command shows the following response:
HTTP/1.1 302 Found
content-length: 0
location: https://console-openshift-console.apps.ocp4.private.opequon.net/
cache-control: no-cache
Verify that the Ingress Controller is accessible through the load balancer on port 443, by running the following command and observing the output:
$ curl -I -L --insecure --resolve console-openshift-console.apps.<cluster_name>.<base_domain>:443:<load_balancer_front_end_IP_address> https://console-openshift-console.apps.<cluster_name>.<base_domain>
If the configuration is correct, the output from the command shows the following response:
HTTP/1.1 200 OK
referrer-policy: strict-origin-when-cross-origin
set-cookie: csrf-token=UlYWOyQ62LWjw2h003xtYSKlh1a0Py2hhctw0WmV2YEdhJjFyQwWcGBsja261dGLgaYO0nxzVErhiXt6QepA7g==; Path=/; Secure; SameSite=Lax
x-content-type-options: nosniff
x-dns-prefetch-control: off
x-frame-options: DENY
x-xss-protection: 1; mode=block
date: Wed, 04 Oct 2023 16:29:38 GMT
content-type: text/html; charset=utf-8
set-cookie: 1e2670d92730b515ce3a1bb65da45062=1bf5e9573c9a2760c964ed1659cc1673; path=/; HttpOnly; Secure; SameSite=None
cache-control: private
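When you script these checks, the JSON object returned by the /version endpoint can be reduced to the fields that matter. The following Python sketch parses a response body such as the example shown earlier. The function name is hypothetical, and the sketch assumes that you already captured the body, for example from the curl command.

```python
import json


def describe_version(body: str) -> str:
    """Summarize the JSON body returned by the /version endpoint."""
    info = json.loads(body)
    return f"{info['gitVersion']} on {info['platform']}"


if __name__ == "__main__":
    # Trimmed copy of the example response; real clusters report other values.
    sample = '{"major": "1", "minor": "11+", "gitVersion": "v1.11.0+ad103ed", "platform": "linux/amd64"}'
    print(describe_version(sample))
```

A json.JSONDecodeError from this function usually means that the load balancer returned something other than the API response, such as an error page.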
Configure the DNS records for your cluster to target the front-end IP addresses of the user-managed load balancer. You must update the records on your DNS server for the cluster API and applications so that they point to the load balancer.
<load_balancer_ip_address> A api.<cluster_name>.<base_domain>
A record pointing to Load Balancer Front End
<load_balancer_ip_address> A apps.<cluster_name>.<base_domain>
A record pointing to Load Balancer Front End
DNS propagation might take some time before each DNS record becomes available. Ensure that each DNS record propagates before validating each record.
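One way to confirm propagation is to resolve each record and compare the answer with the load balancer address. The following Python sketch does this for the two records shown above. The function names are hypothetical, and the resolver is passed as a parameter so that you can replace the system resolver when you test the logic.

```python
import socket
from typing import Callable, Optional


def system_resolve(name: str) -> Optional[str]:
    """Resolve a host name with the system resolver; return None on failure."""
    try:
        return socket.gethostbyname(name)
    except OSError:
        return None


def unresolved_records(cluster_name: str, base_domain: str, lb_ip: str,
                       resolve: Callable[[str], Optional[str]] = system_resolve) -> list[str]:
    """Return the record names that do not yet resolve to the load balancer IP."""
    names = [
        f"api.{cluster_name}.{base_domain}",
        f"apps.{cluster_name}.{base_domain}",
    ]
    return [name for name in names if resolve(name) != lb_ip]


if __name__ == "__main__":
    # Placeholder values; replace them with your cluster name, domain, and IP.
    print(unresolved_records("ostest", "example.com", "192.168.1.100"))
```

An empty list indicates that both records resolve to the expected address from the host where you run the check.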
For your OKD cluster to use the user-managed load balancer, you must specify the following configuration in your cluster’s install-config.yaml file:
# ...
platform:
loadBalancer:
type: UserManaged (1)
apiVIPs:
- <api_ip> (2)
ingressVIPs:
- <ingress_ip> (3)
# ...
(1) Set UserManaged for the type parameter to specify a user-managed load balancer for your cluster. The parameter defaults to OpenShiftManagedDefault, which denotes the default internal load balancer. For services defined in an openshift-kni-infra namespace, a user-managed load balancer can deploy the coredns service to pods in your cluster but ignores keepalived and haproxy services.
(2) Required parameter when you specify a user-managed load balancer. Specify the user-managed load balancer’s public IP address, so that the Kubernetes API can communicate with the user-managed load balancer.
(3) Required parameter when you specify a user-managed load balancer. Specify the user-managed load balancer’s public IP address, so that the user-managed load balancer can manage ingress traffic for your cluster.
Use the curl CLI command to verify that the user-managed load balancer and DNS record configuration are operational:
Verify that you can access the cluster API, by running the following command and observing the output:
$ curl https://api.<cluster_name>.<base_domain>:6443/version --insecure
If the configuration is correct, you receive a JSON object in response:
{
"major": "1",
"minor": "11+",
"gitVersion": "v1.11.0+ad103ed",
"gitCommit": "ad103ed",
"gitTreeState": "clean",
"buildDate": "2019-01-09T06:44:10Z",
"goVersion": "go1.10.3",
"compiler": "gc",
"platform": "linux/amd64"
}
Verify that you can access the cluster machine configuration, by running the following command and observing the output:
$ curl -v https://api.<cluster_name>.<base_domain>:22623/healthz --insecure
If the configuration is correct, the output from the command shows the following response:
HTTP/1.1 200 OK
Content-Length: 0
Verify that you can access each cluster application on port 80, by running the following command and observing the output:
$ curl http://console-openshift-console.apps.<cluster_name>.<base_domain> -I -L --insecure
If the configuration is correct, the output from the command shows the following response:
HTTP/1.1 302 Found
content-length: 0
location: https://console-openshift-console.apps.<cluster-name>.<base domain>/
cache-control: no-cache
HTTP/1.1 200 OK
referrer-policy: strict-origin-when-cross-origin
set-cookie: csrf-token=39HoZgztDnzjJkq/JuLJMeoKNXlfiVv2YgZc09c3TBOBU4NI6kDXaJH1LdicNhN1UsQWzon4Dor9GWGfopaTEQ==; Path=/; Secure
x-content-type-options: nosniff
x-dns-prefetch-control: off
x-frame-options: DENY
x-xss-protection: 1; mode=block
date: Tue, 17 Nov 2020 08:42:10 GMT
content-type: text/html; charset=utf-8
set-cookie: 1e2670d92730b515ce3a1bb65da45062=9b714eb87e93cf34853e87a92d6894be; path=/; HttpOnly; Secure; SameSite=None
cache-control: private
Verify that you can access each cluster application on port 443, by running the following command and observing the output:
$ curl https://console-openshift-console.apps.<cluster_name>.<base_domain> -I -L --insecure
If the configuration is correct, the output from the command shows the following response:
HTTP/1.1 200 OK
referrer-policy: strict-origin-when-cross-origin
set-cookie: csrf-token=UlYWOyQ62LWjw2h003xtYSKlh1a0Py2hhctw0WmV2YEdhJjFyQwWcGBsja261dGLgaYO0nxzVErhiXt6QepA7g==; Path=/; Secure; SameSite=Lax
x-content-type-options: nosniff
x-dns-prefetch-control: off
x-frame-options: DENY
x-xss-protection: 1; mode=block
date: Wed, 04 Oct 2023 16:29:38 GMT
content-type: text/html; charset=utf-8
set-cookie: 1e2670d92730b515ce3a1bb65da45062=1bf5e9573c9a2760c964ed1659cc1673; path=/; HttpOnly; Secure; SameSite=None
cache-control: private
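Because curl -I -L prints the headers of every response in a redirect chain, the port 80 check shows a 302 response followed by a 200 response. The following Python sketch extracts the final status code from such output. The function name is hypothetical, and the sketch assumes that you captured the curl output as text.

```python
from typing import Optional


def final_status(curl_output: str) -> Optional[int]:
    """Return the status code of the last HTTP response in curl -I -L output."""
    status = None
    for line in curl_output.splitlines():
        if line.startswith("HTTP/"):
            parts = line.split()
            # A status line looks like "HTTP/1.1 200 OK" or "HTTP/2 200".
            if len(parts) >= 2 and parts[1].isdigit():
                status = int(parts[1])
    return status


if __name__ == "__main__":
    sample = (
        "HTTP/1.1 302 Found\n"
        "content-length: 0\n"
        "cache-control: no-cache\n"
        "\n"
        "HTTP/1.1 200 OK\n"
        "cache-control: private\n"
    )
    print(final_status(sample))
```

A final status of 200 after the redirect indicates that both the HTTP and HTTPS front ends reach the Ingress Controller.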
When deploying OKD on bare-metal hosts, there are times when you need to make changes to the host either before or after provisioning. This can include inspecting the host’s hardware and firmware details. It can also include formatting disks or changing modifiable firmware settings.
You can use the Bare Metal Operator (BMO) to provision, manage, and inspect bare-metal hosts in your cluster. The BMO can complete the following operations:
Provision bare-metal hosts to the cluster with a specific image.
Turn on or off a host.
Inspect hardware details of the host and report them to the bare-metal host.
Upgrade or downgrade a host’s firmware to a specific version.
Inspect firmware and configure BIOS settings.
Clean disk contents for the host before or after provisioning the host.
The BMO uses the following resources to complete these tasks:
BareMetalHost
HostFirmwareSettings
FirmwareSchema
HostFirmwareComponents
The BMO maintains an inventory of the physical hosts in the cluster by mapping each bare-metal host to an instance of the BareMetalHost custom resource definition. Each BareMetalHost resource features hardware, software, and firmware details. The BMO continually inspects the bare-metal hosts in the cluster to ensure each BareMetalHost resource accurately details the components of the corresponding host.
The BMO also uses the HostFirmwareSettings resource, the FirmwareSchema resource, and the HostFirmwareComponents resource to detail firmware specifications and upgrade or downgrade firmware for the bare-metal host.
The BMO interfaces with bare-metal hosts in the cluster by using the Ironic API service. The Ironic service uses the Baseboard Management Controller (BMC) on the host to interface with the machine.
The Bare Metal Operator (BMO) uses the following resources to provision, manage, and inspect bare-metal hosts in your cluster. The following diagram illustrates the architecture of these resources:
The BareMetalHost resource defines a physical host and its properties. When you provision a bare-metal host to the cluster, you must define a BareMetalHost resource for that host. For ongoing management of the host, you can inspect the information in the BareMetalHost or update this information.
The BareMetalHost resource features provisioning information such as the following:
Deployment specifications such as the operating system boot image or the custom RAM disk
Provisioning state
Baseboard Management Controller (BMC) address
Desired power state
The BareMetalHost resource features hardware information such as the following:
Number of CPUs
MAC address of a NIC
Size of the host’s storage device
Current power state
You can use the HostFirmwareSettings resource to retrieve and manage the firmware settings for a host. When a host moves to the Available state, the Ironic service reads the host’s firmware settings and creates the HostFirmwareSettings resource. There is a one-to-one mapping between the BareMetalHost resource and the HostFirmwareSettings resource.
You can use the HostFirmwareSettings resource to inspect the firmware specifications for a host or to update a host’s firmware specifications.
You must adhere to the schema specific to the vendor firmware when you edit the |
Firmware settings vary among hardware vendors and host models. A FirmwareSchema resource is a read-only resource that contains the types and limits for each firmware setting on each host model. The data comes directly from the BMC by using the Ironic service. The FirmwareSchema resource enables you to identify valid values you can specify in the spec field of the HostFirmwareSettings resource.
A FirmwareSchema resource can apply to many BareMetalHost resources if the schema is the same.
Metal3 provides the HostFirmwareComponents resource, which describes BIOS and baseboard management controller (BMC) firmware versions. You can upgrade or downgrade the host’s firmware to a specific version by editing the spec field of the HostFirmwareComponents resource. This is useful when deploying with validated patterns that have been tested against specific firmware versions.
Metal3 introduces the concept of the BareMetalHost resource, which defines a physical host and its properties. The BareMetalHost resource contains two sections:
The BareMetalHost spec
The BareMetalHost status
The spec section of the BareMetalHost resource defines the desired state of the host.
| Parameters | Description |
|---|---|
| | An interface to enable or disable automated cleaning during provisioning and de-provisioning. When set to |
| bmc: address: credentialsName: disableCertificateVerification: | The |
| | The MAC address of the NIC used for provisioning the host. |
| | The boot mode of the host. It defaults to |
| | A reference to another resource that is using the host. It could be empty if another resource is not currently using the host. For example, a |
| | A human-provided string to help identify the host. |
| | A boolean indicating whether the host provisioning and deprovisioning are managed externally. When set: |
| | Contains information about the BIOS configuration of bare metal hosts. Currently, |
| image: url: checksum: checksumType: format: | The |
| | A reference to the secret containing the network configuration data and its namespace, so that it can be attached to the host before the host boots to set up the network. |
| | A boolean indicating whether the host should be powered on ( |
| raid: hardwareRAIDVolumes: softwareRAIDVolumes: | (Optional) Contains the information about the RAID configuration for bare metal hosts. If not specified, it retains the current configuration. See the following configuration settings: You can set the spec: raid: hardwareRAIDVolume: [] If you receive an error message indicating that the driver does not support RAID, set the |
| rootDeviceHints: deviceName: hctl: model: vendor: serialNumber: minSizeGigabytes: wwn: wwnWithExtension: wwnVendorExtension: rotational: | The |
The BareMetalHost status represents the host’s current state, and includes tested credentials, current hardware details, and other information.
| Parameters | Description |
|---|---|
| | A reference to the secret and its namespace holding the last set of baseboard management controller (BMC) credentials the system was able to validate as working. |
| | Details of the last error reported by the provisioning backend, if any. |
| | Indicates the class of problem that has caused the host to enter an error state. The error types are: |
| hardware: cpu arch: model: clockMegahertz: flags: count: | The |
| hardware: firmware: | Contains BIOS firmware information. For example, the hardware vendor and version. |
| hardware: nics: - ip: name: mac: speedGbps: vlans: vlanId: pxe: | The |
| hardware: ramMebibytes: | The host’s amount of memory in Mebibytes (MiB). |
| hardware: storage: - name: rotational: sizeBytes: serialNumber: | The |
| hardware: systemVendor: manufacturer: productName: serialNumber: | Contains information about the host’s |
| | The timestamp of the last time the status of the host was updated. |
| | The status of the server. The status is one of the following: |
| | Boolean indicating whether the host is powered on. |
| provisioning: state: id: image: raid: firmware: rootDeviceHints: | The |
| | A reference to the secret and its namespace holding the last set of BMC credentials that were sent to the provisioning backend. |
The BareMetalHost resource contains the properties of a physical host. You must get the BareMetalHost resource for a physical host to review its properties.
Get the list of BareMetalHost resources:
$ oc get bmh -n openshift-machine-api -o yaml
You can use baremetalhost as the long form of bmh with the oc get command.
Get the list of hosts:
$ oc get bmh -n openshift-machine-api
Get the BareMetalHost resource for a specific host:
$ oc get bmh <host_name> -n openshift-machine-api -o yaml
Where <host_name> is the name of the host.
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  creationTimestamp: "2022-06-16T10:48:33Z"
  finalizers:
  - baremetalhost.metal3.io
  generation: 2
  name: openshift-worker-0
  namespace: openshift-machine-api
  resourceVersion: "30099"
  uid: 1513ae9b-e092-409d-be1b-ad08edeb1271
spec:
  automatedCleaningMode: metadata
  bmc:
    address: redfish://10.46.61.19:443/redfish/v1/Systems/1
    credentialsName: openshift-worker-0-bmc-secret
    disableCertificateVerification: true
  bootMACAddress: 48:df:37:c7:f7:b0
  bootMode: UEFI
  consumerRef:
    apiVersion: machine.openshift.io/v1beta1
    kind: Machine
    name: ocp-edge-958fk-worker-0-nrfcg
    namespace: openshift-machine-api
  customDeploy:
    method: install_coreos
  online: true
  rootDeviceHints:
    deviceName: /dev/disk/by-id/scsi-<serial_number>
  userData:
    name: worker-user-data-managed
    namespace: openshift-machine-api
status:
  errorCount: 0
  errorMessage: ""
  goodCredentials:
    credentials:
      name: openshift-worker-0-bmc-secret
      namespace: openshift-machine-api
    credentialsVersion: "16120"
  hardware:
    cpu:
      arch: x86_64
      clockMegahertz: 2300
      count: 64
      flags:
      - 3dnowprefetch
      - abm
      - acpi
      - adx
      - aes
      model: Intel(R) Xeon(R) Gold 5218 CPU @ 2.30GHz
    firmware:
      bios:
        date: 10/26/2020
        vendor: HPE
        version: U30
    hostname: openshift-worker-0
    nics:
    - mac: 48:df:37:c7:f7:b3
      model: 0x8086 0x1572
      name: ens1f3
    ramMebibytes: 262144
    storage:
    - hctl: "0:0:0:0"
      model: VK000960GWTTB
      name: /dev/disk/by-id/scsi-<serial_number>
      sizeBytes: 960197124096
      type: SSD
      vendor: ATA
    systemVendor:
      manufacturer: HPE
      productName: ProLiant DL380 Gen10 (868703-B21)
      serialNumber: CZ200606M3
  lastUpdated: "2022-06-16T11:41:42Z"
  operationalStatus: OK
  poweredOn: true
  provisioning:
    ID: 217baa14-cfcf-4196-b764-744e184a3413
    bootMode: UEFI
    customDeploy:
      method: install_coreos
    image:
      url: ""
    raid:
      hardwareRAIDVolumes: null
      softwareRAIDVolumes: []
    rootDeviceHints:
      deviceName: /dev/disk/by-id/scsi-<serial_number>
    state: provisioned
  triedCredentials:
    credentials:
      name: openshift-worker-0-bmc-secret
      namespace: openshift-machine-api
    credentialsVersion: "16120"
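When you review several hosts, it can help to reduce each resource to a handful of fields. The following Python sketch assumes that you retrieved the resource with -o json instead of -o yaml and loaded it into a dictionary. The summarize_bmh helper and the fields it selects are illustrative choices, not part of any OKD tooling.

```python
def summarize_bmh(bmh):
    """Return a few commonly reviewed fields from a BareMetalHost object."""
    spec = bmh.get("spec", {})
    status = bmh.get("status", {})
    return {
        "name": bmh.get("metadata", {}).get("name"),
        "state": status.get("provisioning", {}).get("state"),
        "poweredOn": status.get("poweredOn"),
        "operationalStatus": status.get("operationalStatus"),
        # consumerRef is empty when no other resource is using the host.
        "consumer": (spec.get("consumerRef") or {}).get("name"),
    }


# Trimmed copy of the example resource shown above.
example_bmh = {
    "metadata": {"name": "openshift-worker-0"},
    "spec": {
        "consumerRef": {"kind": "Machine", "name": "ocp-edge-958fk-worker-0-nrfcg"},
    },
    "status": {
        "operationalStatus": "OK",
        "poweredOn": True,
        "provisioning": {"state": "provisioned"},
    },
}

print(summarize_bmh(example_bmh))
```

Missing fields come back as None rather than raising an error, which keeps the helper usable for hosts that are not yet provisioned.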
After you deploy an OKD cluster on bare metal, you might need to edit a node’s BareMetalHost resource. Consider the following examples:
You deploy a cluster with the Assisted Installer and need to add or edit the baseboard management controller (BMC) host name or IP address.
You want to move a node from one cluster to another without deprovisioning it.
Ensure the node is in the Provisioned, ExternallyProvisioned, or Available state.
Get the list of nodes:
$ oc get bmh -n openshift-machine-api
Before editing the node’s BareMetalHost resource, detach the node from Ironic by running the following command:
$ oc annotate baremetalhost <node_name> -n openshift-machine-api 'baremetalhost.metal3.io/detached=true' (1)
1 | Replace <node_name> with the name of the node. |
Edit the BareMetalHost resource by running the following command:
$ oc edit bmh <node_name> -n openshift-machine-api
Reattach the node to Ironic by running the following command:
$ oc annotate baremetalhost <node_name> -n openshift-machine-api 'baremetalhost.metal3.io/detached'-
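The order of the detach, edit, and reattach steps matters, so a small helper that prints the commands in sequence can reduce mistakes. The following Python sketch only builds command strings and does not contact the cluster. The can_edit and edit_plan names are hypothetical, and the state normalization is an assumption, because the cluster might report states in lowercase or with spaces.

```python
import shlex

EDITABLE_STATES = {"provisioned", "externallyprovisioned", "available"}
DETACHED = "baremetalhost.metal3.io/detached"


def can_edit(state):
    """Return True if the host state is one that this procedure allows."""
    # Assumption: normalize case and spaces so that "ExternallyProvisioned"
    # and "externally provisioned" compare equal.
    return state.replace(" ", "").lower() in EDITABLE_STATES


def edit_plan(node, namespace="openshift-machine-api"):
    """Return the detach, edit, and reattach commands in order."""
    annotate = ["oc", "annotate", "baremetalhost", node, "-n", namespace]
    steps = [
        annotate + [DETACHED + "=true"],               # detach from Ironic
        ["oc", "edit", "bmh", node, "-n", namespace],  # edit the resource
        annotate + [DETACHED + "-"],                   # trailing "-" removes the annotation
    ]
    return [shlex.join(step) for step in steps]


if can_edit("provisioned"):
    for command in edit_plan("openshift-worker-0"):
        print(command)
```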
When the Bare Metal Operator (BMO) deletes a BareMetalHost resource, Ironic deprovisions the bare-metal host with a process called cleaning. When cleaning fails, Ironic retries the cleaning process three times, which is the source of the delay. The cleaning process might not succeed, causing the provisioning status of the bare-metal host to remain in the deleting state indefinitely. When this occurs, use the following procedure to disable the cleaning process.
Do not remove finalizers from the BareMetalHost resource.
If the cleaning process fails and restarts, wait for it to finish. This might take about 5 minutes.
If the provisioning status remains in the deleting state, disable the cleaning process by modifying the BareMetalHost resource and setting the automatedCleaningMode field to disabled.
See "Editing a BareMetalHost resource" for additional details.
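As a non-interactive alternative to editing the resource in an editor, the same field can be set with a JSON merge patch. The following Python sketch builds such a command with oc patch. The helper name is hypothetical, and the sketch only prints the command rather than running it.

```python
import json
import shlex


def disable_cleaning_command(node, namespace="openshift-machine-api"):
    """Build an `oc patch` command that sets automatedCleaningMode to disabled."""
    patch = {"spec": {"automatedCleaningMode": "disabled"}}
    args = [
        "oc", "patch", "bmh", node,
        "-n", namespace,
        "--type", "merge",
        "-p", json.dumps(patch),
    ]
    # shlex.join quotes the JSON payload so it can be pasted into a shell.
    return shlex.join(args)


print(disable_cleaning_command("openshift-worker-0"))
```

If you detach the node from Ironic before editing, as described in "Editing a BareMetalHost resource", apply the patch between the detach and reattach steps.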
You can attach a generic, non-bootable ISO virtual media image to a provisioned node by using the DataImage resource. After you apply the resource, the ISO image becomes accessible to the operating system after it has booted. This is useful for configuring a node after provisioning the operating system and before the node boots for the first time.
The node must use Redfish or drivers derived from it to support this feature.
The node must be in the Provisioned or ExternallyProvisioned state.
The name of the DataImage resource must be the same as the name of the node defined in its BareMetalHost resource.
You have a valid url to the ISO image.
Create a DataImage resource:
apiVersion: metal3.io/v1alpha1
kind: DataImage
metadata:
  name: <node_name> (1)
spec:
  url: "http://dataimage.example.com/non-bootable.iso" (2)
1 | Specify the name of the node as defined in its BareMetalHost resource. |
2 | Specify the URL and path to the ISO image. |
Save the DataImage resource to a file by running the following command:
$ vim <node_name>-dataimage.yaml
Apply the DataImage resource by running the following command:
$ oc apply -f <node_name>-dataimage.yaml -n <node_namespace> (1)
1 | Replace <node_namespace> so that the namespace matches the namespace for the BareMetalHost resource. For example, openshift-machine-api. |
Reboot the node.
To reboot the node, attach the reboot.metal3.io annotation to the BareMetalHost resource.
View the DataImage resource by running the following command:
$ oc get dataimage <node_name> -n openshift-machine-api -o yaml
apiVersion: v1
items:
- apiVersion: metal3.io/v1alpha1
  kind: DataImage
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"metal3.io/v1alpha1","kind":"DataImage","metadata":{"annotations":{},"name":"bmh-node-1","namespace":"openshift-machine-api"},"spec":{"url":"http://dataimage.example.com/non-bootable.iso"}}
    creationTimestamp: "2024-06-10T12:00:00Z"
    finalizers:
    - dataimage.metal3.io
    generation: 1
    name: bmh-node-1
    namespace: openshift-machine-api
    ownerReferences:
    - apiVersion: metal3.io/v1alpha1
      blockOwnerDeletion: true
      controller: true
      kind: BareMetalHost
      name: bmh-node-1
      uid: 046cdf8e-0e97-485a-8866-e62d20e0f0b3
    resourceVersion: "21695581"
    uid: c5718f50-44b6-4a22-a6b7-71197e4b7b69
  spec:
    url: http://dataimage.example.com/non-bootable.iso
  status:
    attachedImage:
      url: http://dataimage.example.com/non-bootable.iso
    error:
      count: 0
      message: ""
    lastReconciled: "2024-06-10T12:05:00Z"
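The procedure above can also be scripted. The following Python sketch builds the DataImage manifest as a dictionary, which you could write out as JSON for oc apply -f, and checks a retrieved resource for a matching attached image. Both helper names are hypothetical, and reading status.attachedImage.url together with status.error.count as the success signal is an interpretation of the example output, not a documented contract.

```python
import json


def data_image_manifest(node_name, url, namespace="openshift-machine-api"):
    """Build a DataImage manifest whose name matches the BareMetalHost name."""
    return {
        "apiVersion": "metal3.io/v1alpha1",
        "kind": "DataImage",
        "metadata": {"name": node_name, "namespace": namespace},
        "spec": {"url": url},
    }


def image_attached(data_image):
    """Return True if the attached URL matches the requested URL with no errors."""
    status = data_image.get("status", {})
    attached = status.get("attachedImage", {}).get("url")
    errors = status.get("error", {}).get("count", 0)
    return attached == data_image["spec"]["url"] and errors == 0


manifest = data_image_manifest(
    "bmh-node-1", "http://dataimage.example.com/non-bootable.iso"
)
print(json.dumps(manifest, indent=2))
```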
You can use the HostFirmwareSettings resource to retrieve and manage the BIOS settings for a host. When a host moves to the Available state, Ironic reads the host’s BIOS settings and creates the HostFirmwareSettings resource. The resource contains the complete BIOS configuration returned from the baseboard management controller (BMC). Whereas the firmware field in the BareMetalHost resource returns three vendor-independent fields, the HostFirmwareSettings resource typically comprises many vendor-specific BIOS settings per host.
The HostFirmwareSettings resource contains two sections:
The HostFirmwareSettings spec.
The HostFirmwareSettings status.
HostFirmwareSettings spec
The spec section of the HostFirmwareSettings resource defines the desired state of the host’s BIOS, and it is empty by default. Ironic uses the settings in the spec.settings section to update the baseboard management controller (BMC) when the host is in the Preparing state. Use the FirmwareSchema resource to ensure that you do not send invalid name/value pairs to hosts. See "About the FirmwareSchema resource" for additional details.
spec:
  settings:
    ProcTurboMode: Disabled (1)
1 | In the foregoing example, the spec.settings section contains a name/value pair that will set the ProcTurboMode BIOS setting to Disabled. |
Integer parameters listed in the |
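HostFirmwareSettings status
The status represents the current state of the host’s BIOS.
Parameters | Description |
---|---|
status: conditions: - lastTransitionTime: message: observedGeneration: reason: status: type: | The conditions field contains a list of state changes. Each entry records the time of the last transition, a message, the observed generation, a reason, a status, and a type. |
status: schema: name: namespace: lastUpdated: | The schema field references the FirmwareSchema resource for the host by name and namespace, together with a lastUpdated timestamp. |
status: settings: | The settings field contains the name/value pairs of the host’s current BIOS settings. |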
The HostFirmwareSettings resource contains the vendor-specific BIOS properties of a physical host. You must get the HostFirmwareSettings resource for a physical host to review its BIOS properties.
Get the detailed list of HostFirmwareSettings resources:
$ oc get hfs -n openshift-machine-api -o yaml
You can use hostfirmwaresettings as the long form of hfs with the oc get command.
Get the list of HostFirmwareSettings resources:
$ oc get hfs -n openshift-machine-api
Get the HostFirmwareSettings resource for a particular host:
$ oc get hfs <host_name> -n openshift-machine-api -o yaml
Where <host_name> is the name of the host.
You can edit the HostFirmwareSettings resource of provisioned hosts.
You can only edit hosts when they are in the provisioned state.
Get the list of HostFirmwareSettings resources:
$ oc get hfs -n openshift-machine-api
Edit a host’s HostFirmwareSettings resource:
$ oc edit hfs <host_name> -n openshift-machine-api
Where <host_name> is the name of a provisioned host. The HostFirmwareSettings resource opens in the default editor for your terminal.
Add name/value pairs to the spec.settings section:
spec:
  settings:
    name: value (1)
1 | Use the FirmwareSchema resource to identify the available settings for the host. You cannot set values that are read-only. |
Save the changes and exit the editor.
Get the host’s machine name:
$ oc get bmh <host_name> -n openshift-machine-api
Where <host_name> is the name of the host. The machine name appears under the CONSUMER field.
Annotate the machine to delete it from the machine set:
$ oc annotate machine <machine_name> machine.openshift.io/delete-machine=true -n openshift-machine-api
Where <machine_name> is the name of the machine to delete.
Get a list of nodes and count the number of worker nodes:
$ oc get nodes
Get the machine set:
$ oc get machinesets -n openshift-machine-api
Scale the machine set:
$ oc scale machineset <machineset_name> -n openshift-machine-api --replicas=<n-1>
Where <machineset_name> is the name of the machine set and <n-1> is the decremented number of worker nodes.
When the host enters the Available state, scale up the machine set to make the HostFirmwareSettings resource changes take effect:
$ oc scale machineset <machineset_name> -n openshift-machine-api --replicas=<n>
Where <machineset_name> is the name of the machine set and <n> is the number of worker nodes.
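The scale-down and scale-up steps depend on the current worker count, so it can be convenient to derive both replica values from a single number. The following Python sketch returns the commands as argument lists without running them. The reprovision_plan name and its parameters are hypothetical, and the wait for the Available state between the two scale commands is left to the operator.

```python
def reprovision_plan(machine, machineset, workers, namespace="openshift-machine-api"):
    """Return the annotate, scale-down, and scale-up commands in order."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    ns = ["-n", namespace]
    scale = ["oc", "scale", "machineset", machineset] + ns
    return [
        # Mark the machine so that the machine set removes it on scale-down.
        ["oc", "annotate", "machine", machine,
         "machine.openshift.io/delete-machine=true"] + ns,
        # Scale down by one (n-1) to deprovision the host.
        scale + ["--replicas={}".format(workers - 1)],
        # Run only after the host enters the Available state: scale back to n.
        scale + ["--replicas={}".format(workers)],
    ]


for step in reprovision_plan("example-worker-0-abcde", "example-worker-0", 3):
    print(" ".join(step))
```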
When the user edits the spec.settings section to make a change to the HostFirmwareSettings (HFS) resource, the Bare Metal Operator (BMO) validates the change against the FirmwareSchema resource, which is a read-only resource. If the setting is invalid, the BMO sets the Type value of the status.Condition setting to False, generates an event, and stores it in the HFS resource. Use the following procedure to verify that the resource is valid.
Get a list of HostFirmwareSettings resources:
$ oc get hfs -n openshift-machine-api
Verify that the HostFirmwareSettings resource for a particular host is valid:
$ oc describe hfs <host_name> -n openshift-machine-api
Where <host_name> is the name of the host.
Events:
  Type    Reason            Age    From                                    Message
  ----    ------            ----   ----                                    -------
  Normal  ValidationFailed  2m49s  metal3-hostfirmwaresettings-controller  Invalid BIOS setting: Setting ProcTurboMode is invalid, unknown enumeration value - Foo
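If the response returns ValidationFailed, there is an error in the resource configuration, and you must update the values to conform to the FirmwareSchema resource.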
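The same check can be made from the resource itself instead of the event list. The following Python sketch assumes that you retrieved the resource with -o json and that its status.conditions list includes an entry of type Valid whose status is the string "True" or "False". The helper names are hypothetical.

```python
def condition_status(resource, condition_type):
    """Return the status string of the named condition, or None if absent."""
    for condition in resource.get("status", {}).get("conditions", []):
        if condition.get("type") == condition_type:
            return condition.get("status")
    return None


def is_valid(resource):
    """Return True only when the Valid condition is present and "True"."""
    return condition_status(resource, "Valid") == "True"


# Hypothetical resource whose spec.settings failed validation.
rejected = {
    "status": {
        "conditions": [
            {"type": "ChangeDetected", "status": "True"},
            {"type": "Valid", "status": "False"},
        ]
    }
}
print(is_valid(rejected))
```

A resource with no Valid condition is treated as not valid, which is the conservative choice when the status has not been reconciled yet.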
BIOS settings vary among hardware vendors and host models. A FirmwareSchema resource is a read-only resource that contains the types and limits for each BIOS setting on each host model. The data comes directly from the BMC through Ironic. The FirmwareSchema resource enables you to identify valid values you can specify in the spec field of the HostFirmwareSettings resource. The FirmwareSchema resource has a unique identifier derived from its settings and limits. Identical host models use the same FirmwareSchema identifier. It is likely that multiple instances of HostFirmwareSettings use the same FirmwareSchema.
Parameters | Description |
---|---|
<BIOS_setting_name> attribute_type: allowable_values: lower_bound: upper_bound: min_length: max_length: read_only: unique: | The schema entry for a BIOS setting contains its type and limits: the attribute type, the allowable values, the lower and upper bounds, the minimum and maximum lengths, and whether the setting is read-only or unique. |
Each host model from each vendor has different BIOS settings. When editing the HostFirmwareSettings resource’s spec section, the name/value pairs you set must conform to that host’s firmware schema. To ensure you are setting valid name/value pairs, get the FirmwareSchema for the host and review it.
To get a list of FirmwareSchema resource instances, execute the following:
$ oc get firmwareschema -n openshift-machine-api
To get a particular FirmwareSchema instance, execute:
$ oc get firmwareschema <instance_name> -n openshift-machine-api -o yaml
Where <instance_name> is the name of the schema instance stated in the HostFirmwareSettings resource (see Table 3).
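Before you add a name/value pair to spec.settings, you can compare it with the schema entry for that setting. The following Python sketch checks a proposed value against the fields listed in the FirmwareSchema table. It assumes that attribute_type uses the values Enumeration, Integer, and String, and the check_setting name and the sample schema entries are hypothetical. The BMO performs the authoritative validation.

```python
def check_setting(schema_entry, value):
    """Return a list of problems; an empty list means no problem was found."""
    problems = []
    if schema_entry.get("read_only"):
        problems.append("setting is read-only")
    kind = schema_entry.get("attribute_type")  # assumed type names
    if kind == "Enumeration":
        allowed = schema_entry.get("allowable_values", [])
        if value not in allowed:
            problems.append("{!r} is not one of {}".format(value, allowed))
    elif kind == "Integer":
        # bool is a subclass of int in Python, so reject it explicitly.
        if not isinstance(value, int) or isinstance(value, bool):
            problems.append("value must be an integer")
        else:
            lower = schema_entry.get("lower_bound")
            upper = schema_entry.get("upper_bound")
            if lower is not None and value < lower:
                problems.append("value is below lower_bound {}".format(lower))
            if upper is not None and value > upper:
                problems.append("value is above upper_bound {}".format(upper))
    elif kind == "String":
        if not isinstance(value, str):
            problems.append("value must be a string")
        else:
            shortest = schema_entry.get("min_length")
            longest = schema_entry.get("max_length")
            if shortest is not None and len(value) < shortest:
                problems.append("value is shorter than min_length")
            if longest is not None and len(value) > longest:
                problems.append("value is longer than max_length")
    return problems


# Hypothetical schema entry for the ProcTurboMode setting.
turbo = {
    "attribute_type": "Enumeration",
    "allowable_values": ["Enabled", "Disabled"],
    "read_only": False,
}
print(check_setting(turbo, "Disabled"))
print(check_setting(turbo, "Foo"))
```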
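Metal3 provides the HostFirmwareComponents resource, which describes BIOS and baseboard management controller (BMC) firmware versions. The HostFirmwareComponents resource contains two sections:
The HostFirmwareComponents spec
The HostFirmwareComponents status
The spec section of the HostFirmwareComponents resource defines the desired state of the host’s BIOS and BMC versions.
Parameters | Description |
---|---|
updates: component: url: | The updates setting lists the firmware to update. Each entry contains the component name, bios or bmc, and the URL for the firmware version of that component. |
The status section of the HostFirmwareComponents resource returns the current status of the host’s BIOS and BMC versions.
Parameters | Description |
---|---|
components: component: initialVersion: currentVersion: lastVersionFlashed: updatedAt: | The components section contains the status of each firmware component: the component name, its initial version, its current version, the last version flashed, and the time of the last update. |
updates: component: url: | The updates section lists firmware updates. Each entry contains the component name and the URL for its firmware. |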
The HostFirmwareComponents resource contains the specific firmware version of the BIOS and baseboard management controller (BMC) of a physical host. You must get the HostFirmwareComponents resource for a physical host to review the firmware version and status.
Get the detailed list of HostFirmwareComponents resources:
$ oc get hostfirmwarecomponents -n openshift-machine-api -o yaml
Get the list of HostFirmwareComponents resources:
$ oc get hostfirmwarecomponents -n openshift-machine-api
Get the HostFirmwareComponents resource for a particular host:
$ oc get hostfirmwarecomponents <host_name> -n openshift-machine-api -o yaml
Where <host_name> is the name of the host.
---
apiVersion: metal3.io/v1alpha1
kind: HostFirmwareComponents
metadata:
  creationTimestamp: "2024-04-25T20:32:06Z"
  generation: 1
  name: ostest-master-2
  namespace: openshift-machine-api
  ownerReferences:
  - apiVersion: metal3.io/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: BareMetalHost
    name: ostest-master-2
    uid: 16022566-7850-4dc8-9e7d-f216211d4195
  resourceVersion: "2437"
  uid: 2038d63f-afc0-4413-8ffe-2f8e098d1f6c
spec:
  updates: []
status:
  components:
  - component: bios
    currentVersion: 1.0.0
    initialVersion: 1.0.0
  - component: bmc
    currentVersion: "1.00"
    initialVersion: "1.00"
  conditions:
  - lastTransitionTime: "2024-04-25T20:32:06Z"
    message: ""
    observedGeneration: 1
    reason: OK
    status: "True"
    type: Valid
  - lastTransitionTime: "2024-04-25T20:32:06Z"
    message: ""
    observedGeneration: 1
    reason: OK
    status: "False"
    type: ChangeDetected
  lastUpdated: "2024-04-25T20:32:06Z"
  updates: []
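To see at a glance whether a firmware update has taken effect, you can compare the initial and current versions in the status. The following Python sketch assumes that the resource was retrieved with -o json and loaded into a dictionary. The changed_components name is hypothetical.

```python
def changed_components(hfc):
    """Return names of components whose current version differs from the initial one."""
    changed = []
    for entry in hfc.get("status", {}).get("components", []):
        if entry.get("currentVersion") != entry.get("initialVersion"):
            changed.append(entry.get("component"))
    return changed


# Status trimmed from the example output above: no component has changed.
example_hfc = {
    "status": {
        "components": [
            {"component": "bios", "currentVersion": "1.0.0", "initialVersion": "1.0.0"},
            {"component": "bmc", "currentVersion": "1.00", "initialVersion": "1.00"},
        ]
    }
}
print(changed_components(example_hfc))
```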
You can edit the HostFirmwareComponents resource of a node.
Get the detailed list of HostFirmwareComponents resources:
$ oc get hostfirmwarecomponents -n openshift-machine-api -o yaml
Edit a host’s HostFirmwareComponents resource:
$ oc edit hostfirmwarecomponents <host_name> -n openshift-machine-api (1)
1 | Where <host_name> is the name of the host. The HostFirmwareComponents resource opens in the default editor for your terminal. |
---
apiVersion: metal3.io/v1alpha1
kind: HostFirmwareComponents
metadata:
  creationTimestamp: "2024-04-25T20:32:06Z"
  generation: 1
  name: ostest-master-2
  namespace: openshift-machine-api
  ownerReferences:
  - apiVersion: metal3.io/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: BareMetalHost
    name: ostest-master-2
    uid: 16022566-7850-4dc8-9e7d-f216211d4195
  resourceVersion: "2437"
  uid: 2038d63f-afc0-4413-8ffe-2f8e098d1f6c
spec:
  updates:
  - component: bios (1)
    url: https://myurl.with.firmware.for.bios (2)
  - component: bmc (3)
    url: https://myurl.with.firmware.for.bmc (4)
status:
  components:
  - component: bios
    currentVersion: 1.0.0
    initialVersion: 1.0.0
  - component: bmc
    currentVersion: "1.00"
    initialVersion: "1.00"
  conditions:
  - lastTransitionTime: "2024-04-25T20:32:06Z"
    message: ""
    observedGeneration: 1
    reason: OK
    status: "True"
    type: Valid
  - lastTransitionTime: "2024-04-25T20:32:06Z"
    message: ""
    observedGeneration: 1
    reason: OK
    status: "False"
    type: ChangeDetected
  lastUpdated: "2024-04-25T20:32:06Z"
1 | To set a BIOS version, set the component attribute to bios. |
2 | To set a BIOS version, set the url attribute to the URL for the firmware version of the BIOS. |
3 | To set a BMC version, set the component attribute to bmc. |
4 | To set a BMC version, set the url attribute to the URL for the firmware version of the BMC. |
Save the changes and exit the editor.
Get the host’s machine name:
$ oc get bmh <host_name> -n openshift-machine-api (1)
1 | Where <host_name> is the name of the host. The machine name appears under the CONSUMER field. |
Annotate the machine to delete it from the machine set:
$ oc annotate machine <machine_name> machine.openshift.io/delete-machine=true -n openshift-machine-api (1)
1 | Where <machine_name> is the name of the machine to delete. |
Get a list of nodes and count the number of worker nodes:
$ oc get nodes
Get the machine set:
$ oc get machinesets -n openshift-machine-api
Scale the machine set:
$ oc scale machineset <machineset_name> -n openshift-machine-api --replicas=<n-1> (1)
1 | Where <machineset_name> is the name of the machine set and <n-1> is the decremented number of worker nodes. |
When the host enters the Available state, scale up the machine set to make the HostFirmwareComponents resource changes take effect:
$ oc scale machineset <machineset_name> -n openshift-machine-api --replicas=<n> (1)
1 | Where <machineset_name> is the name of the machine set and <n> is the number of worker nodes. |
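If you prepare firmware updates for several hosts, generating the spec.updates list from a mapping can help avoid typing errors. The following Python sketch assumes that each entry uses the component and url fields listed in the spec parameters table and that bios and bmc are the only component names, as this section describes. The helper names are hypothetical.

```python
import json

VALID_COMPONENTS = ("bios", "bmc")


def firmware_updates(urls):
    """Turn a {component: url} mapping into a spec.updates list."""
    updates = []
    for component, url in urls.items():
        if component not in VALID_COMPONENTS:
            raise ValueError("unsupported component: {!r}".format(component))
        updates.append({"component": component, "url": url})
    return updates


def updates_patch(urls):
    """Return a JSON document that sets spec.updates to the given entries."""
    return json.dumps({"spec": {"updates": firmware_updates(urls)}})


print(updates_patch({
    "bios": "https://myurl.with.firmware.for.bios",
    "bmc": "https://myurl.with.firmware.for.bmc",
}))
```

You can paste the resulting entries into the editor session. The rest of the procedure is unchanged.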