Reduce the OKD footprint by disabling optional cluster Operators on single-node OpenShift clusters only.
- Remove all optional Operators except the Marketplace and Node Tuning Operators.
The following sections describe the various OKD components and configurations that you use to configure and deploy clusters to run RAN DU workloads.
You can now configure host firmware settings for managed clusters that you deploy with GitOps ZTP.
Tune host firmware settings for optimal performance during initial cluster deployment.
The managed cluster host firmware settings are available on the hub cluster as BareMetalHost custom resources (CRs) that are created when you deploy the managed cluster with the SiteConfig CR and GitOps ZTP.
Hyperthreading must be enabled.
Tune all settings for maximum performance.
All settings are expected to be for maximum performance unless tuned for power savings.
You can tune host firmware for power savings at the expense of performance as required.
Enable secure boot. With secure boot enabled, only signed kernel modules are loaded by the kernel. Out-of-tree drivers are not supported.
No reference design updates in this release
You tune the cluster performance by creating a performance profile.
The RAN DU use case requires the cluster to be tuned for low-latency performance.
The Node Tuning Operator uses the PerformanceProfile CR to configure the cluster. You need to configure the following settings in the RAN DU profile PerformanceProfile CR:
Select reserved and isolated cores and ensure that you allocate at least 4 hyperthreads (equivalent to 2 cores) on Intel 3rd Generation Xeon (Ice Lake) 2.20 GHz CPUs or better with firmware tuned for maximum performance.
Set the reserved cpuset to include both hyperthread siblings for each included core.
Unreserved cores are available as allocatable CPU for scheduling workloads.
Ensure that hyperthread siblings are not split across reserved and isolated cores.
Configure reserved and isolated CPUs to include all threads in all cores based on what you have set as reserved and isolated CPUs.
Set core 0 of each NUMA node to be included in the reserved CPU set.
Set the huge page size to 1G.
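The CPU set rules above can be checked mechanically before you commit a PerformanceProfile CR. The following Python sketch is illustrative only: the function names, the example topology (64 CPUs where threads i and i+32 share a core, one NUMA node), and the cpuset strings are assumptions, not values from the reference configuration. On a real host, read the sibling layout from the hardware, for example with lscpu.

```python
from typing import Iterable, List, Set


def parse_cpuset(spec: str) -> Set[int]:
    """Parse a Linux-style cpuset string such as "0-1,32-33" into CPU ids."""
    cpus: Set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def check_cpu_sets(
    reserved: str,
    isolated: str,
    cores: List[Set[int]],
    numa_core0_cpus: Iterable[int],
) -> List[str]:
    """Return a list of rule violations; an empty list means no problem found.

    cores: one set of hyperthread ids per physical core (hypothetical input).
    numa_core0_cpus: one CPU id belonging to core 0 of each NUMA node.
    """
    res, iso = parse_cpuset(reserved), parse_cpuset(isolated)
    problems: List[str] = []
    if res & iso:
        problems.append(f"CPUs in both sets: {sorted(res & iso)}")
    if len(res) < 4:
        problems.append("reserved set has fewer than 4 hyperthreads")
    for threads in cores:
        # Every thread of a core must land in the same set.
        if not (threads <= res or threads <= iso):
            problems.append(
                f"core with threads {sorted(threads)} is split or unassigned"
            )
    for cpu in numa_core0_cpus:
        if cpu not in res:
            problems.append(f"CPU {cpu} (core 0 of a NUMA node) is not reserved")
    return problems


if __name__ == "__main__":
    # Assumed topology: 32 cores, sibling threads are i and i + 32.
    topology = [{i, i + 32} for i in range(32)]
    print(check_cpu_sets("0-1,32-33", "2-31,34-63", topology, [0]))
    print(check_cpu_sets("0-3", "4-63", topology, [0]))
```

The first call describes a layout that keeps sibling threads together, so it reports no problems. The second call reserves CPUs 0-3 without their siblings, so the check reports the affected cores as split.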
Do not add additional workloads to the management partition. Only pods that are part of the OpenShift management platform should be annotated into the management partition.
You should use the RT kernel to meet performance requirements. However, you can use the non-RT kernel with a corresponding impact to cluster performance if required.
The number of huge pages that you configure depends on the application workload requirements. Variation in this parameter is expected and allowed.
Variation is expected in the configuration of reserved and isolated CPU sets based on selected hardware and additional components in use on the system. Variation must still meet the specified limits.
Hardware without IRQ affinity support impacts isolated CPUs. To ensure that pods with guaranteed whole CPU QoS have full use of the allocated CPU, all hardware in the server must support IRQ affinity. For more information, see "Finding the effective IRQ affinity setting for a node".
When you enable workload partitioning during cluster deployment with the cpuPartitioningMode: AllNodes setting, the reserved CPU set in the PerformanceProfile CR must include enough CPUs for the operating system, interrupts, and OpenShift platform pods.
cgroups v1 is a deprecated feature. Deprecated functionality is still included in OKD and continues to be supported; however, it will be removed in a future release of this product and is not recommended for new deployments. For the most recent list of major functionality that has been deprecated or removed within OKD, refer to the Deprecated and removed features section of the OKD release notes.
A new version two of the Precision Time Protocol (PTP) fast event REST API is available.
Consumer applications can now subscribe directly to the events REST API in the PTP events producer sidecar.
The PTP fast event REST API v2 is compliant with the O-RAN O-Cloud Notification API Specification for Event Consumers 3.0.
You can change the API version by setting the ptpEventConfig.apiVersion field in the PtpOperatorConfig resource.
See "Recommended single-node OpenShift cluster configuration for vDU application workloads" for details of support and configuration of PTP in cluster nodes. The DU node can run in the following modes:
As an ordinary clock (OC) synced to a grandmaster clock or boundary clock (T-BC).
As a grandmaster clock (T-GM) synced from GPS with support for single or dual card E810 NICs.
As dual boundary clocks (one per NIC) with support for E810 NICs.
As a T-BC with a highly available (HA) system clock when there are multiple time sources on different NICs.
Optional: as a boundary clock for radio units (RUs).
Limited to two boundary clocks for dual NIC and HA.
Limited to two card E810 configurations for T-GM.
Configurations are provided for ordinary clock, boundary clock, boundary clock with highly available system clock, and grandmaster clock.
PTP fast event notifications use ConfigMap CRs to store PTP event subscriptions.
The PTP events REST API v2 does not have a global subscription for all lower hierarchy resources contained in the resource path. You subscribe consumer applications to the various available event types separately.
No reference design updates in this release
The SR-IOV Operator provisions and configures the SR-IOV CNI and device plugins.
Both netdevice (kernel VFs) and vfio (DPDK) devices are supported and applicable to the RAN use models.
Use OKD-supported devices.
SR-IOV and IOMMU enablement in BIOS: The SR-IOV Network Operator will automatically enable IOMMU on the kernel command line.
SR-IOV VFs do not receive link state updates from the PF. If link down detection is needed, you must configure this at the protocol level.
NICs which do not support firmware updates using Secure Boot or kernel lockdown must be pre-configured with sufficient virtual functions (VFs) to support the number of VFs required by the application workload.
You might need to disable the SR-IOV Operator plugin for unsupported NICs using the undocumented |
SR-IOV interfaces with the vfio driver type are typically used to enable additional secondary networks for applications that require high throughput or low latency.
Customer variation on the configuration and number of SriovNetwork and SriovNetworkNodePolicy custom resources (CRs) is expected.
IOMMU kernel command line settings are applied with a MachineConfig CR at install time. This ensures that the SriovOperator CR does not cause a reboot of the node when adding them.
SR-IOV support for draining nodes in parallel is not applicable in a single-node OpenShift cluster.
If you exclude the SriovOperatorConfig CR from your deployment, the CR will not be created automatically.
In scenarios where you pin or restrict workloads to specific nodes, the SR-IOV parallel node drain feature will not result in the rescheduling of pods. In these scenarios, the SR-IOV Operator disables the parallel node drain functionality.
Cluster Logging Operator 6.0 is new in this release. Update your existing implementation to adapt to the new version of the API.
Use logging to collect logs from the far edge node for remote analysis. The recommended log collector is Vector.
Handling logs beyond the infrastructure and audit logs, for example, logs from the application workload, requires additional CPU and network bandwidth based on the additional logging rate.
As of OKD 4.14, Vector is the reference log collector.
Use of fluentd in the RAN use model is deprecated.
No reference design updates in this release
The SRIOV-FEC Operator is an optional third-party Certified Operator that supports FEC accelerator hardware.
Starting with FEC Operator v2.7.0:
SecureBoot is supported.
The vfio driver for the PF requires the usage of vfio-token that is injected into Pods. Applications in the pod can pass the VF token to DPDK by using the EAL parameter --vfio-vf-token.
The SRIOV-FEC Operator uses CPU cores from the isolated CPU set.
You can validate FEC readiness as part of the pre-checks for application deployment, for example, by extending the validation policy.
No reference design updates in this release
The Lifecycle Agent provides local lifecycle management services for single-node OpenShift clusters.
The Lifecycle Agent is not applicable in multi-node clusters or single-node OpenShift clusters with an additional worker.
The Lifecycle Agent requires a persistent volume that you create when installing the cluster. See "Configuring a shared container directory between ostree stateroots when using GitOps ZTP" for partition requirements.
No reference design updates in this release
You can create persistent volumes that can be used as PVC resources by applications with the Local Storage Operator.
The number and type of PV resources that you create depends on your requirements.
Create backing storage for PV CRs before creating the PV.
This can be a partition, a local volume, LVM volume, or full disk.
Refer to the device listing in LocalVolume CRs by the hardware path used to access each device to ensure correct allocation of disks and partitions.
Logical names (for example, /dev/sda) are not guaranteed to be consistent across node reboots.
For more information, see the Fedora 9 documentation on device identifiers.
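As a rough pre-check, a script can flag device entries that use logical names before they reach a LocalVolume CR. The following Python sketch is an illustration under stated assumptions: the function name is hypothetical, the regular expression covers only the common sd, vd, and nvme kernel names, and the example paths are invented.

```python
import re
from typing import List

# Assumption: kernel-assigned names such as /dev/sda, /dev/vdb1, /dev/nvme0n1p2.
# Other naming schemes are not covered by this pattern.
LOGICAL_NAME = re.compile(r"^/dev/(sd[a-z]+|vd[a-z]+|nvme\d+n\d+)(p?\d+)?$")


def unstable_device_paths(paths: List[str]) -> List[str]:
    """Return the entries that use logical names, which can change across reboots."""
    return [p for p in paths if LOGICAL_NAME.match(p)]


if __name__ == "__main__":
    candidates = [
        "/dev/sda",                                     # logical name
        "/dev/disk/by-path/pci-0000:05:00.0-nvme-1",    # hardware path
        "/dev/nvme0n1p2",                               # logical name
    ]
    for path in unstable_device_paths(candidates):
        print(f"replace with a hardware path: {path}")
```

Entries under /dev/disk/by-path pass the check because they identify the device by how it is attached rather than by discovery order.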
No reference design updates in this release
Logical Volume Manager (LVM) Storage is an optional component. When you use LVM Storage as the storage solution, it replaces the Local Storage Operator. CPU resources are assigned to the management partition as platform overhead. The reference configuration must include one of these storage solutions, but not both.
LVM Storage provides dynamic provisioning of block and file storage.
LVM Storage creates logical volumes from local devices that can be used as PVC resources by applications.
Volume expansion and snapshots are also possible.
In single-node OpenShift clusters, persistent storage must be provided by either LVM Storage or local storage, not both.
Volume snapshots are excluded from the reference configuration.
LVM Storage can be used as the local storage implementation for the RAN DU use case. When LVM Storage is used as the storage solution, it replaces the Local Storage Operator, and the CPU required is assigned to the management partition as platform overhead. The reference configuration must include one of these storage solutions but not both.
Ensure that sufficient disks or partitions are available for storage requirements.
No reference design updates in this release
Workload partitioning pins OpenShift platform and Day 2 Operator pods that are part of the DU profile to the reserved CPU set and removes the reserved CPU from node accounting. This leaves all unreserved CPU cores available for user workloads.
Namespace and Pod CRs must be annotated to allow the pod to be applied to the management partition.
Pods with CPU limits cannot be allocated to the partition. This is because mutation can change the pod QoS.
For more information about the minimum number of CPUs that can be allocated to the management partition, see Node Tuning Operator.
Workload Partitioning pins all management pods to reserved cores. A sufficient number of cores must be allocated to the reserved set to account for operating system, management pods, and expected spikes in CPU use that occur when the workload starts, the node reboots, or other system events happen.
No reference design updates in this release
See "Cluster capabilities" for a full list of optional components that you can enable or disable before installation.
Cluster capabilities are not available for installer-provisioned installation methods.
You must apply all platform tuning configurations. The following table lists the required platform tuning configurations:
Feature | Description
---|---
Remove optional cluster capabilities | Reduce the OKD footprint by disabling optional cluster Operators on single-node OpenShift clusters only.
Configure cluster monitoring | Configure the monitoring stack for reduced footprint by doing the following:
Disable networking diagnostics | Disable networking diagnostics for single-node OpenShift because they are not required.
Configure a single OperatorHub catalog source | Configure the cluster to use a single catalog source that contains only the Operators required for a RAN DU deployment. Each catalog source increases the CPU use on the cluster. Using a single
Disable the Console Operator | If the cluster was deployed with the console disabled, the
In OKD 4.16 and later, clusters do not automatically revert to cgroups v1 when a PerformanceProfile CR is applied.
If workloads running on the cluster require cgroups v1, you need to configure the cluster to use cgroups v1.
If you need to configure cgroups v1, make the configuration as part of the initial cluster deployment.
No reference design updates in this release
The CRI-O wipe disable MachineConfig assumes that images on disk are static other than during scheduled maintenance in defined maintenance windows.
To ensure the images are static, do not set the pod imagePullPolicy field to Always.
Feature | Description
---|---
Container runtime | Sets the container runtime to
kubelet config and container mount hiding | Reduces the frequency of kubelet housekeeping and eviction monitoring to reduce CPU usage. Create a container mount namespace, visible to kubelet and CRI-O, to reduce system mount scanning resource usage.
SCTP | Optional configuration (enabled by default). Enables SCTP. SCTP is required by RAN applications but disabled by default in FCOS.
kdump | Optional configuration (enabled by default). Enables kdump to capture debug information when a kernel panic occurs.
CRI-O wipe disable | Disables automatic wiping of the CRI-O image cache after unclean shutdown.
SR-IOV-related kernel arguments | Includes additional SR-IOV related arguments in the kernel command line.
RCU Normal systemd service | Sets
One-shot time sync | Runs a one-time NTP system time synchronization job for control plane or worker nodes.
The following sections describe the various OKD components and configurations that you use to configure the hub cluster with Red Hat Advanced Cluster Management (RHACM).
No reference design updates in this release
Red Hat Advanced Cluster Management (RHACM) provides Multi Cluster Engine (MCE) installation and ongoing lifecycle management functionality for deployed clusters.
You manage cluster configuration and upgrades declaratively by applying Policy custom resources (CRs) to clusters during maintenance windows.
You apply policies with the RHACM policy controller as managed by Topology Aware Lifecycle Manager (TALM). The policy controller handles configuration, upgrades, and cluster statuses.
When installing managed clusters, RHACM applies labels and initial ignition configuration to individual nodes in support of custom disk partitioning, allocation of roles, and allocation to machine config pools.
You define these configurations with SiteConfig or ClusterInstance CRs.
300 SiteConfig CRs per ArgoCD application.
You can use multiple applications to achieve the maximum number of clusters supported by a single hub cluster.
A single hub cluster supports up to 3500 deployed single-node OpenShift clusters with 5 Policy CRs bound to each cluster.
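The limits above imply a minimum number of ArgoCD applications for a fully populated hub. The following Python sketch works through the arithmetic. The constant and function names are illustrative, and it assumes one SiteConfig CR per managed cluster, which the text does not state explicitly.

```python
SITECONFIGS_PER_ARGOCD_APP = 300  # limit stated above
MAX_CLUSTERS_PER_HUB = 3500       # limit stated above
POLICIES_PER_CLUSTER = 5          # policy count used for the stated limit


def argocd_apps_needed(clusters: int) -> int:
    """Smallest number of ArgoCD applications that can hold the SiteConfig CRs.

    Assumption: one SiteConfig CR per managed cluster.
    """
    if not 0 < clusters <= MAX_CLUSTERS_PER_HUB:
        raise ValueError(f"clusters must be between 1 and {MAX_CLUSTERS_PER_HUB}")
    # Integer ceiling division avoids floating-point rounding.
    return -(-clusters // SITECONFIGS_PER_ARGOCD_APP)


def bound_policies(clusters: int) -> int:
    """Total Policy CR bindings on the hub at the stated per-cluster count."""
    return clusters * POLICIES_PER_CLUSTER


if __name__ == "__main__":
    print(argocd_apps_needed(3500))  # 12
    print(bound_policies(3500))      # 17500
```

At the stated maximum of 3500 clusters, at least 12 applications are needed, which is why the single-application limit alone does not cap the hub.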
Use RHACM policy hub-side templating to better scale cluster configuration. You can significantly reduce the number of policies by using a single group policy or small number of general group policies where the group and per-cluster values are substituted into templates.
Cluster specific configuration: managed clusters typically have some number of configuration values that are specific to the individual cluster.
These configurations should be managed using RHACM policy hub-side templating with values pulled from ConfigMap CRs based on the cluster name.
To save CPU resources on managed clusters, policies that apply static configurations should be unbound from managed clusters after GitOps ZTP installation of the cluster.
No reference design updates in this release
Topology Aware Lifecycle Manager (TALM) is an Operator that runs only on the hub cluster and manages how changes, such as cluster upgrades, Operator upgrades, and configuration changes, are rolled out to the network.
TALM supports concurrent cluster deployment in batches of 400.
Precaching and backup features are for single-node OpenShift clusters only.
Only policies that have the ran.openshift.io/ztp-deploy-wave annotation are automatically applied by TALM during initial cluster installation.
You can create further ClusterGroupUpgrade CRs to control the policies that TALM remediates.
No reference design updates in this release
GitOps and GitOps ZTP plugins provide a GitOps-based infrastructure for managing cluster deployment and configuration.
Cluster definitions and configurations are maintained as a declarative state in Git.
You can apply ClusterInstance CRs to the hub cluster where the SiteConfig Operator renders them as installation CRs.
Alternatively, you can use the GitOps ZTP plugin to generate installation CRs directly from SiteConfig CRs.
The GitOps ZTP plugin supports automatic wrapping of configuration CRs in policies based on PolicyGenTemplate CRs.
You can deploy and manage multiple versions of OKD on managed clusters using the baseline reference configuration CRs. You can use custom CRs alongside the baseline CRs. To maintain multiple per-version policies simultaneously, use Git to manage the versions of the source CRs and policy CRs. Keep reference CRs and custom CRs under different directories. Doing this allows you to patch and update the reference CRs by simple replacement of all directory contents without touching the custom CRs.
300 SiteConfig CRs per ArgoCD application.
You can use multiple applications to achieve the maximum number of clusters supported by a single hub cluster.
Content in the /source-crs folder in Git overrides content provided in the GitOps ZTP plugin container.
Git takes precedence in the search path.
Add the /source-crs folder in the same directory as the kustomization.yaml file, which includes the PolicyGenTemplate as a generator.
Alternative locations for the |
The extraManifestPath field of the SiteConfig CR is deprecated in OKD 4.15 and later.
Use the new extraManifests.searchPaths field instead.
For multi-node cluster upgrades, you can pause MachineConfigPool (MCP) CRs during maintenance windows by setting the paused field to true.
You can increase the number of nodes per MCP updated simultaneously by configuring the maxUnavailable setting in the MCP CR.
The maxUnavailable field defines the percentage of nodes in the pool that can be simultaneously unavailable during a MachineConfig update.
Set maxUnavailable to the maximum tolerable value.
This reduces the number of reboots in a cluster during upgrades, which results in shorter upgrade times.
When you finally unpause the MCP CR, all the changed configurations are applied with a single reboot.
During cluster installation, you can pause custom MCP CRs by setting the paused field to true and setting maxUnavailable to 100% to improve installation times.
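To see how the maxUnavailable value affects the length of an update, the following Python sketch estimates how many nodes are updated at once and how many rounds a pool needs. It is a simplified model, not the Machine Config Operator's implementation: the function names are hypothetical, and rounding percentages down with a floor of one node is an assumption.

```python
def nodes_per_round(pool_size: int, max_unavailable: str) -> int:
    """Nodes that may be unavailable at the same time.

    max_unavailable is a percentage such as "25%" or an absolute count such as "2".
    Assumption: percentages round down, and at least one node is always allowed.
    """
    if pool_size < 1:
        raise ValueError("pool_size must be at least 1")
    if max_unavailable.endswith("%"):
        count = pool_size * int(max_unavailable[:-1]) // 100
    else:
        count = int(max_unavailable)
    return max(1, min(count, pool_size))


def update_rounds(pool_size: int, max_unavailable: str) -> int:
    """Number of sequential rounds needed to update every node in the pool."""
    per_round = nodes_per_round(pool_size, max_unavailable)
    return -(-pool_size // per_round)  # integer ceiling division


if __name__ == "__main__":
    for setting in ("10%", "25%", "100%"):
        print(setting, update_rounds(10, setting))
```

In this model a 10-node pool needs 10 rounds at 10%, 5 rounds at 25%, and a single round at 100%, which illustrates why a higher tolerable value shortens the maintenance window.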
To avoid confusion or unintentional overwriting of files when updating content, use unique and distinguishable names for user-provided CRs in the /source-crs folder and extra manifests in Git.
The SiteConfig CR allows multiple extra-manifest paths. When files with the same name are found in multiple directory paths, the last file found takes precedence.
This allows you to put the full set of version-specific Day 0 manifests (extra-manifests) in Git and reference them from the SiteConfig CR.
With this feature, you can deploy multiple OKD versions to managed clusters simultaneously.
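The precedence rule for extra-manifest paths can be pictured as a dictionary that later search paths overwrite. The following Python sketch models only that rule. The function name, directory names, and file names are hypothetical, and the directory listings are passed in as data rather than read from Git.

```python
from typing import Dict, List


def resolve_manifests(
    search_paths: List[str],
    listing: Dict[str, List[str]],
) -> Dict[str, str]:
    """Map each manifest file name to the directory it is taken from.

    Paths are visited in order, so a file found in a later path replaces
    a file with the same name found in an earlier path.
    """
    chosen: Dict[str, str] = {}
    for directory in search_paths:
        for name in listing.get(directory, []):
            chosen[name] = directory
    return chosen


if __name__ == "__main__":
    # Hypothetical layout: version-specific manifests plus site overrides.
    listing = {
        "extra-manifests/4.16": ["01-container-mount.yaml", "03-sctp.yaml"],
        "custom-manifests": ["03-sctp.yaml", "99-site-tuning.yaml"],
    }
    order = ["extra-manifests/4.16", "custom-manifests"]
    for name, source in sorted(resolve_manifests(order, listing).items()):
        print(f"{name} <- {source}")
```

Because a same-named file in a later path silently wins, unique file names make it easier to tell an intentional override from an accidental one.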
No reference design updates in this release
Agent-based installer (ABI) provides installation capabilities without centralized infrastructure. The installation program creates an ISO image that you mount to the server. When the server boots, it installs OKD and the supplied extra manifests.
You can also use ABI to install OKD clusters without a hub cluster. An image registry is still required when you use ABI in this manner.
Agent-based installer (ABI) is an optional component.
You can supply a limited set of additional manifests at installation time.
You must include MachineConfiguration CRs that are required by the RAN DU use case.
ABI provides a baseline OKD installation.
You install Day 2 Operators and the remainder of the RAN DU use case configurations after installation.