Learn about the Node Tuning Operator and how you can use it to manage node-level tuning by orchestrating the TuneD daemon.
The Node Tuning Operator helps you manage node-level tuning by orchestrating the TuneD daemon and achieves low latency performance by using the Performance Profile controller. The majority of high-performance applications require some level of kernel tuning. The Node Tuning Operator provides a unified management interface to users of node-level sysctls and more flexibility to add custom tuning specified by user needs.
The Operator manages the containerized TuneD daemon for OKD as a Kubernetes daemon set. It ensures the custom tuning specification is passed to all containerized TuneD daemons running in the cluster in the format that the daemons understand. The daemons run on all nodes in the cluster, one per node.
Node-level settings applied by the containerized TuneD daemon are rolled back on an event that triggers a profile change or when the containerized TuneD daemon is terminated gracefully by receiving and handling a termination signal.
The Node Tuning Operator uses the Performance Profile controller to implement automatic tuning to achieve low latency performance for OKD applications. The cluster administrator configures a performance profile to define node-level settings such as the following:
Updating the kernel to kernel-rt.
Choosing CPUs for housekeeping.
Choosing CPUs for running workloads.
The Node Tuning Operator is part of a standard OKD installation in version 4.1 and later.
In earlier versions of OKD, the Performance Addon Operator was used to implement automatic tuning to achieve low latency performance for OpenShift applications. In OKD 4.11 and later, this functionality is part of the Node Tuning Operator.
Use this process to access an example Node Tuning Operator specification.
Run the following command to access an example Node Tuning Operator specification:
$ oc get Tuned/default -o yaml -n openshift-cluster-node-tuning-operator
The default CR is meant for delivering standard node-level tuning for the OKD platform and it can only be modified to set the Operator Management state. Any other custom changes to the default CR will be overwritten by the Operator. For custom tuning, create your own Tuned CRs. Newly created CRs will be combined with the default CR and custom tuning applied to OKD nodes based on node or pod labels and profile priorities.
Although pod label support can be a convenient way to deliver required tuning automatically in certain situations, this practice is strongly discouraged, especially in large-scale clusters. The default Tuned CR ships without pod label matching. If a custom profile is created with pod label matching, the functionality is enabled at that time. The pod label functionality will be deprecated in future versions of the Node Tuning Operator.
The custom resource (CR) for the Operator has two major sections. The first section, profile:, is a list of TuneD profiles and their names. The second, recommend:, defines the profile selection logic.
Multiple custom tuning specifications can co-exist as multiple CRs in the Operator’s namespace. The existence of new CRs or the deletion of old CRs is detected by the Operator. All existing custom tuning specifications are merged and appropriate objects for the containerized TuneD daemons are updated.
The Operator Management state is set by adjusting the default Tuned CR. By default, the Operator is in the Managed state and the spec.managementState field is not present in the default Tuned CR. Valid values for the Operator Management state are as follows:
Managed: the Operator will update its operands as configuration resources are updated
Unmanaged: the Operator will ignore changes to the configuration resources
Removed: the Operator will remove its operands and resources the Operator provisioned
The profile: section lists TuneD profiles and their names.
profile:
- name: tuned_profile_1
  data: |
    # TuneD profile specification
    [main]
    summary=Description of tuned_profile_1 profile

    [sysctl]
    net.ipv4.ip_forward=1
    # ... other sysctl's or other TuneD daemon plugins supported by the containerized TuneD

# ...

- name: tuned_profile_n
  data: |
    # TuneD profile specification
    [main]
    summary=Description of tuned_profile_n profile

    # tuned_profile_n profile settings
The profile selection logic is defined by the recommend: section of the CR. The recommend: section is a list of items that recommend profiles based on selection criteria.
recommend:
<recommend-item-1>
# ...
<recommend-item-n>
The individual items of the list:
- machineConfigLabels: (1)
    <mcLabels> (2)
  match: (3)
    <match> (4)
  priority: <priority> (5)
  profile: <tuned_profile_name> (6)
  operand: (7)
    debug: <bool> (8)
    tunedConfig:
      reapply_sysctl: <bool> (9)
|2||A dictionary of key/value MachineConfig labels.|
|3||If omitted, profile match is assumed unless a profile with a higher priority matches first or machineConfigLabels is set.|
|4||An optional list.|
|5||Profile ordering priority. Lower numbers mean higher priority (0 is the highest priority).|
|6||A TuneD profile to apply on a match. For example, tuned_profile_1.|
|7||Optional operand configuration.|
|8||Turn debugging on or off for the TuneD daemon. Options are true for on or false for off.|
<match> is an optional list recursively defined as follows:
- label: <label_name> (1)
  value: <label_value> (2)
  type: <label_type> (3)
    <match> (4)
|1||Node or pod label name.|
|2||Optional node or pod label value. If omitted, the presence of <label_name> is enough to match.|
|3||Optional object type (node or pod). If omitted, node is assumed.|
If <match> is not omitted, all nested <match> sections must also evaluate to true. Otherwise, false is assumed and the profile with the respective <match> section will not be applied or recommended. Therefore, the nesting (child <match> sections) works as a logical AND operator. Conversely, if any item of the <match> list matches, the entire <match> list evaluates to true. Therefore, the list acts as a logical OR operator.
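The AND/OR semantics described above can be sketched in a few lines of Python. This is an illustrative model only, not the Operator's implementation: the function names, the dictionary layout of a match item, and the representation of pod labels as a list of per-pod label dictionaries are all assumptions made for the example. It also assumes that an item with no type refers to a node label.

```python
def label_present(label_sets, name, value=None):
    """Return True if any label set has `name` (with `value`, when one is given)."""
    return any(
        name in labels and (value is None or labels[name] == value)
        for labels in label_sets
    )


def match_list_is_true(match_list, node_labels, pod_label_sets):
    """A <match> list is true if ANY of its items is true (logical OR)."""
    return any(
        match_item_is_true(item, node_labels, pod_label_sets)
        for item in match_list
    )


def match_item_is_true(item, node_labels, pod_label_sets):
    """An item is true if its own label matches AND its nested <match> is true."""
    # Assumption: an omitted type means the label is looked up on the node.
    scope = pod_label_sets if item.get("type") == "pod" else [node_labels]
    if not label_present(scope, item["label"], item.get("value")):
        return False
    nested = item.get("match")
    if not nested:
        # No nested <match>: the item's own label decides the result.
        return True
    # Nested <match> present: it must also evaluate to true (logical AND).
    return match_list_is_true(nested, node_labels, pod_label_sets)
```

For example, an item that names a pod label and nests a list of two node labels is true only when a pod on the node carries the pod label and the node carries at least one of the two node labels.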
If machineConfigLabels is defined, machine config pool based matching is turned on for the given recommend: list item. <mcLabels> specifies the labels for a machine config. The machine config is created automatically to apply host settings, such as kernel boot parameters, for the profile <tuned_profile_name>. This involves finding all machine config pools with a machine config selector matching <mcLabels> and setting the profile <tuned_profile_name> on all nodes that are assigned to the found machine config pools. To target nodes that have both master and worker roles, you must use the master role.
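The pool lookup described above can be pictured with a small Python sketch. It is a simplified model under stated assumptions, not the Operator's code: the function name, the dictionary layout of a pool, and the node names are hypothetical, and only equality-based selectors are modeled (selector expressions are ignored).

```python
def nodes_for_mc_labels(mc_labels, pools):
    """Return the nodes of every pool whose selector matches `mc_labels`.

    Assumption: each pool is a dict with a "machineConfigSelector" mapping
    of label keys to values and a "nodes" list. A selector matches when
    every key/value pair it requires is present in `mc_labels`.
    """
    selected = set()
    for pool in pools:
        selector = pool["machineConfigSelector"]
        if all(mc_labels.get(key) == value for key, value in selector.items()):
            selected.update(pool["nodes"])
    return sorted(selected)


# Hypothetical pools and labels, for illustration only.
pools = [
    {
        "name": "worker-rt",
        "machineConfigSelector": {"machineconfiguration.openshift.io/role": "worker-rt"},
        "nodes": ["node-a", "node-b"],
    },
    {
        "name": "master",
        "machineConfigSelector": {"machineconfiguration.openshift.io/role": "master"},
        "nodes": ["node-c"],
    },
]
mc_labels = {"machineconfiguration.openshift.io/role": "worker-rt"}
print(nodes_for_mc_labels(mc_labels, pools))
```

In this model the profile would be set on the nodes of the worker-rt pool only, because that is the only pool whose selector is satisfied by the given labels.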
The list items match and machineConfigLabels are connected by the logical OR operator. The match item is evaluated first in a short-circuit manner. Therefore, if it evaluates to true, the machineConfigLabels item is not considered.
When using machine config pool based matching, it is advised to group nodes with the same hardware configuration into the same machine config pool. Not following this practice might result in TuneD operands calculating conflicting kernel parameters for two or more nodes sharing the same machine config pool.
- match:
  - label: tuned.openshift.io/elasticsearch
    match:
    - label: node-role.kubernetes.io/master
    - label: node-role.kubernetes.io/infra
    type: pod
  priority: 10
  profile: openshift-control-plane-es
- match:
  - label: node-role.kubernetes.io/master
  - label: node-role.kubernetes.io/infra
  priority: 20
  profile: openshift-control-plane
- priority: 30
  profile: openshift-node
The CR above is translated for the containerized TuneD daemon into its recommend.conf file based on the profile priorities. The profile with the highest priority (10) is openshift-control-plane-es and, therefore, it is considered first. The containerized TuneD daemon running on a given node checks whether there is a pod running on the same node with the tuned.openshift.io/elasticsearch label set. If not, the entire <match> section evaluates as false. If there is such a pod with the label, in order for the <match> section to evaluate to true, the node label also needs to be node-role.kubernetes.io/master or node-role.kubernetes.io/infra.
If the labels for the profile with priority 10 matched, the openshift-control-plane-es profile is applied and no other profile is considered. If the node/pod label combination did not match, the second highest priority profile (openshift-control-plane) is considered. This profile is applied if the containerized TuneD pod runs on a node with the label node-role.kubernetes.io/master or node-role.kubernetes.io/infra.
Finally, the profile openshift-node has the lowest priority of 30. It lacks the <match> section and, therefore, will always match. It acts as a catch-all that sets the openshift-node profile if no other profile with a higher priority matches on a given node.
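The walk-through above can be reproduced with a short Python sketch that encodes the example recommend: list and picks the first matching profile in priority order. This is a model of the described behavior under stated assumptions, not the Operator's or the TuneD daemon's actual code: the function names and data layout are hypothetical, pod labels are passed as one dictionary per pod on the node, and an item without a type is assumed to refer to a node label.

```python
def has_label(label_sets, name, value=None):
    """True if any label set carries `name` (and `value`, when one is given)."""
    return any(
        name in labels and (value is None or labels[name] == value)
        for labels in label_sets
    )


def matches(match_list, node_labels, pod_label_sets):
    """List items are ORed; a nested match is ANDed with its parent item."""
    for item in match_list:
        scope = pod_label_sets if item.get("type") == "pod" else [node_labels]
        if has_label(scope, item["label"], item.get("value")) and (
            not item.get("match")
            or matches(item["match"], node_labels, pod_label_sets)
        ):
            return True
    return False


def select_profile(recommend, node_labels, pod_label_sets):
    """Return the profile of the first matching item, lowest priority number first."""
    for item in sorted(recommend, key=lambda r: r["priority"]):
        # An item without a match section always matches (catch-all).
        if "match" not in item or matches(item["match"], node_labels, pod_label_sets):
            return item["profile"]
    return None


# The example recommend: list from the CR above, written as Python data.
RECOMMEND = [
    {
        "match": [{
            "label": "tuned.openshift.io/elasticsearch",
            "type": "pod",
            "match": [
                {"label": "node-role.kubernetes.io/master"},
                {"label": "node-role.kubernetes.io/infra"},
            ],
        }],
        "priority": 10,
        "profile": "openshift-control-plane-es",
    },
    {
        "match": [
            {"label": "node-role.kubernetes.io/master"},
            {"label": "node-role.kubernetes.io/infra"},
        ],
        "priority": 20,
        "profile": "openshift-control-plane",
    },
    {"priority": 30, "profile": "openshift-node"},
]

master = {"node-role.kubernetes.io/master": ""}
es_pod = {"tuned.openshift.io/elasticsearch": ""}
print(select_profile(RECOMMEND, master, [es_pod]))
print(select_profile(RECOMMEND, master, []))
print(select_profile(RECOMMEND, {"node-role.kubernetes.io/worker": ""}, [es_pod]))
```

In this model, a master node running a labeled Elasticsearch pod selects openshift-control-plane-es, a master node without such a pod falls through to openshift-control-plane, and a worker node falls through to the openshift-node catch-all even when the labeled pod is present.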