You can use PolicyGenTemplate
CRs to deploy custom functionality in your managed clusters.
If you require cluster configuration changes outside of the base GitOps ZTP pipeline configuration, there are three options:
Option 1: When the GitOps ZTP pipeline deployment is complete, the deployed cluster is ready for application workloads. At this point, you can install additional Operators and apply configurations specific to your requirements. Ensure that additional configurations do not negatively affect the performance of the platform or allocated CPU budget.
Option 2: The base source custom resources (CRs) that you deploy with the GitOps ZTP pipeline can be augmented with custom content as required.
Option 3: Extra manifests are applied during installation and make the installation process more efficient.
Providing additional source CRs or modifying existing source CRs can significantly impact the performance or CPU profile of OKD. |
PolicyGenTemplate
custom resources (CRs) allow you to overlay additional configuration details on top of the base source CRs provided with the GitOps plugin in the ztp-site-generate
container. You can think of PolicyGenTemplate
CRs as a logical merge or patch to the base CR. Use PolicyGenTemplate
CRs to update a single field of the base CR, or overlay the entire contents of the base CR. You can update values and insert fields that are not in the base CR.
The following example procedure describes how to update fields in the generated PerformanceProfile
CR for the reference configuration based on the PolicyGenTemplate
CR in the group-du-sno-ranGen.yaml
file. Use the procedure as a basis for modifying other parts of the PolicyGenTemplate
based on your requirements.
Create a Git repository where you manage your custom site configuration data. The repository must be accessible from the hub cluster and be defined as a source repository for Argo CD.
Review the baseline source CR for existing content. You can review the source CRs listed in the reference PolicyGenTemplate
CRs by extracting them from the zero touch provisioning (ZTP) container.
Create an output folder:
$ mkdir -p ./out
Extract the source CRs:
$ podman run --log-driver=none --rm registry.redhat.io/openshift4/ztp-site-generate-rhel8:v4.12.1 extract /home/ztp --tar | tar x -C ./out
Review the baseline PerformanceProfile
CR in ./out/source-crs/PerformanceProfile.yaml
:
apiVersion: performance.openshift.io/v2
kind: PerformanceProfile
metadata:
  name: $name
  annotations:
    ran.openshift.io/ztp-deploy-wave: "10"
spec:
  additionalKernelArgs:
  - "idle=poll"
  - "rcupdate.rcu_normal_after_boot=0"
  cpu:
    isolated: $isolated
    reserved: $reserved
  hugepages:
    defaultHugepagesSize: $defaultHugepagesSize
    pages:
      - size: $size
        count: $count
        node: $node
  machineConfigPoolSelector:
    pools.operator.machineconfiguration.openshift.io/$mcp: ""
  net:
    userLevelNetworking: true
  nodeSelector:
    node-role.kubernetes.io/$mcp: ''
  numa:
    topologyPolicy: "restricted"
  realTimeKernel:
    enabled: true
Any fields in the source CR which contain $... syntax that are not provided in the PolicyGenTemplate CR are removed from the generated CR. |
Update the PolicyGenTemplate
entry for PerformanceProfile
in the group-du-sno-ranGen.yaml
reference file. The following example PolicyGenTemplate
CR stanza supplies appropriate CPU specifications, sets the hugepages
configuration, and adds a new field that sets globallyDisableIrqLoadBalancing
to false.
- fileName: PerformanceProfile.yaml
  policyName: "config-policy"
  metadata:
    name: openshift-node-performance-profile
  spec:
    cpu:
      # These must be tailored for the specific hardware platform
      isolated: "2-19,22-39"
      reserved: "0-1,20-21"
    hugepages:
      defaultHugepagesSize: 1G
      pages:
        - size: 1G
          count: 10
    globallyDisableIrqLoadBalancing: false
Commit the PolicyGenTemplate
change in Git, and then push to the Git repository being monitored by the GitOps ZTP Argo CD application.
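For example, assuming that you edited group-du-sno-ranGen.yaml in a local clone of the monitored repository and that the tracked branch is named main (both assumptions for illustration), the Git workflow might look like the following:
$ git add group-du-sno-ranGen.yaml
$ git commit -m "Update PerformanceProfile CPU and hugepages settings"
$ git push origin main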
The ZTP application generates an RHACM policy that contains the generated PerformanceProfile
CR. The contents of that CR are derived by merging the metadata
and spec
contents from the PerformanceProfile
entry in the PolicyGenTemplate
onto the source CR. The resulting CR has the following content:
---
apiVersion: performance.openshift.io/v2
kind: PerformanceProfile
metadata:
  name: openshift-node-performance-profile
spec:
  additionalKernelArgs:
    - idle=poll
    - rcupdate.rcu_normal_after_boot=0
  cpu:
    isolated: 2-19,22-39
    reserved: 0-1,20-21
  globallyDisableIrqLoadBalancing: false
  hugepages:
    defaultHugepagesSize: 1G
    pages:
      - count: 10
        size: 1G
  machineConfigPoolSelector:
    pools.operator.machineconfiguration.openshift.io/master: ""
  net:
    userLevelNetworking: true
  nodeSelector:
    node-role.kubernetes.io/master: ""
  numa:
    topologyPolicy: restricted
  realTimeKernel:
    enabled: true
Perform the following procedure to add new content to the ZTP pipeline.
Create a subdirectory named source-crs
in the directory that contains the kustomization.yaml
file for the PolicyGenTemplate
custom resource (CR).
Add your custom CRs to the source-crs
subdirectory, as shown in the following example:
example
└── policygentemplates
    ├── dev.yaml
    ├── kustomization.yaml
    ├── mec-edge-sno1.yaml
    ├── sno.yaml
    └── source-crs (1)
        ├── PaoCatalogSource.yaml
        ├── PaoSubscription.yaml
        ├── custom-crs
        |   ├── apiserver-config.yaml
        |   └── disable-nic-lldp.yaml
        └── elasticsearch
            ├── ElasticsearchNS.yaml
            └── ElasticsearchOperatorGroup.yaml
1 | The source-crs subdirectory must be in the same directory as the kustomization.yaml file. |
To use your own resources, ensure that the custom CR names differ from the default source CRs provided in the ZTP container. |
Update the required PolicyGenTemplate
CRs to include references to the content you added in the source-crs/custom-crs
and source-crs/elasticsearch
directories. For example:
apiVersion: ran.openshift.io/v1
kind: PolicyGenTemplate
metadata:
  name: "group-dev"
  namespace: "ztp-clusters"
spec:
  bindingRules:
    dev: "true"
  mcp: "master"
  sourceFiles:
    # These policies/CRs come from the internal container Image
    # Cluster Logging
    - fileName: ClusterLogNS.yaml
      remediationAction: inform
      policyName: "group-dev-cluster-log-ns"
    - fileName: ClusterLogOperGroup.yaml
      remediationAction: inform
      policyName: "group-dev-cluster-log-operator-group"
    - fileName: ClusterLogSubscription.yaml
      remediationAction: inform
      policyName: "group-dev-cluster-log-sub"
    # Local Storage Operator
    - fileName: StorageNS.yaml
      remediationAction: inform
      policyName: "group-dev-lso-ns"
    - fileName: StorageOperGroup.yaml
      remediationAction: inform
      policyName: "group-dev-lso-operator-group"
    - fileName: StorageSubscription.yaml
      remediationAction: inform
      policyName: "group-dev-lso-sub"
    # These are custom local policies that come from the source-crs directory in the git repo
    # Performance Addon Operator
    - fileName: PaoSubscriptionNS.yaml
      remediationAction: inform
      policyName: "group-dev-pao-ns"
    - fileName: PaoSubscriptionCatalogSource.yaml
      remediationAction: inform
      policyName: "group-dev-pao-cat-source"
      spec:
        image: <image_URL_here>
    - fileName: PaoSubscription.yaml
      remediationAction: inform
      policyName: "group-dev-pao-sub"
    # Elasticsearch Operator
    - fileName: elasticsearch/ElasticsearchNS.yaml (1)
      remediationAction: inform
      policyName: "group-dev-elasticsearch-ns"
    - fileName: elasticsearch/ElasticsearchOperatorGroup.yaml
      remediationAction: inform
      policyName: "group-dev-elasticsearch-operator-group"
    # Custom Resources
    - fileName: custom-crs/apiserver-config.yaml (1)
      remediationAction: inform
      policyName: "group-dev-apiserver-config"
    - fileName: custom-crs/disable-nic-lldp.yaml
      remediationAction: inform
      policyName: "group-dev-disable-nic-lldp"
1 | Set the fileName field to include the relative path to the file from the /source-crs parent directory. |
Commit the PolicyGenTemplate
change in Git, and then push to the Git repository that is monitored by the GitOps ZTP Argo CD policies application.
Update the ClusterGroupUpgrade
CR to include the changed PolicyGenTemplate
and save it as cgu-test.yaml
. The following example shows a generated cgu-test.yaml
file.
apiVersion: ran.openshift.io/v1alpha1
kind: ClusterGroupUpgrade
metadata:
  name: custom-source-cr
  namespace: ztp-clusters
spec:
  managedPolicies:
    - group-dev-config-policy
  enable: true
  clusters:
  - cluster1
  remediationStrategy:
    maxConcurrency: 2
    timeout: 240
Apply the updated ClusterGroupUpgrade
CR by running the following command:
$ oc apply -f cgu-test.yaml
Check that the updates have succeeded by running the following command:
$ oc get cgu -A
NAMESPACE NAME AGE STATE DETAILS
ztp-clusters custom-source-cr 6s InProgress Remediating non-compliant policies
ztp-install cluster1 19h Completed All clusters are compliant with all the managed policies
Use Red Hat Advanced Cluster Management (RHACM) installed on a hub cluster to monitor and report on whether your managed clusters are compliant with applied policies. RHACM uses policy templates to apply predefined policy controllers and policies. Policy controllers are Kubernetes custom resource definition (CRD) instances.
You can override the default policy evaluation intervals with PolicyGenTemplate
custom resources (CRs). You configure duration settings that define how long a ConfigurationPolicy
CR can be in a state of policy compliance or non-compliance before RHACM re-evaluates the applied cluster policies.
The zero touch provisioning (ZTP) policy generator generates ConfigurationPolicy
CR policies with pre-defined policy evaluation intervals. The default value for the noncompliant
state is 10 seconds. The default value for the compliant
state is 10 minutes. To disable the evaluation interval, set the value to never
.
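For illustration, the evaluation interval appears under spec.evaluationInterval in each generated ConfigurationPolicy CR. The following minimal sketch shows the default values described above; the policy name is illustrative only:
apiVersion: policy.open-cluster-management.io/v1
kind: ConfigurationPolicy
metadata:
  name: example-config-policy
spec:
  evaluationInterval:
    compliant: 10m
    noncompliant: 10s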
You have installed the OpenShift CLI (oc
).
You have logged in to the hub cluster as a user with cluster-admin
privileges.
You have created a Git repository where you manage your custom site configuration data.
To configure the evaluation interval for all policies in a PolicyGenTemplate
CR, add evaluationInterval
to the spec
field, and then set the appropriate compliant
and noncompliant
values. For example:
spec:
  evaluationInterval:
    compliant: 30m
    noncompliant: 20s
To configure the evaluation interval for the spec.sourceFiles
object in a PolicyGenTemplate
CR, add evaluationInterval
to the sourceFiles
field, for example:
spec:
  sourceFiles:
    - fileName: SriovSubscription.yaml
      policyName: "sriov-sub-policy"
      evaluationInterval:
        compliant: never
        noncompliant: 10s
Commit the PolicyGenTemplate
CRs files in the Git repository and push your changes.
Check that the managed spoke cluster policies are monitored at the expected intervals.
Log in as a user with cluster-admin
privileges on the managed cluster.
Get the pods that are running in the open-cluster-management-agent-addon
namespace. Run the following command:
$ oc get pods -n open-cluster-management-agent-addon
NAME READY STATUS RESTARTS AGE
config-policy-controller-858b894c68-v4xdb 1/1 Running 22 (5d8h ago) 10d
Check that the applied policies are being evaluated at the expected interval in the logs for the config-policy-controller
pod:
$ oc logs -n open-cluster-management-agent-addon config-policy-controller-858b894c68-v4xdb
2022-05-10T15:10:25.280Z info configuration-policy-controller controllers/configurationpolicy_controller.go:166 Skipping the policy evaluation due to the policy not reaching the evaluation interval {"policy": "compute-1-config-policy-config"}
2022-05-10T15:10:25.280Z info configuration-policy-controller controllers/configurationpolicy_controller.go:166 Skipping the policy evaluation due to the policy not reaching the evaluation interval {"policy": "compute-1-common-compute-1-catalog-policy-config"}
Create a validator inform policy that signals when the zero touch provisioning (ZTP) installation and configuration of the deployed cluster is complete. This policy can be used for deployments of single-node OpenShift clusters, three-node clusters, and standard clusters.
Create a standalone PolicyGenTemplate
custom resource (CR) that contains the source file
validatorCRs/informDuValidator.yaml
. You only need one standalone PolicyGenTemplate
CR for each cluster type. For example, this CR applies a validator inform policy for single-node OpenShift clusters:
apiVersion: ran.openshift.io/v1
kind: PolicyGenTemplate
metadata:
  name: "group-du-sno-validator" (1)
  namespace: "ztp-group" (2)
spec:
  bindingRules:
    group-du-sno: "" (3)
  bindingExcludedRules:
    ztp-done: "" (4)
  mcp: "master" (5)
  sourceFiles:
    - fileName: validatorCRs/informDuValidator.yaml
      remediationAction: inform (6)
      policyName: "du-policy" (7)
1 | The name of the PolicyGenTemplate object. This name is also used as part of the names for the placementBinding, placementRule, and policy that are created in the requested namespace. |
2 | This value should match the namespace used in the group PolicyGenTemplates . |
3 | The group-du-* label defined in bindingRules must exist in the SiteConfig files. |
4 | The label defined in bindingExcludedRules must be `ztp-done:`. The ztp-done label is used in coordination with the Topology Aware Lifecycle Manager. |
5 | mcp defines the MachineConfigPool object that is used in the source file validatorCRs/informDuValidator.yaml . It should be master for single node and three-node cluster deployments and worker for standard cluster deployments. |
6 | Optional. The default value is inform . |
7 | This value is used as part of the name for the generated RHACM policy. The generated validator policy for the single node example is group-du-sno-validator-du-policy . |
Commit the PolicyGenTemplate
CR file in your Git repository and push the changes.
You can use the GitOps ZTP pipeline to configure PTP events that use HTTP or AMQP transport.
HTTP transport is the default transport for PTP and bare-metal events. Use HTTP transport instead of AMQP for PTP and bare-metal events where possible. AMQ Interconnect is EOL from 30 June 2024. Extended life cycle support (ELS) for AMQ Interconnect ends 29 November 2029. For more information, see Red Hat AMQ Interconnect support status. |
You can configure PTP events that use HTTP transport on managed clusters that you deploy with the GitOps Zero Touch Provisioning (ZTP) pipeline.
You have installed the OpenShift CLI (oc
).
You have logged in as a user with cluster-admin
privileges.
You have created a Git repository where you manage your custom site configuration data.
Apply the following PolicyGenTemplate
changes to group-du-3node-ranGen.yaml
, group-du-sno-ranGen.yaml
, or group-du-standard-ranGen.yaml
files according to your requirements:
In .sourceFiles
, add the PtpOperatorConfig
CR file that configures the transport host:
- fileName: PtpOperatorConfigForEvent.yaml
  policyName: "config-policy"
  spec:
    daemonNodeSelector: {}
    ptpEventConfig:
      enableEventPublisher: true
      transportHost: http://ptp-event-publisher-service-NODE_NAME.openshift-ptp.svc.cluster.local:9043
In OKD 4.12 or later, you do not need to set the transportHost field in the PtpOperatorConfig resource when you use HTTP transport with PTP events. |
Configure the linuxptp and phc2sys services for the PTP clock type and interface. For example, add the following stanza into .sourceFiles
:
- fileName: PtpConfigSlave.yaml (1)
  policyName: "config-policy"
  metadata:
    name: "du-ptp-slave"
  spec:
    profile:
      - name: "slave"
        interface: "ens5f1" (2)
        ptp4lOpts: "-2 -s --summary_interval -4" (3)
        phc2sysOpts: "-a -r -m -n 24 -N 8 -R 16" (4)
    ptpClockThreshold: (5)
      holdOverTimeout: 30 #secs
      maxOffsetThreshold: 100 #nano secs
      minOffsetThreshold: -100 #nano secs
1 | Can be one of PtpConfigMaster.yaml , PtpConfigSlave.yaml , or PtpConfigSlaveCvl.yaml depending on your requirements. PtpConfigSlaveCvl.yaml configures linuxptp services for an Intel E810 Columbiaville NIC. For configurations based on group-du-sno-ranGen.yaml or group-du-3node-ranGen.yaml , use PtpConfigSlave.yaml . |
2 | Device specific interface name. |
3 | You must append the --summary_interval -4 value to ptp4lOpts in .spec.sourceFiles.spec.profile to enable PTP fast events. |
4 | Required phc2sysOpts values. -m prints messages to stdout . The linuxptp-daemon DaemonSet parses the logs and generates Prometheus metrics. |
5 | Optional. If the ptpClockThreshold stanza is not present, default values are used for the ptpClockThreshold fields. The stanza shows default ptpClockThreshold values. The ptpClockThreshold values configure how long after the PTP master clock is disconnected before PTP events are triggered. holdOverTimeout is the time value in seconds before the PTP clock event state changes to FREERUN when the PTP master clock is disconnected. The maxOffsetThreshold and minOffsetThreshold settings configure offset values in nanoseconds that compare against the values for CLOCK_REALTIME (phc2sys ) or master offset (ptp4l ). When the ptp4l or phc2sys offset value is outside this range, the PTP clock state is set to FREERUN . When the offset value is within this range, the PTP clock state is set to LOCKED . |
Merge any other required changes and files with your custom site repository.
Push the changes to your site configuration repository to deploy PTP fast events to new sites using GitOps ZTP.
You can configure PTP events that use AMQP transport on managed clusters that you deploy with the GitOps Zero Touch Provisioning (ZTP) pipeline.
HTTP transport is the default transport for PTP and bare-metal events. Use HTTP transport instead of AMQP for PTP and bare-metal events where possible. AMQ Interconnect is EOL from 30 June 2024. Extended life cycle support (ELS) for AMQ Interconnect ends 29 November 2029. For more information, see Red Hat AMQ Interconnect support status. |
You have installed the OpenShift CLI (oc
).
You have logged in as a user with cluster-admin
privileges.
You have created a Git repository where you manage your custom site configuration data.
Add the following YAML into .spec.sourceFiles
in the common-ranGen.yaml
file to configure the AMQP Operator:
#AMQ interconnect operator for fast events
- fileName: AmqSubscriptionNS.yaml
  policyName: "subscriptions-policy"
- fileName: AmqSubscriptionOperGroup.yaml
  policyName: "subscriptions-policy"
- fileName: AmqSubscription.yaml
  policyName: "subscriptions-policy"
Apply the following PolicyGenTemplate
changes to group-du-3node-ranGen.yaml
, group-du-sno-ranGen.yaml
, or group-du-standard-ranGen.yaml
files according to your requirements:
In .sourceFiles
, add the PtpOperatorConfig
CR file that configures the AMQ transport host to the config-policy
:
- fileName: PtpOperatorConfigForEvent.yaml
  policyName: "config-policy"
  spec:
    daemonNodeSelector: {}
    ptpEventConfig:
      enableEventPublisher: true
      transportHost: "amqp://amq-router.amq-router.svc.cluster.local"
Configure the linuxptp and phc2sys services for the PTP clock type and interface. For example, add the following stanza into .sourceFiles
:
- fileName: PtpConfigSlave.yaml (1)
  policyName: "config-policy"
  metadata:
    name: "du-ptp-slave"
  spec:
    profile:
      - name: "slave"
        interface: "ens5f1" (2)
        ptp4lOpts: "-2 -s --summary_interval -4" (3)
        phc2sysOpts: "-a -r -m -n 24 -N 8 -R 16" (4)
    ptpClockThreshold: (5)
      holdOverTimeout: 30 #secs
      maxOffsetThreshold: 100 #nano secs
      minOffsetThreshold: -100 #nano secs
1 | Can be one of PtpConfigMaster.yaml, PtpConfigSlave.yaml, or PtpConfigSlaveCvl.yaml depending on your requirements. PtpConfigSlaveCvl.yaml configures linuxptp services for an Intel E810 Columbiaville NIC. For configurations based on group-du-sno-ranGen.yaml or group-du-3node-ranGen.yaml, use PtpConfigSlave.yaml. |
2 | Device specific interface name. |
3 | You must append the --summary_interval -4 value to ptp4lOpts in .spec.sourceFiles.spec.profile to enable PTP fast events. |
4 | Required phc2sysOpts values. -m prints messages to stdout . The linuxptp-daemon DaemonSet parses the logs and generates Prometheus metrics. |
5 | Optional. If the ptpClockThreshold stanza is not present, default values are used for the ptpClockThreshold fields. The stanza shows default ptpClockThreshold values. The ptpClockThreshold values configure how long after the PTP master clock is disconnected before PTP events are triggered. holdOverTimeout is the time value in seconds before the PTP clock event state changes to FREERUN when the PTP master clock is disconnected. The maxOffsetThreshold and minOffsetThreshold settings configure offset values in nanoseconds that compare against the values for CLOCK_REALTIME (phc2sys ) or master offset (ptp4l ). When the ptp4l or phc2sys offset value is outside this range, the PTP clock state is set to FREERUN . When the offset value is within this range, the PTP clock state is set to LOCKED . |
Apply the following PolicyGenTemplate
changes to your specific site YAML files, for example, example-sno-site.yaml
:
In .sourceFiles
, add the Interconnect
CR file that configures the AMQ router to the config-policy
:
- fileName: AmqInstance.yaml
  policyName: "config-policy"
Merge any other required changes and files with your custom site repository.
Push the changes to your site configuration repository to deploy PTP fast events to new sites using GitOps ZTP.
You can use the GitOps ZTP pipeline to configure bare-metal events that use HTTP or AMQP transport.
HTTP transport is the default transport for PTP and bare-metal events. Use HTTP transport instead of AMQP for PTP and bare-metal events where possible. AMQ Interconnect is EOL from 30 June 2024. Extended life cycle support (ELS) for AMQ Interconnect ends 29 November 2029. For more information, see Red Hat AMQ Interconnect support status. |
You can configure bare-metal events that use HTTP transport on managed clusters that you deploy with the GitOps Zero Touch Provisioning (ZTP) pipeline.
You have installed the OpenShift CLI (oc
).
You have logged in as a user with cluster-admin
privileges.
You have created a Git repository where you manage your custom site configuration data.
Configure the Bare Metal Event Relay Operator by adding the following YAML to spec.sourceFiles
in the common-ranGen.yaml
file:
# Bare Metal Event Relay operator
- fileName: BareMetalEventRelaySubscriptionNS.yaml
  policyName: "subscriptions-policy"
- fileName: BareMetalEventRelaySubscriptionOperGroup.yaml
  policyName: "subscriptions-policy"
- fileName: BareMetalEventRelaySubscription.yaml
  policyName: "subscriptions-policy"
Add the HardwareEvent
CR to spec.sourceFiles
in your specific group configuration file, for example, in the group-du-sno-ranGen.yaml
file:
- fileName: HardwareEvent.yaml (1)
  policyName: "config-policy"
  spec:
    nodeSelector: {}
    transportHost: "http://hw-event-publisher-service.openshift-bare-metal-events.svc.cluster.local:9043"
    logLevel: "info"
1 | Each baseboard management controller (BMC) requires a single HardwareEvent CR only. |
In OKD 4.12 or later, you do not need to set the transportHost field in the HardwareEvent resource when you use HTTP transport for bare-metal events. |
Merge any other required changes and files with your custom site repository.
Push the changes to your site configuration repository to deploy bare-metal events to new sites with GitOps ZTP.
Create the Redfish Secret by running the following command:
$ oc -n openshift-bare-metal-events create secret generic redfish-basic-auth \
--from-literal=username=<bmc_username> --from-literal=password=<bmc_password> \
--from-literal=hostaddr="<bmc_host_ip_addr>"
You can configure bare-metal events that use AMQP transport on managed clusters that you deploy with the GitOps Zero Touch Provisioning (ZTP) pipeline.
HTTP transport is the default transport for PTP and bare-metal events. Use HTTP transport instead of AMQP for PTP and bare-metal events where possible. AMQ Interconnect is EOL from 30 June 2024. Extended life cycle support (ELS) for AMQ Interconnect ends 29 November 2029. For more information, see Red Hat AMQ Interconnect support status. |
You have installed the OpenShift CLI (oc
).
You have logged in as a user with cluster-admin
privileges.
You have created a Git repository where you manage your custom site configuration data.
To configure the AMQ Interconnect Operator and the Bare Metal Event Relay Operator, add the following YAML to spec.sourceFiles
in the common-ranGen.yaml
file:
# AMQ interconnect operator for fast events
- fileName: AmqSubscriptionNS.yaml
  policyName: "subscriptions-policy"
- fileName: AmqSubscriptionOperGroup.yaml
  policyName: "subscriptions-policy"
- fileName: AmqSubscription.yaml
  policyName: "subscriptions-policy"
# Bare Metal Event Relay operator
- fileName: BareMetalEventRelaySubscriptionNS.yaml
  policyName: "subscriptions-policy"
- fileName: BareMetalEventRelaySubscriptionOperGroup.yaml
  policyName: "subscriptions-policy"
- fileName: BareMetalEventRelaySubscription.yaml
  policyName: "subscriptions-policy"
Add the Interconnect
CR to .spec.sourceFiles
in the site configuration file, for example, the example-sno-site.yaml
file:
- fileName: AmqInstance.yaml
  policyName: "config-policy"
Add the HardwareEvent
CR to spec.sourceFiles
in your specific group configuration file, for example, in the group-du-sno-ranGen.yaml
file:
- fileName: HardwareEvent.yaml
  policyName: "config-policy"
  spec:
    nodeSelector: {}
    transportHost: "amqp://<amq_interconnect_name>.<amq_interconnect_namespace>.svc.cluster.local" (1)
    logLevel: "info"
1 | The transportHost URL is composed of the existing AMQ Interconnect CR name and namespace . For example, in transportHost: "amqp://amq-router.amq-router.svc.cluster.local" , the AMQ Interconnect name and namespace are both set to amq-router . |
Each baseboard management controller (BMC) requires a single HardwareEvent CR only. |
Commit the PolicyGenTemplate
change in Git, and then push the changes to your site configuration repository to deploy bare-metal events monitoring to new sites using GitOps ZTP.
Create the Redfish Secret by running the following command:
$ oc -n openshift-bare-metal-events create secret generic redfish-basic-auth \
--from-literal=username=<bmc_username> --from-literal=password=<bmc_password> \
--from-literal=hostaddr="<bmc_host_ip_addr>"
OKD manages image caching using a local registry. In edge computing use cases, clusters are often subject to bandwidth restrictions when communicating with centralized image registries, which might result in long image download times.
Long download times are unavoidable during initial deployment. Over time, there is a risk that CRI-O will erase the /var/lib/containers/storage
directory in the case of an unexpected shutdown.
To address long image download times, you can create a local image registry on remote managed clusters using GitOps ZTP. This is useful in Edge computing scenarios where clusters are deployed at the far edge of the network.
Before you can set up the local image registry with GitOps ZTP, you need to configure disk partitioning in the SiteConfig
CR that you use to install the remote managed cluster. After installation, you configure the local image registry using a PolicyGenTemplate
CR. Then, the ZTP pipeline creates Persistent Volume (PV) and Persistent Volume Claim (PVC) CRs and patches the imageregistry
configuration.
The local image registry can only be used for user application images and cannot be used for the OKD or Operator Lifecycle Manager operator images. |
Configure disk partitioning for a managed cluster using a SiteConfig
CR and GitOps Zero Touch Provisioning (ZTP). The disk partition details in the SiteConfig
CR must match the underlying disk.
You must complete this procedure at installation time. |
Install Butane.
Create the storage.bu
file by using the following example YAML file:
variant: fcos
version: 1.3.0
storage:
  disks:
    - device: /dev/disk/by-path/pci-0000:01:00.0-scsi-0:2:0:0 (1)
      wipe_table: false
      partitions:
        - label: var-lib-containers
          start_mib: <start_of_partition> (2)
          size_mib: <partition_size> (3)
  filesystems:
    - path: /var/lib/containers
      device: /dev/disk/by-partlabel/var-lib-containers
      format: xfs
      wipe_filesystem: true
      with_mount_unit: true
      mount_options:
        - defaults
        - prjquota
1 | Specify the root disk. |
2 | Specify the start of the partition in MiB. If the value is too small, the installation fails. |
3 | Specify the size of the partition. If the value is too small, the deployment fails. |
Convert the storage.bu
file to an Ignition file by running the following command:
$ butane storage.bu
{"ignition":{"version":"3.2.0"},"storage":{"disks":[{"device":"/dev/disk/by-path/pci-0000:01:00.0-scsi-0:2:0:0","partitions":[{"label":"var-lib-containers","sizeMiB":0,"startMiB":250000}],"wipeTable":false}],"filesystems":[{"device":"/dev/disk/by-partlabel/var-lib-containers","format":"xfs","mountOptions":["defaults","prjquota"],"path":"/var/lib/containers","wipeFilesystem":true}]},"systemd":{"units":[{"contents":"# # Generated by Butane\n[Unit]\nRequires=systemd-fsck@dev-disk-by\\x2dpartlabel-var\\x2dlib\\x2dcontainers.service\nAfter=systemd-fsck@dev-disk-by\\x2dpartlabel-var\\x2dlib\\x2dcontainers.service\n\n[Mount]\nWhere=/var/lib/containers\nWhat=/dev/disk/by-partlabel/var-lib-containers\nType=xfs\nOptions=defaults,prjquota\n\n[Install]\nRequiredBy=local-fs.target","enabled":true,"name":"var-lib-containers.mount"}]}}
Use a tool such as JSON Pretty Print to convert the output into JSON format.
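For example, if the jq tool is installed on your workstation (any JSON formatter works; jq is only an assumption here), you can pretty-print the Butane output directly:
$ butane storage.bu | jq .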
Copy the output into the .spec.clusters.nodes.ignitionConfigOverride
field in the SiteConfig
CR:
[...]
spec:
  clusters:
    - nodes:
        - ignitionConfigOverride: |
            {
              "ignition": {
                "version": "3.2.0"
              },
              "storage": {
                "disks": [
                  {
                    "device": "/dev/disk/by-path/pci-0000:01:00.0-scsi-0:2:0:0",
                    "partitions": [
                      {
                        "label": "var-lib-containers",
                        "sizeMiB": 0,
                        "startMiB": 250000
                      }
                    ],
                    "wipeTable": false
                  }
                ],
                "filesystems": [
                  {
                    "device": "/dev/disk/by-partlabel/var-lib-containers",
                    "format": "xfs",
                    "mountOptions": [
                      "defaults",
                      "prjquota"
                    ],
                    "path": "/var/lib/containers",
                    "wipeFilesystem": true
                  }
                ]
              },
              "systemd": {
                "units": [
                  {
                    "contents": "# # Generated by Butane\n[Unit]\nRequires=systemd-fsck@dev-disk-by\\x2dpartlabel-var\\x2dlib\\x2dcontainers.service\nAfter=systemd-fsck@dev-disk-by\\x2dpartlabel-var\\x2dlib\\x2dcontainers.service\n\n[Mount]\nWhere=/var/lib/containers\nWhat=/dev/disk/by-partlabel/var-lib-containers\nType=xfs\nOptions=defaults,prjquota\n\n[Install]\nRequiredBy=local-fs.target",
                    "enabled": true,
                    "name": "var-lib-containers.mount"
                  }
                ]
              }
            }
[...]
During or after installation, verify on the hub cluster that the BareMetalHost
object shows the annotation by running the following command:
$ oc get bmh -n my-sno-ns my-sno -ojson | jq '.metadata.annotations["bmac.agent-install.openshift.io/ignition-config-overrides"]'
"{\"ignition\":{\"version\":\"3.2.0\"},\"storage\":{\"disks\":[{\"device\":\"/dev/disk/by-id/wwn-0x6b07b250ebb9d0002a33509f24af1f62\",\"partitions\":[{\"label\":\"var-lib-containers\",\"sizeMiB\":0,\"startMiB\":250000}],\"wipeTable\":false}],\"filesystems\":[{\"device\":\"/dev/disk/by-partlabel/var-lib-containers\",\"format\":\"xfs\",\"mountOptions\":[\"defaults\",\"prjquota\"],\"path\":\"/var/lib/containers\",\"wipeFilesystem\":true}]},\"systemd\":{\"units\":[{\"contents\":\"# Generated by Butane\\n[Unit]\\nRequires=systemd-fsck@dev-disk-by\\\\x2dpartlabel-var\\\\x2dlib\\\\x2dcontainers.service\\nAfter=systemd-fsck@dev-disk-by\\\\x2dpartlabel-var\\\\x2dlib\\\\x2dcontainers.service\\n\\n[Mount]\\nWhere=/var/lib/containers\\nWhat=/dev/disk/by-partlabel/var-lib-containers\\nType=xfs\\nOptions=defaults,prjquota\\n\\n[Install]\\nRequiredBy=local-fs.target\",\"enabled\":true,\"name\":\"var-lib-containers.mount\"}]}}"
After installation, check the single-node OpenShift disk status:
Enter into a debug session on the single-node OpenShift node by running the following command.
This step instantiates a debug pod called <node_name>-debug
:
$ oc debug node/my-sno-node
Set /host
as the root directory within the debug shell by running the following command.
The debug pod mounts the host’s root file system in /host
within the pod. By changing the root directory to /host
, you can run binaries contained in the host’s executable paths:
# chroot /host
List information about all available block devices by running the following command:
# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
sda 8:0 0 446.6G 0 disk
├─sda1 8:1 0 1M 0 part
├─sda2 8:2 0 127M 0 part
├─sda3 8:3 0 384M 0 part /boot
├─sda4 8:4 0 243.6G 0 part /var
│ /sysroot/ostree/deploy/rhcos/var
│ /usr
│ /etc
│ /
│ /sysroot
└─sda5 8:5 0 202.5G 0 part /var/lib/containers
Display information about the file system disk space usage by running the following command:
# df -h
Filesystem Size Used Avail Use% Mounted on
devtmpfs 4.0M 0 4.0M 0% /dev
tmpfs 126G 84K 126G 1% /dev/shm
tmpfs 51G 93M 51G 1% /run
/dev/sda4 244G 5.2G 239G 3% /sysroot
tmpfs 126G 4.0K 126G 1% /tmp
/dev/sda5 203G 119G 85G 59% /var/lib/containers
/dev/sda3 350M 110M 218M 34% /boot
tmpfs 26G 0 26G 0% /run/user/1000
Use PolicyGenTemplate
(PGT) CRs to apply the CRs required to configure the image registry and patch the imageregistry
configuration.
You have configured a disk partition in the managed cluster.
You have installed the OpenShift CLI (oc
).
You have logged in to the hub cluster as a user with cluster-admin
privileges.
You have created a Git repository where you manage your custom site configuration data for use with GitOps Zero Touch Provisioning (ZTP).
Configure the storage class, persistent volume claim, persistent volume, and image registry configuration in the appropriate PolicyGenTemplate
CR. For example, to configure an individual site, add the following YAML to the file example-sno-site.yaml
:
sourceFiles:
  # storage class
  - fileName: StorageClass.yaml
    policyName: "sc-for-image-registry"
    metadata:
      name: image-registry-sc
      annotations:
        ran.openshift.io/ztp-deploy-wave: "100" (1)
  # persistent volume claim
  - fileName: StoragePVC.yaml
    policyName: "pvc-for-image-registry"
    metadata:
      name: image-registry-pvc
      namespace: openshift-image-registry
      annotations:
        ran.openshift.io/ztp-deploy-wave: "100"
    spec:
      accessModes:
        - ReadWriteMany
      resources:
        requests:
          storage: 100Gi
      storageClassName: image-registry-sc
      volumeMode: Filesystem
  # persistent volume
  - fileName: ImageRegistryPV.yaml (2)
    policyName: "pv-for-image-registry"
    metadata:
      annotations:
        ran.openshift.io/ztp-deploy-wave: "100"
  - fileName: ImageRegistryConfig.yaml
    policyName: "config-for-image-registry"
    complianceType: musthave
    metadata:
      annotations:
        ran.openshift.io/ztp-deploy-wave: "100"
    spec:
      storage:
        pvc:
          claim: "image-registry-pvc"
1 | Set the appropriate value for ztp-deploy-wave depending on whether you are configuring image registries at the site, common, or group level. ztp-deploy-wave: "100" is suitable for development or testing because it allows you to group the referenced source files together. |
2 | In ImageRegistryPV.yaml , ensure that the spec.local.path field is set to /var/imageregistry to match the value set for the mount_point field in the SiteConfig CR. |
Do not set complianceType: mustonlyhave for the - fileName: ImageRegistryConfig.yaml configuration. This can cause the registry pod deployment to fail. |
Commit the PolicyGenTemplate
change in Git, and then push to the Git repository being monitored by the GitOps ZTP ArgoCD application.
Use the following steps to troubleshoot errors with the local image registry on the managed clusters:
Verify successful login to the registry while logged in to the managed cluster. Run the following commands:
Export the managed cluster name:
$ cluster=<managed_cluster_name>
Get the managed cluster kubeconfig
details:
$ oc get secret -n $cluster $cluster-admin-password -o jsonpath='{.data.password}' | base64 -d > kubeadmin-password-$cluster
Download and export the cluster kubeconfig
:
$ oc get secret -n $cluster $cluster-admin-kubeconfig -o jsonpath='{.data.kubeconfig}' | base64 -d > kubeconfig-$cluster && export KUBECONFIG=./kubeconfig-$cluster
Verify access to the image registry from the managed cluster. See "Accessing the registry".
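For example, as a quick initial check before you follow the full procedure in "Accessing the registry", you can confirm that the image registry cluster Operator reports an Available status on the managed cluster:
$ oc get clusteroperator image-registry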
Check that the Config
CRD in the imageregistry.operator.openshift.io
group instance is not reporting errors. Run the following command while logged in to the managed cluster:
$ oc get image.config.openshift.io cluster -o yaml
apiVersion: config.openshift.io/v1
kind: Image
metadata:
  annotations:
    include.release.openshift.io/ibm-cloud-managed: "true"
    include.release.openshift.io/self-managed-high-availability: "true"
    include.release.openshift.io/single-node-developer: "true"
    release.openshift.io/create-only: "true"
  creationTimestamp: "2021-10-08T19:02:39Z"
  generation: 5
  name: cluster
  resourceVersion: "688678648"
  uid: 0406521b-39c0-4cda-ba75-873697da75a4
spec:
  additionalTrustedCA:
    name: acm-ice
Check that the PersistentVolumeClaim
on the managed cluster is populated with data. Run the following command while logged in to the managed cluster:
$ oc get pv image-registry-sc
Check that the registry*
pod is running and is located under the openshift-image-registry
namespace.
$ oc get pods -n openshift-image-registry | grep registry*
cluster-image-registry-operator-68f5c9c589-42cfg 1/1 Running 0 8d
image-registry-5f8987879-6nx6h 1/1 Running 0 8d
Check that the disk partition on the managed cluster is correct:
Open a debug shell to the managed cluster:
$ oc debug node/sno-1.example.com
Run lsblk
to check the host disk partitions:
sh-4.4# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 446.6G 0 disk
|-sda1 8:1 0 1M 0 part
|-sda2 8:2 0 127M 0 part
|-sda3 8:3 0 384M 0 part /boot
|-sda4 8:4 0 336.3G 0 part /sysroot
`-sda5 8:5 0 100.1G 0 part /var/imageregistry (1)
sdb 8:16 0 446.6G 0 disk
sr0 11:0 1 104M 0 rom
1 | /var/imageregistry indicates that the disk is correctly partitioned. |
Topology Aware Lifecycle Manager supports partial Red Hat Advanced Cluster Management (RHACM) hub cluster template functions in configuration policies used with GitOps ZTP.
Hub-side cluster templates allow you to define configuration policies that can be dynamically customized to the target clusters. This reduces the need to create separate policies for many clusters with similar configurations but with different values.
Policy templates are restricted to the same namespace as the namespace where the policy is defined. This means that you must create the objects referenced in the hub template in the same namespace where the policy is created. |
The following supported hub template functions are available for use in GitOps ZTP with TALM:
fromConfigMap: Returns the value of the provided data key in the named ConfigMap resource.
There is a 1 MiB size limit for ConfigMap CRs. |
base64enc: Returns the base64-encoded value of the input string.
base64dec: Returns the decoded value of the base64-encoded input string.
indent: Returns the input string with added indent spaces.
autoindent: Returns the input string with added indent spaces based on the spacing used in the parent template.
toInt: Casts and returns the integer value of the input value.
toBool: Converts the input string into a boolean value, and returns the boolean.
Various open source community functions are also available for use with GitOps ZTP.
The following code examples are valid hub templates. Each of these templates returns values from the ConfigMap
CR with the name test-config
in the default
namespace.
Returns the value with the key common-key
:
{{hub fromConfigMap "default" "test-config" "common-key" hub}}
Returns a string by using the concatenated value of the .ManagedClusterName
field and the string -name
:
{{hub fromConfigMap "default" "test-config" (printf "%s-name" .ManagedClusterName) hub}}
Casts and returns a boolean value from the concatenated value of the .ManagedClusterName
field and the string -name
:
{{hub fromConfigMap "default" "test-config" (printf "%s-name" .ManagedClusterName) | toBool hub}}
Casts and returns an integer value from the concatenated value of the .ManagedClusterName
field and the string -name
:
{{hub (printf "%s-name" .ManagedClusterName) | fromConfigMap "default" "test-config" | toInt hub}}
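Hub template functions can be chained in the same way in other combinations. For example, assuming that the value stored under the common-key data key is base64-encoded (an illustrative assumption), the following template returns the decoded string:
{{hub fromConfigMap "default" "test-config" "common-key" | base64dec hub}}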
You can manage host NICs in a single ConfigMap
CR and use hub cluster templates to populate the custom NIC values in the generated policies that get applied to the cluster hosts.
Using hub cluster templates in site PolicyGenTemplate
(PGT) CRs means that you do not need to create multiple single site PGT CRs for each site.
The following example shows you how to use a single ConfigMap
CR to manage cluster host NICs and apply them to the cluster as polices by using a single PolicyGenTemplate
site CR.
You have installed the OpenShift CLI (oc
).
You have logged in to the hub cluster as a user with cluster-admin
privileges.
You have created a Git repository where you manage your custom site configuration data. The repository must be accessible from the hub cluster and be defined as a source repository for the GitOps ZTP ArgoCD application.
Create a ConfigMap
resource that describes the NICs for a group of hosts. For example:
apiVersion: v1
kind: ConfigMap
metadata:
  name: sriovdata
  namespace: ztp-site
  annotations:
    argocd.argoproj.io/sync-options: Replace=true (1)
data:
  example-sno-du_fh-numVfs: "8"
  example-sno-du_fh-pf: ens1f0
  example-sno-du_fh-priority: "10"
  example-sno-du_fh-vlan: "140"
  example-sno-du_mh-numVfs: "8"
  example-sno-du_mh-pf: ens3f0
  example-sno-du_mh-priority: "10"
  example-sno-du_mh-vlan: "150"
1 | The argocd.argoproj.io/sync-options annotation is required only if the ConfigMap is larger than 1 MiB in size. |
Commit the ConfigMap
CR in Git, and then push to the Git repository being monitored by the Argo CD application.
Create a site PGT CR that uses templates to pull the required data from the ConfigMap
object. For example:
apiVersion: ran.openshift.io/v1
kind: PolicyGenTemplate
metadata:
  name: "site"
  namespace: "ztp-site"
spec:
  remediationAction: inform
  bindingRules:
    group-du-sno: ""
  mcp: "master"
  sourceFiles:
    - fileName: SriovNetwork.yaml
      policyName: "config-policy"
      metadata:
        name: "sriov-nw-du-fh"
      spec:
        resourceName: du_fh
        vlan: '{{hub fromConfigMap "ztp-site" "sriovdata" (printf "%s-du_fh-vlan" .ManagedClusterName) | toInt hub}}'
    - fileName: SriovNetworkNodePolicy.yaml
      policyName: "config-policy"
      metadata:
        name: "sriov-nnp-du-fh"
      spec:
        deviceType: netdevice
        isRdma: true
        nicSelector:
          pfNames:
            - '{{hub fromConfigMap "ztp-site" "sriovdata" (printf "%s-du_fh-pf" .ManagedClusterName) | autoindent hub}}'
        numVfs: '{{hub fromConfigMap "ztp-site" "sriovdata" (printf "%s-du_fh-numVfs" .ManagedClusterName) | toInt hub}}'
        priority: '{{hub fromConfigMap "ztp-site" "sriovdata" (printf "%s-du_fh-priority" .ManagedClusterName) | toInt hub}}'
        resourceName: du_fh
    - fileName: SriovNetwork.yaml
      policyName: "config-policy"
      metadata:
        name: "sriov-nw-du-mh"
      spec:
        resourceName: du_mh
        vlan: '{{hub fromConfigMap "ztp-site" "sriovdata" (printf "%s-du_mh-vlan" .ManagedClusterName) | toInt hub}}'
    - fileName: SriovNetworkNodePolicy.yaml
      policyName: "config-policy"
      metadata:
        name: "sriov-nnp-du-mh"
      spec:
        deviceType: vfio-pci
        isRdma: false
        nicSelector:
          pfNames:
            - '{{hub fromConfigMap "ztp-site" "sriovdata" (printf "%s-du_mh-pf" .ManagedClusterName) hub}}'
        numVfs: '{{hub fromConfigMap "ztp-site" "sriovdata" (printf "%s-du_mh-numVfs" .ManagedClusterName) | toInt hub}}'
        priority: '{{hub fromConfigMap "ztp-site" "sriovdata" (printf "%s-du_mh-priority" .ManagedClusterName) | toInt hub}}'
        resourceName: du_mh
Commit the site PolicyGenTemplate
CR in Git and push to the Git repository that is monitored by the ArgoCD application.
Subsequent changes to the referenced ConfigMap CR are not automatically synced to the applied policies. You need to manually sync new ConfigMap content to the existing policies, for example by deleting and recreating the policy or by annotating the policy to trigger an update. |
You can manage VLAN IDs for managed clusters in a single ConfigMap CR and use hub cluster templates to populate the VLAN IDs in the generated policies that get applied to the clusters.
The following example shows you how to manage VLAN IDs in a single ConfigMap CR and apply them in individual cluster policies by using a single PolicyGenTemplate group CR.
You have installed the OpenShift CLI (oc
).
You have logged in to the hub cluster as a user with cluster-admin
privileges.
You have created a Git repository where you manage your custom site configuration data. The repository must be accessible from the hub cluster and be defined as a source repository for the Argo CD application.
Create a ConfigMap
CR that describes the VLAN IDs for a group of cluster hosts. For example:
apiVersion: v1
kind: ConfigMap
metadata:
name: site-data
namespace: ztp-group
annotations:
argocd.argoproj.io/sync-options: Replace=true (1)
data:
site-1-vlan: "101"
site-2-vlan: "234"
1 | The argocd.argoproj.io/sync-options annotation is required only if the ConfigMap is larger than 1 MiB in size. |
Commit the ConfigMap
CR in Git, and then push to the Git repository being monitored by the Argo CD application.
Create a group PGT CR that uses a hub template to pull the required VLAN IDs from the ConfigMap
object. For example, add the following YAML snippet to the group PGT CR:
- fileName: SriovNetwork.yaml
  policyName: "config-policy"
  metadata:
    name: "sriov-nw-du-mh"
    annotations:
      ran.openshift.io/ztp-deploy-wave: "10"
  spec:
    resourceName: du_mh
    vlan: '{{hub fromConfigMap "" "site-data" (printf "%s-vlan" .ManagedClusterName) | toInt hub}}'
Commit the group PolicyGenTemplate
CR in Git, and then push to the Git repository being monitored by the Argo CD application.
Subsequent changes to the referenced ConfigMap CR are not automatically synced to the applied policies. You need to manually sync new ConfigMap content to the existing policies, for example by deleting and recreating the policy or by annotating the policy to trigger an update. |
You have installed the OpenShift CLI (oc
).
You have logged in to the hub cluster as a user with cluster-admin
privileges.
You have created a PolicyGenTemplate
CR that pulls information from a ConfigMap
CR using hub cluster templates.
Update the contents of your ConfigMap
CR, and apply the changes in the hub cluster.
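For example, assuming that the updated ConfigMap CR from the earlier VLAN example is saved locally in a file named site-data.yaml (an illustrative file name), apply it on the hub cluster by running the following command:
$ oc apply -f site-data.yaml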
To sync the contents of the updated ConfigMap
CR to the deployed policy, do either of the following:
Option 1: Delete the existing policy. ArgoCD uses the PolicyGenTemplate
CR to immediately recreate the deleted policy. For example, run the following command:
$ oc delete policy <policy_name> -n <policy_namespace>
Option 2: Apply a special annotation policy.open-cluster-management.io/trigger-update
to the policy with a different value each time that you update the ConfigMap
. For example:
$ oc annotate policy <policy_name> -n <policy_namespace> policy.open-cluster-management.io/trigger-update="1"
You must apply the updated policy for the changes to take effect. For more information, see Special annotation for reprocessing. |
Optional: If it exists, delete the ClusterGroupUpgrade
CR that contains the policy. For example:
$ oc delete clustergroupupgrade <cgu_name> -n <cgu_namespace>
Create a new ClusterGroupUpgrade
CR that includes the policy to apply with the updated ConfigMap
changes. For example, add the following YAML to the file cgr-example.yaml
:
apiVersion: ran.openshift.io/v1alpha1
kind: ClusterGroupUpgrade
metadata:
  name: <cgr_name>
  namespace: <policy_namespace>
spec:
  managedPolicies:
    - <managed_policy>
  enable: true
  clusters:
  - <managed_cluster_1>
  - <managed_cluster_2>
  remediationStrategy:
    maxConcurrency: 2
    timeout: 240
Apply the updated policy:
$ oc apply -f cgr-example.yaml