×

Deploying additional changes to clusters

If you require cluster configuration changes outside of the base GitOps ZTP pipeline configuration, there are three options:

Apply the additional configuration after the ZTP pipeline is complete

When the GitOps ZTP pipeline deployment is complete, the deployed cluster is ready for application workloads. At this point, you can install additional Operators and apply configurations specific to your requirements. Ensure that additional configurations do not negatively affect the performance of the platform or allocated CPU budget.

Add content to the ZTP library

The base source custom resources (CRs) that you deploy with the GitOps ZTP pipeline can be augmented with custom content as required.

Create extra manifests for the cluster installation

Extra manifests are applied during installation and make the installation process more efficient.

Providing additional source CRs or modifying existing source CRs can significantly impact the performance or CPU profile of OKD.

Additional resources

Using PolicyGenTemplate CRs to override source CRs content

PolicyGenTemplate custom resources (CRs) allow you to overlay additional configuration details on top of the base source CRs provided with the GitOps plugin in the ztp-site-generate container. You can think of PolicyGenTemplate CRs as a logical merge or patch to the base CR. Use PolicyGenTemplate CRs to update a single field of the base CR, or overlay the entire contents of the base CR. You can update values and insert fields that are not in the base CR.

The following example procedure describes how to update fields in the generated PerformanceProfile CR for the reference configuration based on the PolicyGenTemplate CR in the group-du-sno-ranGen.yaml file. Use the procedure as a basis for modifying other parts of the PolicyGenTemplate based on your requirements.

Prerequisites
  • Create a Git repository where you manage your custom site configuration data. The repository must be accessible from the hub cluster and be defined as a source repository for Argo CD.

Procedure
  1. Review the baseline source CR for existing content. You can review the source CRs listed in the reference PolicyGenTemplate CRs by extracting them from the zero touch provisioning (ZTP) container.

    1. Create an /out folder:

      $ mkdir -p ./out
    2. Extract the source CRs:

      $ podman run --log-driver=none --rm registry.redhat.io/openshift4/ztp-site-generate-rhel8:v4.11.1 extract /home/ztp --tar | tar x -C ./out
  2. Review the baseline PerformanceProfile CR in ./out/source-crs/PerformanceProfile.yaml:

    apiVersion: performance.openshift.io/v2
    kind: PerformanceProfile
    metadata:
      name: $name
      annotations:
        ran.openshift.io/ztp-deploy-wave: "10"
    spec:
      additionalKernelArgs:
      - "idle=poll"
      - "rcupdate.rcu_normal_after_boot=0"
      cpu:
        isolated: $isolated
        reserved: $reserved
      hugepages:
        defaultHugepagesSize: $defaultHugepagesSize
        pages:
          - size: $size
            count: $count
            node: $node
      machineConfigPoolSelector:
        pools.operator.machineconfiguration.openshift.io/$mcp: ""
      net:
        userLevelNetworking: true
      nodeSelector:
        node-role.kubernetes.io/$mcp: ''
      numa:
        topologyPolicy: "restricted"
      realTimeKernel:
        enabled: true

    Any fields in the source CR which contain $…​ are removed from the generated CR if they are not provided in the PolicyGenTemplate CR.

  3. Update the PolicyGenTemplate entry for PerformanceProfile in the group-du-sno-ranGen.yaml reference file. The following example PolicyGenTemplate CR stanza supplies appropriate CPU specifications, sets the hugepages configuration, and adds a new field that sets globallyDisableIrqLoadBalancing to false.

    - fileName: PerformanceProfile.yaml
      policyName: "config-policy"
      metadata:
        name: openshift-node-performance-profile
      spec:
        cpu:
          # These must be tailored for the specific hardware platform
          isolated: "2-19,22-39"
          reserved: "0-1,20-21"
        hugepages:
          defaultHugepagesSize: 1G
          pages:
            - size: 1G
              count: 10
        globallyDisableIrqLoadBalancing: false
  4. Commit the PolicyGenTemplate change in Git, and then push to the Git repository being monitored by the GitOps ZTP argo CD application.

Example output

The ZTP application generates an RHACM policy that contains the generated PerformanceProfile CR. The contents of that CR are derived by merging the metadata and spec contents from the PerformanceProfile entry in the PolicyGenTemplate onto the source CR. The resulting CR has the following content:

---
apiVersion: performance.openshift.io/v2
kind: PerformanceProfile
metadata:
    name: openshift-node-performance-profile
spec:
    additionalKernelArgs:
        - idle=poll
        - rcupdate.rcu_normal_after_boot=0
    cpu:
        isolated: 2-19,22-39
        reserved: 0-1,20-21
    globallyDisableIrqLoadBalancing: false
    hugepages:
        defaultHugepagesSize: 1G
        pages:
            - count: 10
              size: 1G
    machineConfigPoolSelector:
        pools.operator.machineconfiguration.openshift.io/master: ""
    net:
        userLevelNetworking: true
    nodeSelector:
        node-role.kubernetes.io/master: ""
    numa:
        topologyPolicy: restricted
    realTimeKernel:
        enabled: true

In the /source-crs folder that you extract from the ztp-site-generate container, the $ syntax is not used for template substitution as implied by the syntax. Rather, if the policyGen tool sees the $ prefix for a string and you do not specify a value for that field in the related PolicyGenTemplate CR, the field is omitted from the output CR entirely.

An exception to this is the $mcp variable in /source-crs YAML files that is substituted with the specified value for mcp from the PolicyGenTemplate CR. For example, in example/policygentemplates/group-du-standard-ranGen.yaml, the value for mcp is worker:

spec:
  bindingRules:
    group-du-standard: ""
  mcp: "worker"

The policyGen tool replace instances of $mcp with worker in the output CRs.

Adding new content to the GitOps ZTP pipeline

The source CRs in the GitOps ZTP site generator container provide a set of critical features and node tuning settings for RAN Distributed Unit (DU) applications. These are applied to the clusters that you deploy with ZTP. To add or modify existing source CRs in the ztp-site-generate container, rebuild the ztp-site-generate container and make it available to the hub cluster, typically from the disconnected registry associated with the hub cluster. Any valid OKD CR can be added.

Perform the following procedure to add new content to the ZTP pipeline.

Procedure
  1. Create a directory containing a Containerfile and the source CR YAML files that you want to include in the updated ztp-site-generate container, for example:

    ztp-update/
    ├── example-cr1.yaml
    ├── example-cr2.yaml
    └── ztp-update.in
  2. Add the following content to the ztp-update.in Containerfile:

    FROM registry.redhat.io/openshift4/ztp-site-generate-rhel8:v4.11
    
    ADD example-cr2.yaml /kustomize/plugin/ran.openshift.io/v1/policygentemplate/source-crs/
    ADD example-cr1.yaml /kustomize/plugin/ran.openshift.io/v1/policygentemplate/source-crs/
  3. Open a terminal at the ztp-update/ folder and rebuild the container:

    $ podman build -t ztp-site-generate-rhel8-custom:v4.11-custom-1
  4. Push the built container image to your disconnected registry, for example:

    $ podman push localhost/ztp-site-generate-rhel8-custom:v4.11-custom-1 registry.example.com:5000/ztp-site-generate-rhel8-custom:v4.11-custom-1
  5. Patch the Argo CD instance on the hub cluster to point to the newly built container image:

    $ oc patch -n openshift-gitops argocd openshift-gitops --type=json -p '[{"op": "replace", "path":"/spec/repo/initContainers/0/image", "value": "registry.example.com:5000/ztp-site-generate-rhel8-custom:v4.11-custom-1"} ]'

    When the Argo CD instance is patched, the openshift-gitops-repo-server pod automatically restarts.

Verification
  1. Verify that the new openshift-gitops-repo-server pod has completed initialization and that the previous repo pod is terminated:

    $ oc get pods -n openshift-gitops | grep openshift-gitops-repo-server
    Example output
    openshift-gitops-server-7df86f9774-db682          1/1     Running   	     1          28s

    You must wait until the new openshift-gitops-repo-server pod has completed initialization and the previous pod is terminated before the newly added container image content is available.

Additional resources
  • Alternatively, you can patch the ArgoCD instance as described in Configuring the hub cluster with ArgoCD by modifying argocd-openshift-gitops-patch.json with an updated initContainer image before applying the patch file.

Configuring policy compliance evaluation timeouts for PolicyGenTemplate CRs

Use Red Hat Advanced Cluster Management (RHACM) installed on a hub cluster to monitor and report on whether your managed clusters are compliant with applied policies. RHACM uses policy templates to apply predefined policy controllers and policies. Policy controllers are Kubernetes custom resource definition (CRD) instances.

You can override the default policy evaluation intervals with PolicyGenTemplate custom resources (CRs). You configure duration settings that define how long a ConfigurationPolicy CR can be in a state of policy compliance or non-compliance before RHACM re-evaluates the applied cluster policies.

The zero touch provisioning (ZTP) policy generator generates ConfigurationPolicy CR policies with pre-defined policy evaluation intervals. The default value for the noncompliant state is 10 seconds. The default value for the compliant state is 10 minutes. To disable the evaluation interval, set the value to never.

Prerequisites
  • You have installed the OpenShift CLI (oc).

  • You have logged in to the hub cluster as a user with cluster-admin privileges.

  • You have created a Git repository where you manage your custom site configuration data.

Procedure
  1. To configure the evaluation interval for all policies in a PolicyGenTemplate CR, add evaluationInterval to the spec field, and then set the appropriate compliant and noncompliant values. For example:

    spec:
      evaluationInterval:
        compliant: 30m
        noncompliant: 20s
  2. To configure the evaluation interval for the spec.sourceFiles object in a PolicyGenTemplate CR, add evaluationInterval to the sourceFiles field, for example:

    spec:
      sourceFiles:
       - fileName: SriovSubscription.yaml
         policyName: "sriov-sub-policy"
         evaluationInterval:
           compliant: never
           noncompliant: 10s
  3. Commit the PolicyGenTemplate CRs files in the Git repository and push your changes.

Verification

Check that the managed spoke cluster policies are monitored at the expected intervals.

  1. Log in as a user with cluster-admin privileges on the managed cluster.

  2. Get the pods that are running in the open-cluster-management-agent-addon namespace. Run the following command:

    $ oc get pods -n open-cluster-management-agent-addon
    Example output
    NAME                                         READY   STATUS    RESTARTS        AGE
    config-policy-controller-858b894c68-v4xdb    1/1     Running   22 (5d8h ago)   10d
  3. Check the applied policies are being evaluated at the expected interval in the logs for the config-policy-controller pod:

    $ oc logs -n open-cluster-management-agent-addon config-policy-controller-858b894c68-v4xdb
    Example output
    2022-05-10T15:10:25.280Z       info   configuration-policy-controller controllers/configurationpolicy_controller.go:166      Skipping the policy evaluation due to the policy not reaching the evaluation interval  {"policy": "compute-1-config-policy-config"}
    2022-05-10T15:10:25.280Z       info   configuration-policy-controller controllers/configurationpolicy_controller.go:166      Skipping the policy evaluation due to the policy not reaching the evaluation interval  {"policy": "compute-1-common-compute-1-catalog-policy-config"}

Signalling ZTP cluster deployment completion with validator inform policies

Create a validator inform policy that signals when the zero touch provisioning (ZTP) installation and configuration of the deployed cluster is complete. This policy can be used for deployments of single-node OpenShift clusters, three-node clusters, and standard clusters.

Procedure
  1. Create a standalone PolicyGenTemplate custom resource (CR) that contains the source file validatorCRs/informDuValidator.yaml. You only need one standalone PolicyGenTemplate CR for each cluster type. For example, this CR applies a validator inform policy for single-node OpenShift clusters:

    Example single-node cluster validator inform policy CR (group-du-sno-validator-ranGen.yaml)
    apiVersion: ran.openshift.io/v1
    kind: PolicyGenTemplate
    metadata:
      name: "group-du-sno-validator" (1)
      namespace: "ztp-group" (2)
    spec:
      bindingRules:
        group-du-sno: "" (3)
      bindingExcludedRules:
        ztp-done: "" (4)
      mcp: "master" (5)
      sourceFiles:
        - fileName: validatorCRs/informDuValidator.yaml
          remediationAction: inform (6)
          policyName: "du-policy" (7)
    1 The name of PolicyGenTemplates object. This name is also used as part of the names for the placementBinding, placementRule, and policy that are created in the requested namespace.
    2 This value should match the namespace used in the group PolicyGenTemplates.
    3 The group-du-* label defined in bindingRules must exist in the SiteConfig files.
    4 The label defined in bindingExcludedRules must be`ztp-done:`. The ztp-done label is used in coordination with the Topology Aware Lifecycle Manager.
    5 mcp defines the MachineConfigPool object that is used in the source file validatorCRs/informDuValidator.yaml. It should be master for single node and three-node cluster deployments and worker for standard cluster deployments.
    6 Optional. The default value is inform.
    7 This value is used as part of the name for the generated RHACM policy. The generated validator policy for the single node example is group-du-sno-validator-du-policy.
  2. Commit the PolicyGenTemplate CR file in your Git repository and push the changes.

Additional resources

Configuring PTP fast events using PolicyGenTemplate CRs

You can configure PTP fast events for vRAN clusters that are deployed using the GitOps Zero Touch Provisioning (ZTP) pipeline. Use PolicyGenTemplate custom resources (CRs) as the basis to create a hierarchy of configuration files tailored to your specific site requirements.

Prerequisites
  • Create a Git repository where you manage your custom site configuration data.

Procedure
  1. Add the following YAML into .spec.sourceFiles in the common-ranGen.yaml file to configure the AMQP Operator:

    #AMQ interconnect operator for fast events
    - fileName: AmqSubscriptionNS.yaml
      policyName: "subscriptions-policy"
    - fileName: AmqSubscriptionOperGroup.yaml
      policyName: "subscriptions-policy"
    - fileName: AmqSubscription.yaml
      policyName: "subscriptions-policy"
  2. Apply the following PolicyGenTemplate changes to group-du-3node-ranGen.yaml, group-du-sno-ranGen.yaml, or group-du-standard-ranGen.yaml files according to your requirements:

    1. In .sourceFiles, add the PtpOperatorConfig CR file that configures the AMQ transport host to the config-policy:

      - fileName: PtpOperatorConfigForEvent.yaml
        policyName: "config-policy"
    2. Configure the linuxptp and phc2sys for the PTP clock type and interface. For example, add the following stanza into .sourceFiles:

      - fileName: PtpConfigSlave.yaml (1)
        policyName: "config-policy"
        metadata:
          name: "du-ptp-slave"
        spec:
          profile:
          - name: "slave"
            interface: "ens5f1" (2)
            ptp4lOpts: "-2 -s --summary_interval -4" (3)
            phc2sysOpts: "-a -r -m -n 24 -N 8 -R 16" (4)
          ptpClockThreshold: (5)
            holdOverTimeout: 30 #secs
            maxOffsetThreshold: 100  #nano secs
            minOffsetThreshold: -100 #nano secs
      1 Can be one PtpConfigMaster.yaml, PtpConfigSlave.yaml, or PtpConfigSlaveCvl.yaml depending on your requirements. PtpConfigSlaveCvl.yaml configures linuxptp services for an Intel E810 Columbiaville NIC. For configurations based on group-du-sno-ranGen.yaml or group-du-3node-ranGen.yaml, use PtpConfigSlave.yaml.
      2 Device specific interface name.
      3 You must append the --summary_interval -4 value to ptp4lOpts in .spec.sourceFiles.spec.profile to enable PTP fast events.
      4 Required phc2sysOpts values. -m prints messages to stdout. The linuxptp-daemon DaemonSet parses the logs and generates Prometheus metrics.
      5 Optional. If the ptpClockThreshold stanza is not present, default values are used for the ptpClockThreshold fields. The stanza shows default ptpClockThreshold values. The ptpClockThreshold values configure how long after the PTP master clock is disconnected before PTP events are triggered. holdOverTimeout is the time value in seconds before the PTP clock event state changes to FREERUN when the PTP master clock is disconnected. The maxOffsetThreshold and minOffsetThreshold settings configure offset values in nanoseconds that compare against the values for CLOCK_REALTIME (phc2sys) or master offset (ptp4l). When the ptp4l or phc2sys offset value is outside this range, the PTP clock state is set to FREERUN. When the offset value is within this range, the PTP clock state is set to LOCKED.
  3. Apply the following PolicyGenTemplate changes to your specific site YAML files, for example, example-sno-site.yaml:

    1. In .sourceFiles, add the Interconnect CR file that configures the AMQ router to the config-policy:

      - fileName: AmqInstance.yaml
        policyName: "config-policy"
  4. Merge any other required changes and files with your custom site repository.

  5. Push the changes to your site configuration repository to deploy PTP fast events to new sites using GitOps ZTP.

Additional resources

Configuring bare-metal event monitoring using PolicyGenTemplate CRs

You can configure bare-metal hardware events for vRAN clusters that are deployed using the GitOps Zero Touch Provisioning (ZTP) pipeline.

Prerequisites
  • Install the OpenShift CLI (oc).

  • Log in as a user with cluster-admin privileges.

  • Create a Git repository where you manage your custom site configuration data.

Procedure
  1. To configure the AMQ Interconnect Operator and the Bare Metal Event Relay Operator, add the following YAML to spec.sourceFiles in the common-ranGen.yaml file:

    # AMQ interconnect operator for fast events
    - fileName: AmqSubscriptionNS.yaml
      policyName: "subscriptions-policy"
    - fileName: AmqSubscriptionOperGroup.yaml
      policyName: "subscriptions-policy"
    - fileName: AmqSubscription.yaml
      policyName: "subscriptions-policy"
    # Bare Metal Event Rely operator
    - fileName: BareMetalEventRelaySubscriptionNS.yaml
      policyName: "subscriptions-policy"
    - fileName: BareMetalEventRelaySubscriptionOperGroup.yaml
      policyName: "subscriptions-policy"
    - fileName: BareMetalEventRelaySubscription.yaml
      policyName: "subscriptions-policy"
  2. Add the Interconnect CR to .spec.sourceFiles in the site configuration file, for example, the example-sno-site.yaml file:

    - fileName: AmqInstance.yaml
      policyName: "config-policy"
  3. Add the HardwareEvent CR to spec.sourceFiles in your specific group configuration file, for example, in the group-du-sno-ranGen.yaml file:

    - fileName: HardwareEvent.yaml
      policyName: "config-policy"
      spec:
        nodeSelector: {}
        transportHost: "amqp://<amq_interconnect_name>.<amq_interconnect_namespace>.svc.cluster.local" (1)
        logLevel: "info"
    1 The transportHost URL is composed of the existing AMQ Interconnect CR name and namespace. For example, in transportHost: "amqp://amq-router.amq-router.svc.cluster.local", the AMQ Interconnect name and namespace are both set to amq-router.

    Each baseboard management controller (BMC) requires a single HardwareEvent resource only.

  4. Commit the PolicyGenTemplate change in Git, and then push the changes to your site configuration repository to deploy bare-metal events monitoring to new sites using GitOps ZTP.

  5. Create the Redfish Secret by running the following command:

    $ oc -n openshift-bare-metal-events create secret generic redfish-basic-auth \
    --from-literal=username=<bmc_username> --from-literal=password=<bmc_password> \
    --from-literal=hostaddr="<bmc_host_ip_addr>"
Additional resources
Additional resources