Understanding Intel hardware accelerator cards for OKD

Hardware accelerator cards from Intel accelerate 4G/LTE and 5G Virtualized Radio Access Networks (vRAN) workloads. This in turn increases the overall compute capacity of a commercial, off-the-shelf platform.

Intel FPGA PAC N3000

The Intel FPGA PAC N3000 is a reference FPGA and uses 4G/LTE or 5G forward error correction (FEC) as an example workload that accelerates the 5G or 4G/LTE RAN layer 1 (L1) base station network function. Flash the Intel FPGA PAC N3000 card with 4G/LTE or 5G bitstreams to support vRAN workloads.

The Intel FPGA PAC N3000 is a full-duplex, 100 Gbps, in-system re-programmable acceleration card for multi-workload networking application acceleration.

When the Intel FPGA PAC N3000 is programmed with a 4G/LTE or 5G bitstream, it exposes the Single Root I/O Virtualization (SR-IOV) virtual function (VF) devices used to accelerate the FEC in the vRAN workload. To take advantage of this functionality for a cloud-native deployment, the physical function (PF) of the device must be bound to the pci-pf-stub driver to create several VFs. After the VFs are created, they must be bound to a DPDK userspace driver (vfio-pci) so that they can be allocated to specific pods running the vRAN workload.

Intel FPGA PAC N3000 support on OKD depends on two Operators:

  • OpenNESS Operator for Intel FPGA PAC N3000 (Programming)

  • OpenNESS Operator for Wireless FEC Accelerators

vRAN Dedicated Accelerator ACC100

The vRAN Dedicated Accelerator ACC100, based on Intel's eASIC technology, is designed to offload and accelerate the computing-intensive process of forward error correction for 4G/LTE and 5G technology, freeing up processing power. Intel eASIC devices are structured ASICs, an intermediate technology between FPGAs and standard application-specific integrated circuits (ASICs).

Intel vRAN Dedicated Accelerator ACC100 support on OKD uses one Operator:

  • OpenNESS Operator for Wireless FEC Accelerators

Installing the OpenNESS Operator for Intel FPGA PAC N3000

The OpenNESS Operator for Intel FPGA PAC N3000 orchestrates and manages the resources or devices exposed by the Intel FPGA PAC N3000 card within the OKD cluster.

For vRAN use cases, the OpenNESS Operator for Intel FPGA PAC N3000 is used with the OpenNESS Operator for Wireless FEC Accelerators.

As a cluster administrator, you can install the OpenNESS Operator for Intel FPGA PAC N3000 by using the OKD CLI or the web console.

Installing the Operator by using the CLI

As a cluster administrator, you can install the Operator by using the CLI.

Prerequisites
  • A cluster installed on bare-metal hardware.

  • Install the OpenShift CLI (oc).

  • Log in as a user with cluster-admin privileges.

Procedure
  1. Create a namespace for the N3000 Operator by completing the following actions:

    1. Define the vran-acceleration-operators namespace by creating a file named n3000-namespace.yaml as shown in the following example:

      apiVersion: v1
      kind: Namespace
      metadata:
          name: vran-acceleration-operators
          labels:
              openshift.io/cluster-monitoring: "true"
    2. Create the namespace by running the following command:

      $ oc create -f n3000-namespace.yaml
  2. Install the N3000 Operator in the namespace you created in the previous step:

    1. Create the following OperatorGroup CR and save the YAML in the n3000-operatorgroup.yaml file:

      apiVersion: operators.coreos.com/v1
      kind: OperatorGroup
      metadata:
          name: n3000-operators
          namespace: vran-acceleration-operators
      spec:
          targetNamespaces:
          - vran-acceleration-operators
    2. Create the OperatorGroup CR by running the following command:

      $ oc create -f n3000-operatorgroup.yaml
    3. Run the following command to get the channel value required for the next step:

      $ oc get packagemanifest n3000 -n openshift-marketplace -o jsonpath='{.status.defaultChannel}'
      Example output

      stable
    4. Create the following Subscription CR and save the YAML in the n3000-sub.yaml file:

      apiVersion: operators.coreos.com/v1alpha1
      kind: Subscription
      metadata:
          name: n3000-subscription
          namespace: vran-acceleration-operators
      spec:
          channel: "<channel>"        (1)
          name: n3000
          source: certified-operators (2)
          sourceNamespace: openshift-marketplace
      1 Specify the channel value obtained in the previous step from the .status.defaultChannel parameter.
      2 You must specify the certified-operators value.
    5. Create the Subscription CR by running the following command:

      $ oc create -f n3000-sub.yaml
Verification
  • Verify the Operator is installed:

    $ oc get csv
    Example output

    NAME            DISPLAY                                         VERSION  REPLACES    PHASE
    n3000.v1.1.0    OpenNESS Operator for Intel® FPGA PAC N3000      1.1.0                Succeeded

    You have now successfully installed the Operator.
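
Steps 2.3 to 2.5 of the CLI procedure above can be chained so that the channel value is never copied by hand. The following is a minimal sketch, not part of the documented procedure: the render_subscription function name and its arguments are assumptions introduced for illustration, while the field values mirror the Subscription CR shown earlier.

```shell
#!/bin/sh
# Hypothetical helper: print a Subscription CR for a package and channel.
# Arguments: $1 = subscription name, $2 = package name, $3 = channel.
render_subscription() {
    cat <<EOF
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
    name: $1
    namespace: vran-acceleration-operators
spec:
    channel: "$3"
    name: $2
    source: certified-operators
    sourceNamespace: openshift-marketplace
EOF
}

# Example use on a live cluster (requires oc and cluster-admin privileges):
#   channel=$(oc get packagemanifest n3000 -n openshift-marketplace \
#       -o jsonpath='{.status.defaultChannel}')
#   render_subscription n3000-subscription n3000 "$channel" | oc create -f -
```

Keeping the rendering in a function separates the cluster query from the YAML generation, so the generated CR can be inspected before it is piped to oc.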

Installing the OpenNESS Operator for Intel FPGA PAC N3000 by using the web console

As a cluster administrator, you can install the OpenNESS Operator for Intel FPGA PAC N3000 by using the web console.

You must create the Namespace and OperatorGroup CR as mentioned in the previous section.

Procedure
  1. Install the OpenNESS Operator for Intel FPGA PAC N3000 by using the OKD web console:

    1. In the OKD web console, click Operators → OperatorHub.

    2. Choose OpenNESS Operator for Intel FPGA PAC N3000 from the list of available Operators, and then click Install.

    3. On the Install Operator page, select All namespaces on the cluster. Then, click Install.

  2. Optional: Verify that the N3000 Operator is installed successfully:

    1. Switch to the Operators → Installed Operators page.

    2. Ensure that OpenNESS Operator for Intel FPGA PAC N3000 is listed in the vran-acceleration-operators project with a Status of InstallSucceeded.

      During installation, an Operator might display a Failed status. If the installation later succeeds with an InstallSucceeded message, you can ignore the Failed message.

      If the console does not indicate that the Operator is installed, perform the following troubleshooting steps:

      • Go to the Operators → Installed Operators page and inspect the Operator Subscriptions and Install Plans tabs for any failure or errors under Status.

      • Go to the Workloads → Pods page and check the logs for pods in the vran-acceleration-operators project.

Programming the OpenNESS Operator for Intel FPGA PAC N3000

When the Intel FPGA PAC N3000 is programmed with a vRAN 5G bitstream, the card exposes the Single Root I/O Virtualization (SR-IOV) virtual function (VF) devices used to accelerate the FEC in the vRAN workload.

Programming the N3000 with a vRAN bitstream

As a cluster administrator, you can program the Intel FPGA PAC N3000 with a vRAN 5G bitstream. This bitstream exposes the Single Root I/O Virtualization (SR-IOV) virtual function (VF) devices that are used to accelerate the forward error correction (FEC) in the vRAN workload.

The role of forward error correction (FEC) is to correct transmission errors, where certain bits in a message can be lost or garbled. Messages can be lost or garbled due to noise in the transmission media, interference, or low signal strength. Without FEC, a garbled message would have to be resent, adding to the network load and impacting both throughput and latency.

Prerequisites
  • Intel FPGA PAC N3000 card

  • Performance Addon Operator with RT kernel configuration

  • Node or nodes installed with the OpenNESS Operator for Intel FPGA PAC N3000

  • Log in as a user with cluster-admin privileges

    All the commands run in the vran-acceleration-operators namespace.

Procedure
  1. Change to the vran-acceleration-operators project:

    $ oc project vran-acceleration-operators
  2. Verify that the pods are running:

    $ oc get pods
    Example output

    NAME                                        READY       STATUS          RESTARTS    AGE
    fpga-driver-daemonset-8xz4c                 1/1         Running         0           15d
    fpgainfo-exporter-vhvdq                     1/1         Running         1           15d
    n3000-controller-manager-b68475c76-gcc6v    2/2         Running         1           15d
    n3000-daemonset-5k55l                       1/1         Running         1           15d
    n3000-discovery-blmjl                       1/1         Running         1           15d
    n3000-discovery-lblh7                       1/1         Running         1           15d

    The following list describes the installed pods:

    • fpga-driver-daemonset provides and loads the required Open Programmable Acceleration Engine (OPAE) drivers

    • fpgainfo-exporter provides N3000 telemetry data for Prometheus

    • n3000-controller-manager applies N3000Node CRs to the cluster and manages all the operand containers

    • n3000-daemonset is the main worker application. It monitors the changes in each node's CR and acts on the changes. The logic implemented in this daemon takes care of updating the cards' FPGA user image and NIC firmware. It is also responsible for draining the nodes and taking them out of commission when required by the update.

    • n3000-discovery discovers installed N3000 Accelerator devices and labels worker nodes if devices are present

  3. Get all the nodes containing the Intel FPGA PAC N3000 card:

    $ oc get n3000node
    Example output

    NAME                       FLASH
    node1                      NotRequested
  4. Get information about the card on each node:

    $ oc get n3000node node1 -o yaml
    Example output

    status:
      conditions:
      - lastTransitionTime: "2020-12-15T17:09:26Z"
        message: Inventory up to date
        observedGeneration: 1
        reason: NotRequested
        status: "False"
        type: Flashed
      fortville:
      - N3000PCI: 0000:1b:00.0
        NICs:
        - MAC: 64:4c:36:11:1b:a8
          NVMVersion: 7.00 0x800052b0 0.0.0
          PCIAddr: 0000:1a:00.0
          name: Ethernet Controller XXV710 Intel(R) FPGA Programmable Acceleration Card N3000 for Networking
        - MAC: 64:4c:36:11:1b:a9
          NVMVersion: 7.00 0x800052b0 0.0.0
          PCIAddr: 0000:1a:00.1
          name: Ethernet Controller XXV710 Intel(R) FPGA Programmable Acceleration Card N3000 for Networking
        - MAC: 64:4c:36:11:1b:ac
          NVMVersion: 7.00 0x800052b0 0.0.0
          PCIAddr: 0000:1c:00.0
          name: Ethernet Controller XXV710 Intel(R) FPGA Programmable Acceleration Card N3000 for Networking
        - MAC: 64:4c:36:11:1b:ad
          NVMVersion: 7.00 0x800052b0 0.0.0
          PCIAddr: 0000:1c:00.1
          name: Ethernet Controller XXV710 Intel(R) FPGA Programmable Acceleration Card N3000 for Networking
      fpga:
      - PCIAddr: 0000:1b:00.0 (1)
        bitstreamId: "0x23000410010310" (2)
        bitstreamVersion: 0.2.3
        deviceId: "0x0b30"
    1 The PCIAddr field indicates the PCI address of the card.
    2 The bitstreamId field indicates the bitstream that is currently stored in flash.
  5. Save the current bitstreamId, the PCIAddr, the name, and the deviceId without the "0x" prefix:

    $ oc get n3000node -o json
  6. Update the user bitstream of the Intel FPGA PAC N3000 card:

    1. Define the N3000 cluster resource to program by creating a file named n3000-cluster.yaml as shown in the following example:

      apiVersion: fpga.intel.com/v1
      kind: N3000Cluster
      metadata:
          name: n3000 (1)
          namespace: vran-acceleration-operators
      spec:
          nodes:
            - nodeName: "node1" (2)
              fpga:
                - userImageURL: "http://10.10.10.122:8000/pkg/20ww27.5-2x2x25G-5GLDPC-v1.6.1-3.0.0_unsigned.bin" (3)
                  PCIAddr: "0000:1b:00.0" (4)
                  checksum: "0b0a87b974d35ea16023ceb57f7d5d9c" (5)
      1 Specify the name. The name must be n3000.
      2 Specify the node to program.
      3 Specify the URL for the user bitstream. This bitstream file must be accessible on an HTTP or HTTPS server.
      4 Specify the PCI address of the card to program.
      5 Specify the MD5 checksum of the bitstream that is specified in the userImageURL field.

      The N3000 daemon updates the FPGA user bitstream using the Open Programmable Acceleration Engine (OPAE) tools and resets the PCI device. The update of the FPGA user bitstream can require up to 40 minutes per card. For programming cards on multiple nodes, the programming happens one node at a time.

    2. Apply the update to begin programming the card with the bitstream:

      $ oc apply -f n3000-cluster.yaml

      The N3000 daemon starts programming the bitstream after the appropriate 5G FEC user bitstream has been provisioned, such as 20ww27.5-2x2x25G-5GLDPC-v1.6.1-3.0.0_unsigned.bin in this example, and after the CR has been created.

    3. Check the status:

      $ oc get n3000node
      Example output

      NAME             FLASH
      node1            InProgress
  7. Check the logs:

    1. Determine the pod name of the N3000 daemon:

      $ oc get pod -o wide | grep n3000-daemonset | grep node1
      Example output

      n3000-daemonset-5k55l              1/1     Running   0          15d
    2. View the logs:

      $ oc logs n3000-daemonset-5k55l
      Example output

      ...
      {"level":"info","ts":1608054338.8866854,"logger":"daemon.drainhelper.cordonAndDrain()","msg":"node drained"}
      {"level":"info","ts":1608054338.8867319,"logger":"daemon.drainhelper.Run()","msg":"worker function - start"}
      {"level":"info","ts":1608054338.9003832,"logger":"daemon.fpgaManager.ProgramFPGAs","msg":"Start program","PCIAddr":"0000:1b:00.0"}
      {"level":"info","ts":1608054338.9004142,"logger":"daemon.fpgaManager.ProgramFPGA","msg":"Starting","pci":"0000:1b:00.0"}
      {"level":"info","ts":1608056309.9367146,"logger":"daemon.fpgaManager.ProgramFPGA","msg":"Program FPGA completed, start new power cycle N3000 ...","pci":"0000:1b:00.0"}
      {"level":"info","ts":1608056333.3528838,"logger":"daemon.drainhelper.Run()","msg":"worker function - end","performUncordon":true}
      ...

      The log file indicates the following flow of events:

      • The bitstream is downloaded and validated.

      • The node is drained and no workload is able to run during this time.

      • Flashing is started:

        • The bitstream is flashed into the card.

        • The bitstream is applied.

      • After flashing is complete, the PCI device or devices on the node or nodes are reloaded. The OpenNESS SR-IOV Operator for Wireless FEC Accelerators is now able to find the newly flashed device or devices.
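
The ts fields in the log output are Unix epoch timestamps in seconds, so the flashing time can be estimated by subtracting the timestamp of the "Starting" entry from that of the "Program FPGA completed" entry. The sketch below does this arithmetic with awk; the elapsed_minutes function name is an assumption introduced for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the minutes elapsed between two epoch timestamps.
# Arguments: $1 = start timestamp, $2 = end timestamp (seconds, may be fractional).
elapsed_minutes() {
    awk -v t0="$1" -v t1="$2" 'BEGIN { printf "%.1f\n", (t1 - t0) / 60 }'
}

# Timestamps taken from the example log output above:
#   elapsed_minutes 1608054338.9004142 1608056309.9367146
```

For the example log, this gives about 32.9 minutes, which is consistent with the statement that an update can require up to 40 minutes per card.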

Verification
  1. Verify the status after the FPGA user bitstream update is complete:

    $ oc get n3000node
    Example output

    NAME             FLASH
    node1            Succeeded
  2. Verify that the bitstream ID of the card has changed:

    $ oc get n3000node node1 -o yaml
    Example output

    status:
          conditions:
              - lastTransitionTime: "2020-12-15T18:18:53Z"
                message: Flashed successfully (1)
                observedGeneration: 2
                reason: Succeeded
                status: "True"
                type: Flashed
          fortville:
          - N3000PCI: 0000:1b:00.0
            NICs:
            - MAC: 64:4c:36:11:1b:a8
              NVMVersion: 7.00 0x800052b0 0.0.0
              PCIAddr: 0000:1a:00.0
              name: Ethernet Controller XXV710 Intel(R) FPGA Programmable Acceleration Card N3000 for Networking
            - MAC: 64:4c:36:11:1b:a9
              NVMVersion: 7.00 0x800052b0 0.0.0
              PCIAddr: 0000:1a:00.1
              name: Ethernet Controller XXV710 Intel(R) FPGA Programmable Acceleration Card N3000 for Networking
            - MAC: 64:4c:36:11:1b:ac
              NVMVersion: 7.00 0x800052b0 0.0.0
              PCIAddr: 0000:1c:00.0
              name: Ethernet Controller XXV710 Intel(R) FPGA Programmable Acceleration Card N3000 for Networking
            - MAC: 64:4c:36:11:1b:ad
              NVMVersion: 7.00 0x800052b0 0.0.0
              PCIAddr: 0000:1c:00.1
              name: Ethernet Controller XXV710 Intel(R) FPGA Programmable Acceleration Card N3000 for Networking
          fpga:
          - PCIAddr: 0000:1b:00.0 (2)
            bitstreamId: "0x2315842A010601" (3)
            bitstreamVersion: 0.2.3
            deviceId: "0x0b30" (4)
    1 The message field indicates the device is successfully flashed.
    2 The PCIAddr field indicates the PCI address of the card.
    3 The bitstreamId field indicates the updated bitstream ID.
    4 The deviceId field indicates the device ID of the bitstream inside the card that is exposed to the system.
  3. Check the FEC PCI devices on the node:

    1. Verify the node configuration is applied correctly:

      $ oc debug node/node1
      Expected output

      Starting pod/<node-name>-debug ...
      To use host binaries, run `chroot /host`
      
      Pod IP: <ip-address>
      If you don't see a command prompt, try pressing enter.
      
      sh-4.4#
    2. Verify that you can use the node file system:

      sh-4.4# chroot /host
      Expected output

      sh-4.4#
    3. List the PCI devices associated with the accelerator on your system:

      sh-4.4# lspci | grep accelerators
      Expected output

      1b:00.0 Processing accelerators: Intel Corporation Device 0b30
      1d:00.0 Processing accelerators: Intel Corporation Device 0d8f (rev 01)

      Devices belonging to the FPGA are reported in the output. Device ID 0b30 is the RSU interface used to program the card, and device ID 0d8f is the physical function of the newly programmed 5G device.
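
The device IDs in the lspci output can be mapped to their roles with a small filter. This is an illustrative sketch only: the describe_fec_devices function name is hypothetical, and the mapping covers just the two IDs described above (0b30 and 0d8f).

```shell
#!/bin/sh
# Hypothetical helper: read lspci output on stdin and label the
# accelerator device IDs described in this section.
describe_fec_devices() {
    awk '/Processing accelerators/ {
        role = "unknown"
        if ($0 ~ /Device 0b30/) role = "RSU interface (used to program the card)"
        if ($0 ~ /Device 0d8f/) role = "5G FEC physical function"
        print $1 ": " role
    }'
}

# Example use inside the node debug shell:
#   lspci | describe_fec_devices
```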

Installing the OpenNESS SR-IOV Operator for Wireless FEC Accelerators

The role of the OpenNESS SR-IOV Operator for Wireless FEC Accelerators is to orchestrate and manage the devices exposed by a range of Intel vRAN FEC acceleration hardware within the OKD cluster.

One of the most compute-intensive 4G/LTE and 5G workloads is RAN layer 1 (L1) forward error correction (FEC). FEC resolves data transmission errors over unreliable or noisy communication channels. FEC technology detects and corrects a limited number of errors in 4G/LTE or 5G data without the need for retransmission.

The FEC devices are provided by the Intel FPGA PAC N3000 and the Intel vRAN Dedicated Accelerator ACC100 for the vRAN use case.

The Intel FPGA PAC N3000 requires flashing with a 4G/LTE or 5G bitstream.

The OpenNESS SR-IOV Operator for Wireless FEC Accelerators creates virtual functions (VFs) for the FEC device, binds them to the appropriate drivers, and configures the VF queues for use in a 4G/LTE or 5G deployment.

As a cluster administrator, you can install the OpenNESS SR-IOV Operator for Wireless FEC Accelerators by using the OKD CLI or the web console.

Installing the OpenNESS SR-IOV Operator for Wireless FEC Accelerators by using the CLI

As a cluster administrator, you can install the OpenNESS SR-IOV Operator for Wireless FEC Accelerators by using the CLI.

Prerequisites
  • A cluster installed on bare-metal hardware.

  • Install the OpenShift CLI (oc).

  • Log in as a user with cluster-admin privileges.

Procedure
  1. Create a namespace for the OpenNESS SR-IOV Operator for Wireless FEC Accelerators by completing the following actions:

    1. Define the vran-acceleration-operators namespace by creating a file named sriov-namespace.yaml as shown in the following example:

      apiVersion: v1
      kind: Namespace
      metadata:
          name: vran-acceleration-operators
          labels:
             openshift.io/cluster-monitoring: "true"
    2. Create the namespace by running the following command:

      $ oc create -f sriov-namespace.yaml
  2. Install the OpenNESS SR-IOV Operator for Wireless FEC Accelerators in the namespace you created in the previous step by creating the following objects:

    1. Create the following OperatorGroup CR and save the YAML in the sriov-operatorgroup.yaml file:

      apiVersion: operators.coreos.com/v1
      kind: OperatorGroup
      metadata:
          name: vran-operators
          namespace: vran-acceleration-operators
      spec:
          targetNamespaces:
            - vran-acceleration-operators
    2. Create the OperatorGroup CR by running the following command:

      $ oc create -f sriov-operatorgroup.yaml
    3. Run the following command to get the channel value required for the next step:

      $ oc get packagemanifest sriov-fec -n openshift-marketplace -o jsonpath='{.status.defaultChannel}'
      Example output

      stable
    4. Create the following Subscription CR and save the YAML in the sriov-sub.yaml file:

      apiVersion: operators.coreos.com/v1alpha1
      kind: Subscription
      metadata:
          name: sriov-fec-subscription
          namespace: vran-acceleration-operators
      spec:
          channel: "<channel>" (1)
          name: sriov-fec
          source: certified-operators (2)
          sourceNamespace: openshift-marketplace
      1 Specify the channel value obtained in the previous step from the .status.defaultChannel parameter.
      2 You must specify the certified-operators value.
    5. Create the Subscription CR by running the following command:

      $ oc create -f sriov-sub.yaml
Verification
  • Verify that the Operator is installed:

    $ oc get csv -n vran-acceleration-operators -o custom-columns=Name:.metadata.name,Phase:.status.phase
    Example output

    Name                                        Phase
    sriov-fec.v1.1.0                            Succeeded
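
When the installation is scripted, the verification step can be turned into a pass or fail check. The sketch below reads the two-column output of the oc get csv command shown above on standard input and fails if any CSV is not in the Succeeded phase. The all_csvs_succeeded function name is an assumption introduced for illustration.

```shell
#!/bin/sh
# Hypothetical helper: exit 0 only if every CSV listed on stdin has the
# phase "Succeeded". Expects the two-column "Name Phase" table, header included.
all_csvs_succeeded() {
    awk 'NR > 1 && NF >= 2 {
             seen = 1                          # at least one CSV row was found
             if ($2 != "Succeeded") bad = 1    # remember any other phase
         }
         END { if (seen && !bad) exit 0; exit 1 }'
}

# Example use on a live cluster:
#   oc get csv -n vran-acceleration-operators \
#       -o custom-columns=Name:.metadata.name,Phase:.status.phase | all_csvs_succeeded
```

An empty table is treated as a failure, so a missing Subscription is not mistaken for a successful installation.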

Installing the OpenNESS SR-IOV Operator for Wireless FEC Accelerators by using the web console

As a cluster administrator, you can install the OpenNESS SR-IOV Operator for Wireless FEC Accelerators by using the web console.

You must create the Namespace and OperatorGroup CR as mentioned in the previous section.

Procedure
  1. Install the OpenNESS SR-IOV Operator for Wireless FEC Accelerators by using the OKD web console:

    1. In the OKD web console, click Operators → OperatorHub.

    2. Choose OpenNESS SR-IOV Operator for Wireless FEC Accelerators from the list of available Operators, and then click Install.

    3. On the Install Operator page, select All namespaces on the cluster. Then, click Install.

  2. Optional: Verify that the SR-IOV-FEC Operator is installed successfully:

    1. Switch to the Operators → Installed Operators page.

    2. Ensure that OpenNESS SR-IOV Operator for Wireless FEC Accelerators is listed in the vran-acceleration-operators project with a Status of InstallSucceeded.

      During installation, an Operator might display a Failed status. If the installation later succeeds with an InstallSucceeded message, you can ignore the Failed message.

      If the console does not indicate that the Operator is installed, perform the following troubleshooting steps:

      • Go to the Operators → Installed Operators page and inspect the Operator Subscriptions and Install Plans tabs for any failure or errors under Status.

      • Go to the Workloads → Pods page and check the logs for pods in the vran-acceleration-operators project.

Configuring the SR-IOV-FEC Operator for Intel FPGA PAC N3000

This section describes how to configure the SR-IOV-FEC Operator for Intel FPGA PAC N3000. The SR-IOV-FEC Operator handles the management of the forward error correction (FEC) devices that are used to accelerate the FEC process in vRAN L1 applications.

Configuring the SR-IOV-FEC Operator involves:

  • Creating the desired virtual functions (VFs) for the FEC device

  • Binding the VFs to the appropriate drivers

  • Configuring the VF queues for desired functionality in a 4G or 5G deployment

The role of forward error correction (FEC) is to correct transmission errors, where certain bits in a message can be lost or garbled. Messages can be lost or garbled due to noise in the transmission media, interference, or low signal strength. Without FEC, a garbled message would have to be resent, adding to the network load and impacting throughput and latency.

Prerequisites
  • Intel FPGA PAC N3000 card

  • Node or nodes installed with the OpenNESS Operator for Intel FPGA PAC N3000 (Programming)

  • Node or nodes installed with the OpenNESS Operator for Wireless FEC Accelerators

  • RT kernel configured with Performance Addon Operator

Procedure
  1. Change to the vran-acceleration-operators project:

    $ oc project vran-acceleration-operators
  2. Verify that the SR-IOV-FEC Operator is installed:

    $ oc get csv -o custom-columns=Name:.metadata.name,Phase:.status.phase
    Example output

    Name                                        Phase
    sriov-fec.v1.1.0                            Succeeded
    n3000.v1.1.0                                Succeeded
  3. Verify that the N3000 and sriov-fec pods are running:

    $ oc get pods
    Example output

    NAME                                            READY       STATUS      RESTARTS    AGE
    fpga-driver-daemonset-8xz4c                     1/1         Running     0           15d
    fpgainfo-exporter-vhvdq                         1/1         Running     1           15d
    n3000-controller-manager-b68475c76-gcc6v        2/2         Running     1           15d
    n3000-daemonset-5k55l                           1/1         Running     1           15d
    n3000-discovery-blmjl                           1/1         Running     1           15d
    n3000-discovery-lblh7                           1/1         Running     1           15d
    sriov-device-plugin-j5jlv                       1/1         Running     1           15d
    sriov-fec-controller-manager-85b6b8f4d4-gd2qg   1/1         Running     1           15d
    sriov-fec-daemonset-kqqs6                       1/1         Running     1           15d

    The following list describes the installed pods:

    • fpga-driver-daemonset provides and loads the required Open Programmable Acceleration Engine (OPAE) drivers

    • fpgainfo-exporter provides N3000 telemetry data for Prometheus

    • n3000-controller-manager applies N3000Node CRs to the cluster and manages all the operand containers

    • n3000-daemonset is the main worker application

    • n3000-discovery discovers installed N3000 Accelerator devices and labels worker nodes if devices are present

    • sriov-device-plugin exposes the FEC virtual functions as resources under the node

    • sriov-fec-controller-manager applies the CR to the node and maintains the operand containers

    • sriov-fec-daemonset is responsible for:

      • Discovering the SR-IOV NICs on each node.

      • Syncing the status of the custom resource (CR) defined in step 6.

      • Taking the spec of the CR as input and configuring the discovered NICs.

  4. Retrieve all the nodes containing one of the supported vRAN FEC accelerator devices:

    $ oc get sriovfecnodeconfig
    Example output

    NAME             CONFIGURED
    node1            Succeeded
  5. Find the physical function (PF) of the SR-IOV FEC accelerator device to configure:

    $ oc get sriovfecnodeconfig node1 -o yaml
    Example output

    status:
        conditions:
        - lastTransitionTime: "2021-03-19T17:19:37Z"
          message: Configured successfully
          observedGeneration: 1
          reason: ConfigurationSucceeded
          status: "True"
          type: Configured
        inventory:
            sriovAccelerators:
            - deviceID: 0d8f
              driver: ""
              maxVirtualFunctions: 8
              pciAddress: 0000:1d:00.0 (1)
              vendorID: "8086"
              virtualFunctions: [] (2)
    1 This field indicates the PCI Address of the card.
    2 This field shows that the virtual functions are empty.
  6. Configure the FEC device with the desired setting.

    1. Create the following custom resource (CR) and save the YAML in the sriovfec_n3000_cr.yaml file:

      apiVersion: sriovfec.intel.com/v1
      kind: SriovFecClusterConfig
      metadata:
          name: config
          namespace: vran-acceleration-operators
      spec:
        nodes:
          - nodeName: node1 (1)
            physicalFunctions:
              - pciAddress: 0000:1d:00.0 (2)
                pfDriver: pci-pf-stub
                vfDriver: vfio-pci
                vfAmount: 2 (3)
                bbDevConfig:
                  n3000:
                    # Network Type: either "FPGA_5GNR" or "FPGA_LTE"
                    networkType: "FPGA_5GNR"
                    pfMode: false
                    flrTimeout: 610
                    downlink:
                      bandwidth: 3
                      loadBalance: 128
                      queues: (4)
                        vf0: 16
                        vf1: 16
                        vf2: 0
                        vf3: 0
                        vf4: 0
                        vf5: 0
                        vf6: 0
                        vf7: 0
                    uplink:
                      bandwidth: 3
                      loadBalance: 128
                      queues: (5)
                        vf0: 16
                        vf1: 16
                        vf2: 0
                        vf3: 0
                        vf4: 0
                        vf5: 0
                        vf6: 0
                        vf7: 0
      1 Specify the node name.
      2 Specify the PCI address of the FEC physical function (PF) that the SR-IOV-FEC Operator configures.
      3 Specify the number of virtual functions. Create two virtual functions.
      4 Specify the number of downlink queues for each VF. In this example, vf0 and vf1 each get 16 downlink queues.
      5 Specify the number of uplink queues for each VF. In this example, vf0 and vf1 each get 16 uplink queues.

      For the Intel FPGA PAC N3000 for vRAN acceleration, you can create up to 8 VF devices. Each FEC PF device provides a total of 64 queues to configure: 32 queues for uplink and 32 queues for downlink. The queues are typically distributed evenly across the VFs.

    2. Apply the CR:

      $ oc apply -f sriovfec_n3000_cr.yaml

      After applying the CR, the SR-IOV FEC daemon starts configuring the FEC device.
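
The queue values in the example CR follow from dividing the 32 queues available per direction evenly across the VFs that are created. A minimal sketch of that arithmetic follows, assuming that an even split is wanted. The print_vf_queues function name is hypothetical, and the output is formatted to resemble the queues block of the CR.

```shell
#!/bin/sh
# Hypothetical helper: split the 32 queues available per direction evenly
# across the requested number of VFs and print one line per VF slot (vf0..vf7).
# Argument: $1 = number of VFs to create (1 to 8).
print_vf_queues() {
    case "$1" in
        [1-8]) ;;
        *) echo "vfAmount must be between 1 and 8" >&2; return 1 ;;
    esac
    awk -v n="$1" 'BEGIN {
        per_vf = int(32 / n)    # any remainder is left unassigned
        for (i = 0; i < 8; i++)
            printf "vf%d: %d\n", i, (i < n ? per_vf : 0)
    }'
}

# Example: the two-VF layout used in the CR above.
#   print_vf_queues 2
```

With an argument of 2, the function prints 16 for vf0 and vf1 and 0 for the remaining slots, which matches the downlink and uplink queue blocks in the example CR.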

Verification
  1. Check the status:

    $ oc get sriovfecclusterconfig config -o yaml
    Example output

    status:
        conditions:
        - lastTransitionTime: "2020-12-15T17:19:37Z"
          message: Configured successfully
          observedGeneration: 1
          reason: ConfigurationSucceeded
          status: "True"
          type: Configured
        inventory:
          sriovAccelerators:
          - deviceID: 0d8f
            driver: pci-pf-stub
            maxVirtualFunctions: 8
            pciAddress: 0000:1d:00.0
            vendorID: "8086"
            virtualFunctions:
            - deviceID: 0d90
              driver: vfio-pci
              pciAddress: 0000:1d:00.1
            - deviceID: 0d90
              driver: vfio-pci
              pciAddress: 0000:1d:00.2
  2. Check the logs:

    1. Determine the name of the SR-IOV daemon pod:

      $ oc get pod | grep sriov-fec-daemonset
      Example output

      sriov-fec-daemonset-kqqs6                      1/1     Running   0          19h
    2. View the logs:

      $ oc logs sriov-fec-daemonset-kqqs6
      Example output

      2020-12-16T12:46:47.720Z        INFO    daemon.NodeConfigurator.applyConfig     configuring PF  {"requestedConfig": {"pciAddress":"0000:1d:00.0","pfDriver":"pci-pf-stub","vfDriver":"vfio-pci","vfAmount":2,"bbDevConfig":{"n3000":{
      "networkType":"FPGA_5GNR","pfMode":false,"flrTimeout":610,"downlink":{"bandwidth":3,"loadBalance":128,"queues":{"vf0":16,"vf1":16}},"uplink":{"bandwidth":3,"loadBalance":128,"queues":{"vf0":16,"vf1":16}}}}}}
      2020-12-16T12:46:47.720Z        INFO    daemon.NodeConfigurator.loadModule      executing command       {"cmd": "/usr/sbin/chroot /host/ modprobe pci-pf-stub"}
      2020-12-16T12:46:47.724Z        INFO    daemon.NodeConfigurator.loadModule      commands output {"output": ""}
      2020-12-16T12:46:47.724Z        INFO    daemon.NodeConfigurator.loadModule      executing command       {"cmd": "/usr/sbin/chroot /host/ modprobe vfio-pci"}
      2020-12-16T12:46:47.727Z        INFO    daemon.NodeConfigurator.loadModule      commands output {"output": ""}
      2020-12-16T12:46:47.727Z        INFO    daemon.NodeConfigurator device's driver_override path   {"path": "/sys/bus/pci/devices/0000:1d:00.0/driver_override"}
      2020-12-16T12:46:47.727Z        INFO    daemon.NodeConfigurator driver bind path        {"path": "/sys/bus/pci/drivers/pci-pf-stub/bind"}
      2020-12-16T12:46:47.998Z        INFO    daemon.NodeConfigurator device's driver_override path   {"path": "/sys/bus/pci/devices/0000:1d:00.1/driver_override"}
      2020-12-16T12:46:47.998Z        INFO    daemon.NodeConfigurator driver bind path        {"path": "/sys/bus/pci/drivers/vfio-pci/bind"}
      2020-12-16T12:46:47.998Z        INFO    daemon.NodeConfigurator device's driver_override path   {"path": "/sys/bus/pci/devices/0000:1d:00.2/driver_override"}
      2020-12-16T12:46:47.998Z        INFO    daemon.NodeConfigurator driver bind path        {"path": "/sys/bus/pci/drivers/vfio-pci/bind"}
      2020-12-16T12:46:47.999Z        INFO    daemon.NodeConfigurator.applyConfig     executing command       {"cmd": "/sriov_workdir/pf_bb_config FPGA_5GNR -c /sriov_artifacts/0000:1d:00.0.ini -p 0000:1d:00.0"}
      2020-12-16T12:46:48.017Z        INFO    daemon.NodeConfigurator.applyConfig     commands output {"output": "ERROR: Section (FLR) or name (flr_time_out) is not valid.
      FEC FPGA RTL v3.0
      UL.DL Weights = 3.3
      UL.DL Load Balance = 128.128
      Queue-PF/VF Mapping Table = READY
      Ring Descriptor Size = 256 bytes
      
      --------+-----+-----+-----+-----+-----+-----+-----+-----+-----+
              |  PF | VF0 | VF1 | VF2 | VF3 | VF4 | VF5 | VF6 | VF7 |
      --------+-----+-----+-----+-----+-----+-----+-----+-----+-----+
      UL-Q'00 |     |  X  |     |     |     |     |     |     |     |
      UL-Q'01 |     |  X  |     |     |     |     |     |     |     |
      UL-Q'02 |     |  X  |     |     |     |     |     |     |     |
      UL-Q'03 |     |  X  |     |     |     |     |     |     |     |
      UL-Q'04 |     |  X  |     |     |     |     |     |     |     |
      UL-Q'05 |     |  X  |     |     |     |     |     |     |     |
      UL-Q'06 |     |  X  |     |     |     |     |     |     |     |
      UL-Q'07 |     |  X  |     |     |     |     |     |     |     |
      UL-Q'08 |     |  X  |     |     |     |     |     |     |     |
      UL-Q'09 |     |  X  |     |     |     |     |     |     |     |
      UL-Q'10 |     |  X  |     |     |     |     |     |     |     |
      UL-Q'11 |     |  X  |     |     |     |     |     |     |     |
      UL-Q'12 |     |  X  |     |     |     |     |     |     |     |
      UL-Q'13 |     |  X  |     |     |     |     |     |     |     |
      UL-Q'14 |     |  X  |     |     |     |     |     |     |     |
      UL-Q'15 |     |  X  |     |     |     |     |     |     |     |
      UL-Q'16 |     |     |  X  |     |     |     |     |     |     |
      UL-Q'17 |     |     |  X  |     |     |     |     |     |     |
      UL-Q'18 |     |     |  X  |     |     |     |     |     |     |
      UL-Q'19 |     |     |  X  |     |     |     |     |     |     |
      UL-Q'20 |     |     |  X  |     |     |     |     |     |     |
      UL-Q'21 |     |     |  X  |     |     |     |     |     |     |
      UL-Q'22 |     |     |  X  |     |     |     |     |     |     |
      UL-Q'23 |     |     |  X  |     |     |     |     |     |     |
      UL-Q'24 |     |     |  X  |     |     |     |     |     |     |
      UL-Q'25 |     |     |  X  |     |     |     |     |     |     |
      UL-Q'26 |     |     |  X  |     |     |     |     |     |     |
      UL-Q'27 |     |     |  X  |     |     |     |     |     |     |
      UL-Q'28 |     |     |  X  |     |     |     |     |     |     |
      UL-Q'29 |     |     |  X  |     |     |     |     |     |     |
      UL-Q'30 |     |     |  X  |     |     |     |     |     |     |
      UL-Q'31 |     |     |  X  |     |     |     |     |     |     |
      DL-Q'32 |     |  X  |     |     |     |     |     |     |     |
      DL-Q'33 |     |  X  |     |     |     |     |     |     |     |
      DL-Q'34 |     |  X  |     |     |     |     |     |     |     |
      DL-Q'35 |     |  X  |     |     |     |     |     |     |     |
      DL-Q'36 |     |  X  |     |     |     |     |     |     |     |
      DL-Q'37 |     |  X  |     |     |     |     |     |     |     |
      DL-Q'38 |     |  X  |     |     |     |     |     |     |     |
      DL-Q'39 |     |  X  |     |     |     |     |     |     |     |
      DL-Q'40 |     |  X  |     |     |     |     |     |     |     |
      DL-Q'41 |     |  X  |     |     |     |     |     |     |     |
      DL-Q'42 |     |  X  |     |     |     |     |     |     |     |
      DL-Q'43 |     |  X  |     |     |     |     |     |     |     |
      DL-Q'44 |     |  X  |     |     |     |     |     |     |     |
      DL-Q'45 |     |  X  |     |     |     |     |     |     |     |
      DL-Q'46 |     |  X  |     |     |     |     |     |     |     |
      DL-Q'47 |     |  X  |     |     |     |     |     |     |     |
      DL-Q'48 |     |     |  X  |     |     |     |     |     |     |
      DL-Q'49 |     |     |  X  |     |     |     |     |     |     |
      DL-Q'50 |     |     |  X  |     |     |     |     |     |     |
      DL-Q'51 |     |     |  X  |     |     |     |     |     |     |
      DL-Q'52 |     |     |  X  |     |     |     |     |     |     |
      DL-Q'53 |     |     |  X  |     |     |     |     |     |     |
      DL-Q'54 |     |     |  X  |     |     |     |     |     |     |
      DL-Q'55 |     |     |  X  |     |     |     |     |     |     |
      DL-Q'56 |     |     |  X  |     |     |     |     |     |     |
      DL-Q'57 |     |     |  X  |     |     |     |     |     |     |
      DL-Q'58 |     |     |  X  |     |     |     |     |     |     |
      DL-Q'59 |     |     |  X  |     |     |     |     |     |     |
      DL-Q'60 |     |     |  X  |     |     |     |     |     |     |
      DL-Q'61 |     |     |  X  |     |     |     |     |     |     |
      DL-Q'62 |     |     |  X  |     |     |     |     |     |     |
      DL-Q'63 |     |     |  X  |     |     |     |     |     |     |
      --------+-----+-----+-----+-----+-----+-----+-----+-----+-----+
      
      Mode of operation = VF-mode
      FPGA_5GNR PF [0000:1d:00.0] configuration complete!"}
      2020-12-16T12:46:48.017Z        INFO    daemon.NodeConfigurator.enableMasterBus executing command       {"cmd": "/usr/sbin/chroot /host/ setpci -v -s 0000:1d:00.0 COMMAND"}
      2020-12-16T12:46:48.037Z        INFO    daemon.NodeConfigurator.enableMasterBus commands output {"output": "0000:1d:00.0 @04 = 0102\n"}
      2020-12-16T12:46:48.037Z        INFO    daemon.NodeConfigurator.enableMasterBus executing command       {"cmd": "/usr/sbin/chroot /host/ setpci -v -s 0000:1d:00.0 COMMAND=0106"}
      2020-12-16T12:46:48.054Z        INFO    daemon.NodeConfigurator.enableMasterBus commands output {"output": "0000:1d:00.0 @04 0106\n"}
      2020-12-16T12:46:48.054Z        INFO    daemon.NodeConfigurator.enableMasterBus MasterBus set   {"pci": "0000:1d:00.0", "output": "0000:1d:00.0 @04 0106\n"}
      2020-12-16T12:46:48.160Z        INFO    daemon.drainhelper.Run()        worker function - end   {"performUncordon": true}
  3. Check the FEC configuration of the card:

    $ oc get sriovfecnodeconfig node1 -o yaml
    Example output

    status:
        conditions:
        - lastTransitionTime: "2020-12-15T17:19:37Z"
          message: Configured successfully
          observedGeneration: 1
          reason: ConfigurationSucceeded
          status: "True"
          type: Configured
        inventory:
          sriovAccelerators:
          - deviceID: 0d8f (1)
            driver: pci-pf-stub
            maxVirtualFunctions: 8
            pciAddress: 0000:1d:00.0
            vendorID: "8086"
            virtualFunctions:
            - deviceID: 0d90 (2)
              driver: vfio-pci
              pciAddress: 0000:1d:00.1
            - deviceID: 0d90
              driver: vfio-pci
              pciAddress: 0000:1d:00.2
    1 0d8f is the deviceID of the physical function of the FEC device.
    2 0d90 is the deviceID of the virtual function of the FEC device.
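
The `enableMasterBus` entries in the daemon log read the PCI COMMAND register as `0102` and then write `0106`. The following Python sketch decodes those values using the standard PCI COMMAND register bit positions, which shows that the only change is the Bus Master bit. The helper names are hypothetical and are not part of the Operator, and only a subset of the register bits is listed.

```python
# Selected bits of the 16-bit PCI COMMAND register (configuration offset 0x04).
PCI_COMMAND_BITS = {
    0: "I/O Space",
    1: "Memory Space",
    2: "Bus Master",
    6: "Parity Error Response",
    8: "SERR# Enable",
    10: "Interrupt Disable",
}


def decode_pci_command(hex_value):
    """Return the names of the known bits set in a COMMAND value such as '0106'."""
    value = int(hex_value, 16)
    return [name for bit, name in sorted(PCI_COMMAND_BITS.items()) if value & (1 << bit)]


def newly_enabled(before, after):
    """Return the bits that are set in `after` but not in `before`."""
    old = set(decode_pci_command(before))
    return [name for name in decode_pci_command(after) if name not in old]


if __name__ == "__main__":
    # Values taken from the setpci lines in the log output.
    print(decode_pci_command("0102"))
    print(newly_enabled("0102", "0106"))
```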

Configuring the SR-IOV-FEC Operator for the Intel vRAN Dedicated Accelerator ACC100

Programming the Intel vRAN Dedicated Accelerator ACC100 exposes the Single Root I/O Virtualization (SR-IOV) virtual function (VF) devices that are then used to accelerate the FEC in the vRAN workload. The Intel vRAN Dedicated Accelerator ACC100 accelerates 4G and 5G Virtualized Radio Access Networks (vRAN) workloads. This in turn increases the overall compute capacity of a commercial, off-the-shelf platform. This device is also known as Mount Bryce.

The SR-IOV-FEC Operator handles the management of the forward error correction (FEC) devices that are used to accelerate the FEC process in vRAN L1 applications.

Configuring the SR-IOV-FEC Operator involves:

  • Creating the virtual functions (VFs) for the FEC device

  • Binding the VFs to the appropriate drivers

  • Configuring the VF queues for desired functionality in a 4G or 5G deployment

The role of forward error correction (FEC) is to correct transmission errors, where certain bits in a message can be lost or garbled. Messages can be lost or garbled due to noise in the transmission media, interference, or low signal strength. Without FEC, a garbled message would have to be resent, adding to the network load and impacting throughput and latency.
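
To make the idea concrete, the following Python sketch uses a 3x repetition code, one of the simplest FEC schemes: every bit is sent three times and the receiver takes a majority vote. This is a toy chosen for this example only. 4G/LTE and 5G use far more efficient codes (turbo and LDPC codes), and the function names here are hypothetical.

```python
def encode(bits, repeat=3):
    """Repeat every bit `repeat` times (a 3x repetition code by default)."""
    return [bit for bit in bits for _ in range(repeat)]


def decode(received, repeat=3):
    """Recover each bit by majority vote over its group of repeated copies."""
    groups = [received[i:i + repeat] for i in range(0, len(received), repeat)]
    return [1 if 2 * sum(group) > len(group) else 0 for group in groups]


def flip(bits, *positions):
    """Return a copy of `bits` with the given positions inverted (simulated noise)."""
    garbled = list(bits)
    for pos in positions:
        garbled[pos] ^= 1
    return garbled


if __name__ == "__main__":
    message = [1, 0, 1, 1]
    sent = encode(message)
    # A single flipped bit in a group is corrected without a resend.
    print(decode(flip(sent, 4)) == message)
    # Two flipped bits in the same group exceed what this code can correct.
    print(decode(flip(sent, 3, 4)) == message)
```

The trade-off is visible even in this toy: the sender transmits three times as many bits to avoid a retransmission, which is why offloading the much heavier real-world encoding and decoding to an accelerator frees up CPU capacity.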

Prerequisites
  • Intel vRAN Dedicated Accelerator ACC100 5G/4G card

  • Node or nodes installed with the OpenNESS Operator for Wireless FEC Accelerators

  • Enable global SR-IOV and VT-d settings in the BIOS for the node

  • RT kernel configured with Performance Addon Operator

  • Log in as a user with cluster-admin privileges

Procedure
  1. Change to the vran-acceleration-operators project:

    $ oc project vran-acceleration-operators
  2. Verify that the SR-IOV-FEC Operator is installed:

    $ oc get csv -o custom-columns=Name:.metadata.name,Phase:.status.phase
    Example output

    Name                                        Phase
    sriov-fec.v1.1.0                            Succeeded
  3. Verify that the sriov-fec pods are running:

    $ oc get pods
    Example output

    NAME                                            READY       STATUS      RESTARTS    AGE
    sriov-device-plugin-j5jlv                       1/1         Running     1           15d
    sriov-fec-controller-manager-85b6b8f4d4-gd2qg   1/1         Running     1           15d
    sriov-fec-daemonset-kqqs6                       1/1         Running     1           15d
    • sriov-device-plugin exposes the FEC virtual functions as resources under the node

    • sriov-fec-controller-manager applies the CR to the node and maintains the operand containers

    • sriov-fec-daemonset is responsible for:

      • Discovering the SR-IOV NICs on each node.

      • Syncing the status of the custom resource (CR) defined in step 6.

      • Taking the spec of the CR as input and configuring the discovered NICs.

  4. Retrieve all the nodes containing one of the supported vRAN FEC accelerator devices:

    $ oc get sriovfecnodeconfig
    Example output

    NAME             CONFIGURED
    node1            Succeeded
  5. Find the physical function (PF) of the SR-IOV FEC accelerator device to configure:

    $ oc get sriovfecnodeconfig node1 -o yaml
    Example output

    status:
        conditions:
        - lastTransitionTime: "2021-03-19T17:19:37Z"
          message: Configured successfully
          observedGeneration: 1
          reason: ConfigurationSucceeded
          status: "True"
          type: Configured
        inventory:
           sriovAccelerators:
           - deviceID: 0d5c
             driver: ""
             maxVirtualFunctions: 16
             pciAddress: 0000:af:00.0 (1)
             vendorID: "8086"
             virtualFunctions: [] (2)
    1 This field indicates the PCI address of the card.
    2 This field shows that no virtual functions are configured yet.
  6. Configure the number of virtual functions and queue groups on the FEC device:

    1. Create the following custom resource (CR) and save the YAML in the sriovfec_acc100cr.yaml file:

      This example configures all 8 ACC100 queue groups for 5G: 4 queue groups for uplink and 4 queue groups for downlink.

      apiVersion: sriovfec.intel.com/v1
      kind: SriovFecClusterConfig
      metadata:
        name: config (1)
      spec:
        nodes:
         - nodeName: node1 (2)
           physicalFunctions:
             - pciAddress: 0000:af:00.0 (3)
               pfDriver: "pci-pf-stub"
               vfDriver: "vfio-pci"
               vfAmount: 16 (4)
               bbDevConfig:
                 acc100:
                   # Programming mode: 0 = VF Programming, 1 = PF Programming
                   pfMode: false
                   numVfBundles: 16
                   maxQueueSize: 1024
                   uplink4G:
                     numQueueGroups: 0
                     numAqsPerGroups: 16
                     aqDepthLog2: 4
                   downlink4G:
                    numQueueGroups: 0
                    numAqsPerGroups: 16
                    aqDepthLog2: 4
                   uplink5G:
                    numQueueGroups: 4
                    numAqsPerGroups: 16
                    aqDepthLog2: 4
                   downlink5G:
                    numQueueGroups: 4
                    numAqsPerGroups: 16
                    aqDepthLog2: 4
      1 Specify a name for the CR object. The only name that can be specified is config.
      2 Specify the node name.
      3 Specify the PCI address of the card on which the SR-IOV-FEC Operator will be installed.
      4 Specify the number of virtual functions to create. For the Intel vRAN Dedicated Accelerator ACC100, create all 16 VFs.

      The card provides up to 8 queue groups with up to 16 queues per group. The queue groups can be divided among 4G and 5G, and between uplink and downlink. The Intel vRAN Dedicated Accelerator ACC100 can be configured for:

      • 4G or 5G only

      • 4G and 5G at the same time

      Each configured VF has access to all the queues. Each queue group has a distinct priority level. The request for a given queue group is made at the application level, that is, by the vRAN application that uses the FEC device.

    2. Apply the CR:

      $ oc apply -f sriovfec_acc100cr.yaml

      After applying the CR, the SR-IOV FEC daemon starts configuring the FEC device.
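
The queue-group limits described for the ACC100 (at most 8 queue groups in total, at most 16 queues per group) can be checked before applying the CR. The following Python sketch is a hypothetical pre-flight check, not the Operator's own validation. It reuses the field names from the `acc100` section of the example CR and assumes that `numAqsPerGroups` corresponds to the number of queues per group.

```python
MAX_QUEUE_GROUPS = 8       # total across 4G/5G uplink and downlink (from the text)
MAX_QUEUES_PER_GROUP = 16  # from the text

SECTIONS = ("uplink4G", "downlink4G", "uplink5G", "downlink5G")


def check_acc100_queue_groups(acc100):
    """Return a list of problems; an empty list means the limits are respected."""
    problems = []
    total = sum(acc100[section]["numQueueGroups"] for section in SECTIONS)
    if total > MAX_QUEUE_GROUPS:
        problems.append(
            f"{total} queue groups requested, but only {MAX_QUEUE_GROUPS} are available"
        )
    for section in SECTIONS:
        per_group = acc100[section]["numAqsPerGroups"]
        if per_group > MAX_QUEUES_PER_GROUP:
            problems.append(
                f"{section}: {per_group} queues per group exceeds {MAX_QUEUES_PER_GROUP}"
            )
    return problems


# Values copied from the acc100 section of the example CR (5G only, 4 + 4 groups).
EXAMPLE_CR_ACC100 = {
    "uplink4G": {"numQueueGroups": 0, "numAqsPerGroups": 16, "aqDepthLog2": 4},
    "downlink4G": {"numQueueGroups": 0, "numAqsPerGroups": 16, "aqDepthLog2": 4},
    "uplink5G": {"numQueueGroups": 4, "numAqsPerGroups": 16, "aqDepthLog2": 4},
    "downlink5G": {"numQueueGroups": 4, "numAqsPerGroups": 16, "aqDepthLog2": 4},
}

if __name__ == "__main__":
    print(check_acc100_queue_groups(EXAMPLE_CR_ACC100))
```

A mixed 4G and 5G configuration passes the same check as long as the four `numQueueGroups` values add up to 8 or fewer.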

Verification
  1. Check the status:

    $ oc get sriovfecclusterconfig config -o yaml
    Example output

    status:
        conditions:
        - lastTransitionTime: "2021-03-19T11:46:22Z"
          message: Configured successfully
          observedGeneration: 1
          reason: Succeeded
          status: "True"
          type: Configured
        inventory:
          sriovAccelerators:
          - deviceID: 0d5c
            driver: pci-pf-stub
            maxVirtualFunctions: 16
            pciAddress: 0000:af:00.0
            vendorID: "8086"
            virtualFunctions:
            - deviceID: 0d5d
              driver: vfio-pci
              pciAddress: 0000:b0:00.0
            - deviceID: 0d5d
              driver: vfio-pci
              pciAddress: 0000:b0:00.1
            - deviceID: 0d5d
              driver: vfio-pci
              pciAddress: 0000:b0:00.2
            - deviceID: 0d5d
              driver: vfio-pci
              pciAddress: 0000:b0:00.3
            - deviceID: 0d5d
              driver: vfio-pci
              pciAddress: 0000:b0:00.4
  2. Check the logs:

    1. Determine the pod name of the SR-IOV daemon:

      $ oc get po -o wide | grep sriov-fec-daemonset | grep node1
      Example output

      sriov-fec-daemonset-kqqs6                      1/1     Running   0          19h
    2. View the logs:

      $ oc logs sriov-fec-daemonset-kqqs6
      Example output

      {"level":"Level(-2)","ts":1616794345.4786215,"logger":"daemon.drainhelper.cordonAndDrain()","msg":"node drained"}
      {"level":"Level(-4)","ts":1616794345.4786265,"logger":"daemon.drainhelper.Run()","msg":"worker function - start"}
      {"level":"Level(-4)","ts":1616794345.5762916,"logger":"daemon.NodeConfigurator.applyConfig","msg":"current node status","inventory":{"sriovAccelerators":[{"vendorID":"8086","deviceID":"0b32","pciAddress":"0000:20:00.0","driver":"","maxVirtualFunctions":1,"virtualFunctions":[]},{"vendorID":"8086","deviceID":"0d5c","pciAddress":"0000:af:00.0","driver":"","maxVirtualFunctions":16,"virtualFunctions":[]}]}}
      {"level":"Level(-4)","ts":1616794345.5763638,"logger":"daemon.NodeConfigurator.applyConfig","msg":"configuring PF","requestedConfig":{"pciAddress":"0000:af:00.0","pfDriver":"pci-pf-stub","vfDriver":"vfio-pci","vfAmount":2,"bbDevConfig":{"acc100":{"pfMode":false,"numVfBundles":16,"maxQueueSize":1024,"uplink4G":{"numQueueGroups":4,"numAqsPerGroups":16,"aqDepthLog2":4},"downlink4G":{"numQueueGroups":4,"numAqsPerGroups":16,"aqDepthLog2":4},"uplink5G":{"numQueueGroups":0,"numAqsPerGroups":16,"aqDepthLog2":4},"downlink5G":{"numQueueGroups":0,"numAqsPerGroups":16,"aqDepthLog2":4}}}}}
      {"level":"Level(-4)","ts":1616794345.5774765,"logger":"daemon.NodeConfigurator.loadModule","msg":"executing command","cmd":"/usr/sbin/chroot /host/ modprobe pci-pf-stub"}
      {"level":"Level(-4)","ts":1616794345.5842702,"logger":"daemon.NodeConfigurator.loadModule","msg":"commands output","output":""}
      {"level":"Level(-4)","ts":1616794345.5843055,"logger":"daemon.NodeConfigurator.loadModule","msg":"executing command","cmd":"/usr/sbin/chroot /host/ modprobe vfio-pci"}
      {"level":"Level(-4)","ts":1616794345.6090655,"logger":"daemon.NodeConfigurator.loadModule","msg":"commands output","output":""}
      {"level":"Level(-2)","ts":1616794345.6091156,"logger":"daemon.NodeConfigurator","msg":"device's driver_override path","path":"/sys/bus/pci/devices/0000:af:00.0/driver_override"}
      {"level":"Level(-2)","ts":1616794345.6091807,"logger":"daemon.NodeConfigurator","msg":"driver bind path","path":"/sys/bus/pci/drivers/pci-pf-stub/bind"}
      {"level":"Level(-2)","ts":1616794345.7488534,"logger":"daemon.NodeConfigurator","msg":"device's driver_override path","path":"/sys/bus/pci/devices/0000:b0:00.0/driver_override"}
      {"level":"Level(-2)","ts":1616794345.748938,"logger":"daemon.NodeConfigurator","msg":"driver bind path","path":"/sys/bus/pci/drivers/vfio-pci/bind"}
      {"level":"Level(-2)","ts":1616794345.7492096,"logger":"daemon.NodeConfigurator","msg":"device's driver_override path","path":"/sys/bus/pci/devices/0000:b0:00.1/driver_override"}
      {"level":"Level(-2)","ts":1616794345.7492566,"logger":"daemon.NodeConfigurator","msg":"driver bind path","path":"/sys/bus/pci/drivers/vfio-pci/bind"}
      {"level":"Level(-4)","ts":1616794345.74968,"logger":"daemon.NodeConfigurator.applyConfig","msg":"executing command","cmd":"/sriov_workdir/pf_bb_config ACC100 -c /sriov_artifacts/0000:af:00.0.ini -p 0000:af:00.0"}
      {"level":"Level(-4)","ts":1616794346.5203931,"logger":"daemon.NodeConfigurator.applyConfig","msg":"commands output","output":"Queue Groups: 0 5GUL, 0 5GDL, 4 4GUL, 4 4GDL\nNumber of 5GUL engines 8\nConfiguration in VF mode\nPF ACC100 configuration complete\nACC100 PF [0000:af:00.0] configuration complete!\n\n"}
      {"level":"Level(-4)","ts":1616794346.520459,"logger":"daemon.NodeConfigurator.enableMasterBus","msg":"executing command","cmd":"/usr/sbin/chroot /host/ setpci -v -s 0000:af:00.0 COMMAND"}
      {"level":"Level(-4)","ts":1616794346.5458736,"logger":"daemon.NodeConfigurator.enableMasterBus","msg":"commands output","output":"0000:af:00.0 @04 = 0142\n"}
      {"level":"Level(-4)","ts":1616794346.5459251,"logger":"daemon.NodeConfigurator.enableMasterBus","msg":"executing command","cmd":"/usr/sbin/chroot /host/ setpci -v -s 0000:af:00.0 COMMAND=0146"}
      {"level":"Level(-4)","ts":1616794346.5795262,"logger":"daemon.NodeConfigurator.enableMasterBus","msg":"commands output","output":"0000:af:00.0 @04 0146\n"}
      {"level":"Level(-2)","ts":1616794346.5795407,"logger":"daemon.NodeConfigurator.enableMasterBus","msg":"MasterBus set","pci":"0000:af:00.0","output":"0000:af:00.0 @04 0146\n"}
      {"level":"Level(-4)","ts":1616794346.6867144,"logger":"daemon.drainhelper.Run()","msg":"worker function - end","performUncordon":true}
      {"level":"Level(-4)","ts":1616794346.6867719,"logger":"daemon.drainhelper.Run()","msg":"uncordoning node"}
      {"level":"Level(-4)","ts":1616794346.6896322,"logger":"daemon.drainhelper.uncordon()","msg":"starting uncordon attempts"}
      {"level":"Level(-2)","ts":1616794346.69735,"logger":"daemon.drainhelper.uncordon()","msg":"node uncordoned"}
      {"level":"Level(-4)","ts":1616794346.6973662,"logger":"daemon.drainhelper.Run()","msg":"cancelling the context to finish the leadership"}
      {"level":"Level(-4)","ts":1616794346.7029872,"logger":"daemon.drainhelper.Run()","msg":"stopped leading"}
      {"level":"Level(-4)","ts":1616794346.7030034,"logger":"daemon.drainhelper","msg":"releasing the lock (bug mitigation)"}
      {"level":"Level(-4)","ts":1616794346.8040674,"logger":"daemon.updateInventory","msg":"obtained inventory","inv":{"sriovAccelerators":[{"vendorID":"8086","deviceID":"0b32","pciAddress":"0000:20:00.0","driver":"","maxVirtualFunctions":1,"virtualFunctions":[]},{"vendorID":"8086","deviceID":"0d5c","pciAddress":"0000:af:00.0","driver":"pci-pf-stub","maxVirtualFunctions":16,"virtualFunctions":[{"pciAddress":"0000:b0:00.0","driver":"vfio-pci","deviceID":"0d5d"},{"pciAddress":"0000:b0:00.1","driver":"vfio-pci","deviceID":"0d5d"}]}]}}
      {"level":"Level(-4)","ts":1616794346.9058325,"logger":"daemon","msg":"Update ignored, generation unchanged"}
      {"level":"Level(-2)","ts":1616794346.9065044,"logger":"daemon.Reconcile","msg":"Reconciled","namespace":"vran-acceleration-operators","name":"pg-itengdvs02r.altera.com"}
  3. Check the FEC configuration of the card:

    $ oc get sriovfecnodeconfig node1 -o yaml
    Example output

    status:
        conditions:
        - lastTransitionTime: "2021-03-19T11:46:22Z"
          message: Configured successfully
          observedGeneration: 1
          reason: Succeeded
          status: "True"
          type: Configured
        inventory:
          sriovAccelerators:
          - deviceID: 0d5c (1)
            driver: pci-pf-stub
            maxVirtualFunctions: 16
            pciAddress: 0000:af:00.0
            vendorID: "8086"
            virtualFunctions:
            - deviceID: 0d5d (2)
              driver: vfio-pci
              pciAddress: 0000:b0:00.0
            - deviceID: 0d5d
              driver: vfio-pci
              pciAddress: 0000:b0:00.1
            - deviceID: 0d5d
              driver: vfio-pci
              pciAddress: 0000:b0:00.2
            - deviceID: 0d5d
              driver: vfio-pci
              pciAddress: 0000:b0:00.3
            - deviceID: 0d5d
              driver: vfio-pci
              pciAddress: 0000:b0:00.4
    1 The value 0d5c is the deviceID of the physical function of the FEC device.
    2 The value 0d5d is the deviceID of the virtual function of the FEC device.
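
Because the ACC100 daemon log consists of JSON records, it can be filtered with a few lines of Python instead of being read by eye. The following sketch selects records by logger name. The helper name `find_messages` is hypothetical, and the two sample lines are copied from the log output shown earlier in this section.

```python
import json


def find_messages(log_text, logger_prefix):
    """Yield (logger, msg, entry) for JSON log lines whose logger starts with the prefix."""
    for line in log_text.splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue  # not a JSON log record
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # for example, a record that was wrapped across lines
        if entry.get("logger", "").startswith(logger_prefix):
            yield entry["logger"], entry.get("msg", ""), entry


SAMPLE_LOG = r'''
{"level":"Level(-4)","ts":1616794345.74968,"logger":"daemon.NodeConfigurator.applyConfig","msg":"executing command","cmd":"/sriov_workdir/pf_bb_config ACC100 -c /sriov_artifacts/0000:af:00.0.ini -p 0000:af:00.0"}
{"level":"Level(-2)","ts":1616794346.69735,"logger":"daemon.drainhelper.uncordon()","msg":"node uncordoned"}
'''

if __name__ == "__main__":
    for logger, msg, entry in find_messages(SAMPLE_LOG, "daemon.NodeConfigurator"):
        print(logger, msg, entry.get("cmd"))
```

In practice, the text passed to `find_messages` would be the saved output of the `oc logs` command for the daemon pod.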

Verifying application pod access and FPGA usage on OpenNESS

OpenNESS is an edge computing software toolkit that you can use to onboard and manage applications and network functions on any type of network.

To verify all OpenNESS features are working together, including SR-IOV binding, the device plugin, Wireless Base Band Device (bbdev) configuration, and SR-IOV (FEC) VF functionality inside a non-root pod, you can build an image and run a simple validation application for the device.

For more information, go to openness.org.

Prerequisites
  • Optional: Intel FPGA PAC N3000 card

  • Node or nodes installed with the n3000-operator

  • Node or nodes installed with the SR-IOV-FEC operator

  • Real-Time kernel and huge pages configured with Performance Addon Operator

  • Log in as a user with cluster-admin privileges

Procedure
  1. Create a namespace for the test by completing the following actions:

    1. Define the test-bbdev namespace by creating a file named test-bbdev-namespace.yaml as shown in the following example:

      apiVersion: v1
      kind: Namespace
      metadata:
        name: test-bbdev
        labels:
          openshift.io/run-level: "1"
    2. Create the namespace by running the following command:

      $ oc create -f test-bbdev-namespace.yaml
  2. Create the following Pod specification, and then save the YAML in the pod-test.yaml file:

    apiVersion: v1
    kind: Pod
    metadata:
      name: pod-bbdev-sample-app
      namespace: test-bbdev (1)
    spec:
      containers:
      - name: bbdev-sample-app
        image: bbdev-sample-app:1.0 (2)
        command: [ "sudo", "/bin/bash", "-c", "--" ]
        securityContext:
          privileged: false
          runAsUser: 0 (3)
          capabilities:
            add:
             - IPC_LOCK
             - SYS_NICE
        resources:
          requests:
            hugepages-1Gi: 4Gi (4)
            memory: 1Gi
            cpu: "4" (5)
            intel.com/intel_fec_5g: '1' (6)
            #intel.com/intel_fec_acc100: '1'
            #intel.com/intel_fec_lte: '1'
          limits:
            memory: 4Gi
            cpu: "4"
            hugepages-1Gi: 4Gi
            intel.com/intel_fec_5g: '1'
            #intel.com/intel_fec_acc100: '1'
            #intel.com/intel_fec_lte: '1'
    1 Specify the namespace you created in step 1.
    2 This defines the test image containing the compiled DPDK.
    3 Make the container execute internally as the root user.
    4 Specify the hugepage size, hugepages-1Gi, and the quantity of hugepages to allocate to the pod. Configure hugepages and isolated CPUs by using the Performance Addon Operator.
    5 Specify the number of CPUs.
    6 The intel.com/intel_fec_5g resource is used to test the N3000 5G FEC configuration.

    To test the ACC100 configuration, uncomment intel.com/intel_fec_acc100 by removing the # symbol. To test the N3000 4G/LTE configuration, uncomment intel.com/intel_fec_lte by removing the # symbol. Only one resource can be active at any time.

  3. Create the pod:

    $ oc apply -f pod-test.yaml
  4. Check that the pod is created:

    $ oc get pods -n test-bbdev
    Example output

    NAME                                            READY           STATUS          RESTARTS        AGE
    pod-bbdev-sample-app                            1/1             Running         0               80s
  5. Use a remote shell to log in to the pod-bbdev-sample-app:

    $ oc rsh -n test-bbdev pod-bbdev-sample-app
    Example output

    sh-4.4#
  6. Print a list of environment variables:

    sh-4.4# env
    Example output

    N3000_CONTROLLER_MANAGER_METRICS_SERVICE_PORT_8443_TCP_ADDR=172.30.133.131
    SRIOV_FEC_CONTROLLER_MANAGER_METRICS_SERVICE_PORT_8443_TCP_PROTO=tcp
    DPDK_VERSION=20.11
    PCIDEVICE_INTEL_COM_INTEL_FEC_5G=0.0.0.0:1d.00.0 (1)
    _=/usr/bin/env
    HOSTNAME=fec-pod
    1 This is the PCI address of the virtual function. Depending on the resource that you requested in the pod-test.yaml file, the address is stored in one of the following three environment variables:
    • PCIDEVICE_INTEL_COM_INTEL_FEC_ACC100

    • PCIDEVICE_INTEL_COM_INTEL_FEC_5G

    • PCIDEVICE_INTEL_COM_INTEL_FEC_LTE

  7. Change to the test-bbdev directory:

    sh-4.4# cd test/test-bbdev/

    The directory is in the pod and not on your local computer.

  8. Check the CPUs that are assigned to the pod:

    sh-4.4# export CPU=$(cat /sys/fs/cgroup/cpuset/cpuset.cpus)
    sh-4.4# echo ${CPU}

    This prints the CPUs that are assigned to the pod.

    Example output

    24,25,64,65
  9. Run the test-bbdev application to test the device:

    sh-4.4# ./test-bbdev.py -e="-l ${CPU} -a ${PCIDEVICE_INTEL_COM_INTEL_FEC_5G}" -c validation -n 64 -b 32 -l 1 -v ./test_vectors/*
    Example output

    Executing: ../../build/app/dpdk-test-bbdev -l 24-25,64-65 0000:1d.00.0 -- -n 64 -l 1 -c validation -v ./test_vectors/bbdev_null.data -b 32
    EAL: Detected 80 lcore(s)
    EAL: Detected 2 NUMA nodes
    Option -w, --pci-whitelist is deprecated, use -a, --allow option instead
    EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
    EAL: Selected IOVA mode 'VA'
    EAL: Probing VFIO support...
    EAL: VFIO support initialized
    EAL:   using IOMMU type 1 (Type 1)
    EAL: Probe PCI driver: intel_fpga_5ngr_fec_vf (8086:d90) device: 0000:1d.00.0 (socket 1)
    EAL: No legacy callbacks, legacy socket not created
    
    
    
    ===========================================================
    Starting Test Suite : BBdev Validation Tests
    Test vector file = ldpc_dec_v7813.data
    Device 0 queue 16 setup failed
    Allocated all queues (id=16) at prio0 on dev0
    Device 0 queue 32 setup failed
    Allocated all queues (id=32) at prio1 on dev0
    Device 0 queue 48 setup failed
    Allocated all queues (id=48) at prio2 on dev0
    Device 0 queue 64 setup failed
    Allocated all queues (id=64) at prio3 on dev0
    Device 0 queue 64 setup failed
    All queues on dev 0 allocated: 64
    + ------------------------------------------------------- +
    == test: validation
    dev:0000:b0:00.0, burst size: 1, num ops: 1, op type: RTE_BBDEV_OP_LDPC_DEC
    Operation latency:
            avg: 23092 cycles, 10.0838 us
            min: 23092 cycles, 10.0838 us
            max: 23092 cycles, 10.0838 us
    TestCase [ 0] : validation_tc passed
     + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +
     + Test Suite Summary : BBdev Validation Tests
     + Tests Total :        1
     + Tests Skipped :      0
     + Tests Passed :       1 (1)
     + Tests Failed :       0
     + Tests Lasted :       177.67 ms
     + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +
    1 While some tests can be skipped, be sure that the vector tests pass.
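
The test invocation in this procedure combines two pieces of pod state: the CPU list from the cpuset cgroup and whichever `PCIDEVICE_INTEL_COM_INTEL_FEC_*` variable is set for the requested resource. The following Python sketch shows one way to gather both inside the pod. The helper names are hypothetical, and the accepted cpuset syntax (comma-separated CPUs and `a-b` ranges) is an assumption based on the usual Linux cpuset list format.

```python
import os

# The three variable names listed in the procedure; only one is expected to be set.
FEC_ENV_VARS = (
    "PCIDEVICE_INTEL_COM_INTEL_FEC_ACC100",
    "PCIDEVICE_INTEL_COM_INTEL_FEC_5G",
    "PCIDEVICE_INTEL_COM_INTEL_FEC_LTE",
)


def parse_cpuset(text):
    """Expand a cpuset list such as '24,25,64,65' or '24-25,64-65' into CPU numbers."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            start, end = part.split("-")
            cpus.extend(range(int(start), int(end) + 1))
        else:
            cpus.append(int(part))
    return cpus


def find_fec_device(environ):
    """Return (variable name, PCI address) for the first FEC variable that is set, else None."""
    for name in FEC_ENV_VARS:
        if environ.get(name):
            return name, environ[name]
    return None


if __name__ == "__main__":
    print(parse_cpuset("24,25,64,65"))
    print(find_fec_device(os.environ))
```

Looking up the variable this way avoids hard-coding `PCIDEVICE_INTEL_COM_INTEL_FEC_5G` when the pod requests the ACC100 or the 4G/LTE resource instead.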