×

To diagnose and resolve MetalLB configuration issues, refer to this list of commonly used commands. By using these commands, you can verify network connectivity and inspect service states to ensure efficient error recovery.

Setting the MetalLB logging levels

To manage log verbosity for the FRRouting (FRR) container, configure the logLevel specification. By adjusting this setting, you can reduce log volume from the default info level or increase detail for troubleshooting MetalLB configuration issues.

Gain a deeper insight into MetalLB by setting the logLevel to debug.

Prerequisites
  • You have access to the cluster as a user with the cluster-admin role.

  • You have installed the OpenShift CLI (oc).

Procedure
  1. Create a file, such as setdebugloglevel.yaml, with content such as the following example:

    apiVersion: metallb.io/v1beta1
    kind: MetalLB
    metadata:
      name: metallb
      namespace: metallb-system
    spec:
      logLevel: debug
      nodeSelector:
        node-role.kubernetes.io/worker: ""

    While other fields like speakerConfig can be configured here, the bgpBackend field must be omitted. It is reserved for internal use by MetalLB to manage the active BGP implementation.

  2. Apply the configuration by entering the following command:

    $ oc replace -f setdebugloglevel.yaml

    Use the oc replace command because the metallb CR was already created and you need to change only the log level.

  3. Display the names of the speaker pods:

    $ oc get -n metallb-system pods -l component=speaker
    Example output
    NAME                    READY   STATUS    RESTARTS   AGE
    speaker-2m9pm           4/4     Running   0          9m19s
    speaker-7m4qw           3/4     Running   0          19s
    speaker-szlmx           4/4     Running   0          9m19s

    Speaker and controller pods are recreated to ensure the updated logging level is applied. The logging level is modified for all the components of MetalLB.

  4. View the speaker logs:

    $ oc logs -n metallb-system speaker-7m4qw -c speaker
    Example output
    {"branch":"main","caller":"main.go:92","commit":"3d052535","goversion":"gc / go1.17.1 / amd64","level":"info","msg":"MetalLB speaker starting (commit 3d052535, branch main)","ts":"2022-05-17T09:55:05Z","version":""}
    {"caller":"announcer.go:110","event":"createARPResponder","interface":"ens4","level":"info","msg":"created ARP responder for interface","ts":"2022-05-17T09:55:05Z"}
    {"caller":"announcer.go:119","event":"createNDPResponder","interface":"ens4","level":"info","msg":"created NDP responder for interface","ts":"2022-05-17T09:55:05Z"}
    {"caller":"announcer.go:110","event":"createARPResponder","interface":"tun0","level":"info","msg":"created ARP responder for interface","ts":"2022-05-17T09:55:05Z"}
    {"caller":"announcer.go:119","event":"createNDPResponder","interface":"tun0","level":"info","msg":"created NDP responder for interface","ts":"2022-05-17T09:55:05Z"}
    I0517 09:55:06.515686      95 request.go:665] Waited for 1.026500832s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/operators.coreos.com/v1alpha1?timeout=32s
    {"Starting Manager":"(MISSING)","caller":"k8s.go:389","level":"info","ts":"2022-05-17T09:55:08Z"}
    {"caller":"speakerlist.go:310","level":"info","msg":"node event - forcing sync","node addr":"10.0.128.4","node event":"NodeJoin","node name":"ci-ln-qb8t3mb-72292-7s7rh-worker-a-vvznj","ts":"2022-05-17T09:55:08Z"}
    {"caller":"service_controller.go:113","controller":"ServiceReconciler","enqueueing":"openshift-kube-controller-manager-operator/metrics","epslice":"{\"metadata\":{\"name\":\"metrics-xtsxr\",\"generateName\":\"metrics-\",\"namespace\":\"openshift-kube-controller-manager-operator\",\"uid\":\"ac6766d7-8504-492c-9d1e-4ae8897990ad\",\"resourceVersion\":\"9041\",\"generation\":4,\"creationTimestamp\":\"2022-05-17T07:16:53Z\",\"labels\":{\"app\":\"kube-controller-manager-operator\",\"endpointslice.kubernetes.io/managed-by\":\"endpointslice-controller.k8s.io\",\"kubernetes.io/service-name\":\"metrics\"},\"annotations\":{\"endpoints.kubernetes.io/last-change-trigger-time\":\"2022-05-17T07:21:34Z\"},\"ownerReferences\":[{\"apiVersion\":\"v1\",\"kind\":\"Service\",\"name\":\"metrics\",\"uid\":\"0518eed3-6152-42be-b566-0bd00a60faf8\",\"controller\":true,\"blockOwnerDeletion\":true}],\"managedFields\":[{\"manager\":\"kube-controller-manager\",\"operation\":\"Update\",\"apiVersion\":\"discovery.k8s.io/v1\",\"time\":\"2022-05-17T07:20:02Z\",\"fieldsType\":\"FieldsV1\",\"fieldsV1\":{\"f:addressType\":{},\"f:endpoints\":{},\"f:metadata\":{\"f:annotations\":{\".\":{},\"f:endpoints.kubernetes.io/last-change-trigger-time\":{}},\"f:generateName\":{},\"f:labels\":{\".\":{},\"f:app\":{},\"f:endpointslice.kubernetes.io/managed-by\":{},\"f:kubernetes.io/service-name\":{}},\"f:ownerReferences\":{\".\":{},\"k:{\\\"uid\\\":\\\"0518eed3-6152-42be-b566-0bd00a60faf8\\\"}\":{}}},\"f:ports\":{}}}]},\"addressType\":\"IPv4\",\"endpoints\":[{\"addresses\":[\"10.129.0.7\"],\"conditions\":{\"ready\":true,\"serving\":true,\"terminating\":false},\"targetRef\":{\"kind\":\"Pod\",\"namespace\":\"openshift-kube-controller-manager-operator\",\"name\":\"kube-controller-manager-operator-6b98b89ddd-8d4nf\",\"uid\":\"dd5139b8-e41c-4946-a31b-1a629314e844\",\"resourceVersion\":\"9038\"},\"nodeName\":\"ci-ln-qb8t3mb-72292-7s7rh-master-0\",\"zone\":\"us-central1-a\"}],\"ports\":[{\"name\":\"https\",\"protocol\":\"TCP\",\"port\":8443}]}","level":"debug","ts":"2022-05-17T09:55:08Z"}
  5. List the FRR-K8s pods:

    $ oc get -n openshift-frr-k8s pods
    Example output
    NAME                                      READY   STATUS    RESTARTS   AGE
    frr-k8s-bz2dn                            7/7     Running   0          4h
    frr-k8s-statuscleaner-59cf6f5d44-9wkfr   1/1     Running   0          4h
  6. View the FRR logs by specifying the frr container in one of the frr-k8s pods:

    $ oc logs -n openshift-frr-k8s <frr_k8s_pod_name> -c frr
    Example output
    2026/03/02 09:53:09 WATCHFRR: [T83RR-8SM5G] watchfrr 8.5.3 starting: vty@0
    2026/03/02 09:53:09 WATCHFRR: [ZCJ3S-SPH5S] zebra state -> down : initial connection attempt failed
    2026/03/02 09:53:09 WATCHFRR: [ZCJ3S-SPH5S] bgpd state -> down : initial connection attempt failed
    2026/03/02 09:53:09 WATCHFRR: [ZCJ3S-SPH5S] staticd state -> down : initial connection attempt failed
    2026/03/02 09:53:09 WATCHFRR: [ZCJ3S-SPH5S] bfdd state -> down : initial connection attempt failed
    2026/03/02 09:53:09 ZEBRA: [NNACN-54BDA][EC 4043309110] Disabling MPLS support (no kernel support)
    2026/03/02 09:53:09 WATCHFRR: [VTVCM-Y2NW3] Configuration Read in Took: 00:00:00
    2026/03/02 09:53:09 WATCHFRR: [QDG3Y-BY5TN] zebra state -> up : connect succeeded
    2026/03/02 09:53:09 WATCHFRR: [QDG3Y-BY5TN] bgpd state -> up : connect succeeded
    2026/03/02 09:53:09 WATCHFRR: [QDG3Y-BY5TN] staticd state -> up : connect succeeded
    2026/03/02 09:53:09 WATCHFRR: [QDG3Y-BY5TN] bfdd state -> up : connect succeeded
    2026/03/02 09:53:09 WATCHFRR: [KWE5Q-QNGFC] all daemons up, doing startup-complete notify
    2026/03/02 09:53:09 ZEBRA: [VTVCM-Y2NW3] Configuration Read in Took: 00:00:00
    2026/03/02 09:53:09 BGP: [VTVCM-Y2NW3] Configuration Read in Took: 00:00:00

FRRouting (FRR) log levels

To control the verbosity of network logs for troubleshooting or monitoring, refer to the FRRouting (FRR) logging levels.

The following values define the severity of recorded events, so that you can use them to filter output based on operational requirements:

Table 1. Log levels
Log level Description

all

Supplies all logging information for all logging levels.

debug

Information that is diagnostically helpful to people. Set to debug to give detailed troubleshooting information.

info

Provides information that always should be logged but under normal circumstances does not require user intervention. This is the default logging level.

warn

Anything that can potentially cause inconsistent MetalLB behaviour. Usually MetalLB automatically recovers from this type of error.

error

Any unrecoverable error in MetalLB. These errors usually require administrator intervention to fix.

none

Turn off all logging.

Troubleshooting fundamental concepts and responsibility

To effectively troubleshoot MetalLB, it is essential to understand the responsibilities of each component in the system. This knowledge allows you to quickly identify the source of issues and apply targeted solutions.

Before troubleshooting, identify which component is responsible for the failure:

  • The Controller: Responsible for IP address allocation. If a Service of type: LoadBalancer is stuck in Pending, check the controller.

  • The Speaker: Responsible for "attracting" traffic by announcing the assigned IP via Layer 2 (ARP/NDP) or BGP. For BGP, the speaker creates an FRRConfiguration custom resource that FRR-K8s reads. If the Service has an IP but is unreachable, check the speakers.

  • The CNI (OVN-Kubernetes): Responsible for packet delivery once it reaches the node.

Being able to reach the LoadBalancerIP from a node inside the cluster proves the CNI is working; it does not prove MetalLB is fully operational. MetalLB makes the IP reachable from outside the cluster.

OVN-Kubernetes gateway modes and when to switch

MetalLB compatibility with OVN-Kubernetes depends on the gateway mode configuration. This section explains the differences between shared and local gateway modes and provides guidance on when to switch.

A common troubleshooting point is the gateway mode. The requirement depends on your node configuration:

  • Single Interface: MetalLB functions correctly in the default Shared Gateway mode.

  • Multiple Interfaces (Common in telco environment): MetalLB requires Local Gateway mode (routingViaHost: true). In Shared Gateway mode with multiple interfaces, traffic steering does not work correctly.

To check your current configuration, run the following command:

$ oc get network.operator cluster -o jsonpath='{.spec.defaultNetwork.ovnKubernetesConfig.gatewayConfig.routingViaHost}'

If your environment uses multiple interfaces and returns false, patch to enable local gateway mode by running the following command:

$ oc patch network.operator cluster --type=merge -p '{"spec":{"defaultNetwork":{"ovnKubernetesConfig":{"gatewayConfig":{"routingViaHost": true}}}}}'

+

Applying this patch causes the OVN-Kubernetes pods to restart. Nodes do not reboot.

In BGP mode with multiple interfaces, IPv4 forwarding must be enabled on the secondary interface that receives external traffic.

+

This is only required for IPv4 traffic. IPv6 forwarding is always enabled.

The following describes the packet flow:

  1. External IPv4 traffic destined for the VIP assigned by MetalLB arrives on the secondary interface.

  2. The secondary interface must have IPv4 forwarding enabled for example, net.ipv4.conf.eth2.forwarding=1 so that the packet enters Open vSwitch (OVS).

  3. OVN-Kubernetes has rules configured to deliver the packet to the endpoint pod.

  4. The pod sends its reply. Because routingViaHost is true, the return packets pass through the kernel networking stack.

  5. The return traffic is sent back through the secondary interface only if the routing table routes the client IP address (the <source_ip> from the original packet) to the correct interface. You can verify this by running ip route get <source_ip> and confirming that the output shows the secondary interface. For more information about configuring symmetric return traffic, see Managing symmetric routing with MetalLB.

Configure per-interface forwarding by using one of the following methods:

  • For OKD 4.22 and later, use an NMState configuration to enable forwarding on the secondary interface. This is the recommended method.

  • For versions earlier than OKD 4.22, use a MachineConfig object to enable forwarding on the secondary interface.

    Enabling global IPv4 forwarding (net.ipv4.ip_forward=1) resolves the forwarding requirement but is not recommended because it enables forwarding on all interfaces rather than only the secondary interface that receives external traffic. OVN-Kubernetes does not install default DROP firewall rules on interfaces.

Troubleshooting BGP issues

To diagnose and resolve BGP configuration issues, run commands directly within the FRR container. By accessing the container, you can verify routing states and identify connectivity errors.

Prerequisites
  • You have access to the cluster as a user with the cluster-admin role.

  • You have installed the OpenShift CLI (oc).

Procedure
  1. In OKD 4.17 and later, MetalLB uses the FRR-K8s daemon for BGP. The Cluster Network Operator (CNO) deploys the FRR-K8s daemon set in the openshift-frr-k8s namespace, and the speaker pod no longer contains an FRR container. If the FRR-K8s daemon set is unavailable or its pods are unhealthy, troubleshoot the CNO deployment rather than the MetalLB Operator. Display the names of the frr-k8s pods by running the following command:

    $ oc get pods -n openshift-frr-k8s
    Example output
    NAME                                     READY   STATUS    RESTARTS   AGE
    frr-k8s-bz2dn                            7/7     Running   0          15m
    frr-k8s-statuscleaner-59cf6f5d44-9wkfr   1/1     Running   0          15m
  2. Display the running configuration for FRR by running the following command:

    $ oc exec -n openshift-frr-k8s <frr-k8s-pod> -c frr -- vtysh -c "show running-config"
    Example output
    Building configuration...
    
    Current configuration:
    !
    frr version 8.5.3
    frr defaults traditional
    hostname mysno-sno.demo.lab
    log file /etc/frr/frr.log informational
    log timestamp precision 3
    no ip forwarding
    no ipv6 forwarding
    service integrated-vtysh-config
    !
    router bgp 64501
     no bgp ebgp-requires-policy
     no bgp default ipv4-unicast
     bgp graceful-restart preserve-fw-state
     no bgp network import-check
     neighbor 192.168.122.12 remote-as 64500
     !
     address-family ipv4 unicast
      network 192.168.122.210/32
      neighbor 192.168.122.12 activate
      neighbor 192.168.122.12 route-map 192.168.122.12-in in
      neighbor 192.168.122.12 route-map 192.168.122.12-out out
     exit-address-family
    exit
    !
    ip prefix-list 192.168.122.12-inpl-ipv4 seq 1 deny any
    ip prefix-list 192.168.122.12-allowed-ipv4 seq 1 permit 192.168.122.210/32
    !
    ipv6 prefix-list 192.168.122.12-allowed-ipv6 seq 1 deny any
    ipv6 prefix-list 192.168.122.12-inpl-ipv4 seq 2 deny any
    !
    route-map 192.168.122.12-out permit 1
     match ip address prefix-list 192.168.122.12-allowed-ipv4
    exit
    !
    route-map 192.168.122.12-out permit 2
     match ipv6 address prefix-list 192.168.122.12-allowed-ipv6
    exit
    !
    route-map 192.168.122.12-in permit 3
     match ip address prefix-list 192.168.122.12-inpl-ipv4
    exit
    !
    route-map 192.168.122.12-in permit 4
     match ipv6 address prefix-list 192.168.122.12-inpl-ipv4
    exit
    !
    ip nht resolve-via-default
    !
    ipv6 nht resolve-via-default
    !
    end

    where:

    router bgp 64501

    This is the local Autonomous System Number (ASN) for your MetalLB speakers.

    neighbor 192.168.122.12 remote-as 64500

    This identifies the external BGP Peer. Specifies that a neighbor <ip-address> remote-as <peer-ASN> line exists for each BGP peer custom resource that you added. The local ASN is 64501. The remote ASN is 64500.

    network 192.168.122.210/32

    This is a specific LoadBalancer IP from your IPAddressPool. It is being advertised as a /32 (a single host route), which is standard for MetalLB.

    neighbor 192.168.122.12 activate

    Enables the exchange of IPv4 routing information with that specific neighbor.

    route-map …​ in/out

    These are "filters" or "policies." They ensure that the speaker only sends the IP addresses you have authorized and does not accidentally learn and install internal routes from your physical router.

    ip prefix-list …​ permit 192.168.122.210/32

    This creates an allowlist. Only this specific IP is permitted to be advertised.

    route-map 192.168.122.12-out permit 1

    This tells the router: "If the IP matches the prefix-list above, permit it to be sent out to the neighbor."

    ip nht resolve-via-default

    (Next hop tracking) This is a common setting in MetalLB/FRR to ensure that the BGP next-hop can be resolved using the default route if a more specific route isn’t available.

    If BFD is enabled for the BGP peer, the output includes additional bfd lines under the neighbor configuration and a bfd section at the end. The following example shows the relevant differences:

    Example output with BFD enabled
    router bgp 64501
     ...
     neighbor 192.168.122.12 remote-as 64500
     neighbor 192.168.122.12 bfd
     neighbor 192.168.122.12 bfd profile bfd-profile
     ...
    exit
    !
    ...
    !
    bfd
     profile bfd-profile
      minimum-ttl 1
     exit
     !
    exit
    !
    end
  3. Display the BGP summary by running the following command:

    $ oc exec -n openshift-frr-k8s <frr-k8s-pod> -c frr -- vtysh -c "show bgp summary"
    Example output
    IPv4 Unicast Summary (VRF default):
    BGP router identifier 192.168.122.12, local AS number 64501 vrf-id 0
    BGP table version 1
    RIB entries 1, using 192 bytes of memory
    Peers 1, using 725 KiB of memory
    
    Neighbor        V         AS   MsgRcvd   MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd   PfxSnt Desc
    192.168.122.12  4      64500        37        38        0    0    0 00:32:12            0        1 N/A
    
    Total number of neighbors 1

    where:

    BGP router identifier 192.168.122.12

    The BGP router identifier for the node, which is typically the IP address of the primary network interface.

    local AS number 64501

    This is the ASN you assigned to MetalLB in your BGPPeer or MetalLB CR.

    Neighbor

    The IP 192.168.122.12 of your external router (the "Peer").

    AS

    The ASN of the external router (the "Peer"), which should match the remote-as value in your BGP configuration.

    Up/Down

    The shows the session has been stable for 32 minutes.

    State/PfxRcd

    This shows the session is established since it is a number, but you have received 0 routes from the peer. Seeing 0 prefixes received is perfectly normal for a standard MetalLB deployment.

    PfxSnt

    This shows that you have successfully advertised one route the LoadBalancer IP to the peer. This confirms MetalLB is doing its job. It has taken one LoadBalancer service IP and successfully told the external router: "If you want to reach this IP, send the traffic to me."

    In the output, the State is a number 0, which means the connection is successful. If the connection were broken, you would see text such as Active, Connect, or Idle here.

  4. Display the BGP peers that received an address pool by running the following command:

    $ oc exec -n openshift-frr-k8s <frr-k8s-pod> -c frr -- vtysh -c "show bgp ipv4 unicast"

    Replace ipv4 with ipv6 to display the BGP peers that received an IPv6 address pool.

    Example output
    BGP table version is 1, local router ID is 192.168.122.12, vrf id 0
    Default local pref 100, local AS 64501
    Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
                   i internal, r RIB-failure, S Stale, R Removed
    Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
    Origin codes:  i - IGP, e - EGP, ? - incomplete
    RPKI validation codes: V valid, I invalid, N Not found
    
        Network          Next Hop            Metric LocPrf Weight Path
     *> 192.168.122.210/32
                        0.0.0.0                  0         32768 i
    
    Displayed  1 routes and 1 total paths

    where:

    local router ID 192.168.122.12

    The unique identifier for this specific OpenShift node in the BGP topology.

    local AS 64501

    The Private Autonomous System Number (ASN) assigned to your MetalLB deployment.

    *> (Status Codes)

    * indicates the route is valid. > indicates the route is selected as the best path for advertisement. MetalLB only sends best paths to peers. If > is missing, the peer will not receive the route.

    192.168.122.210/32 (Network)

    The specific IP assigned to the LoadBalancer service. The /32 mask indicates a host route, ensuring traffic for this specific IP is attracted to this node.

    0.0.0.0 (Next Hop)

    Indicates the route is local to the node. 0.0.0.0 means the current node is the egress point for this service.

    0 (Metric)

    The Multi-Exit Discriminator (MED). Used to suggest a preferred path to external neighbors. Default is 0.

    100 (LocPrf)

    Local Preference. Used to prefer an exit point for the entire AS. Default is 100.

    32768 (Weight)

    An internal FRR priority value. Locally injected routes default to 32768, ensuring the node prefers its own local service path over learned routes.

    i (Path)

    The Origin code. i (IGP) signifies the route was originated internally through the MetalLB speaker injecting the IPAddressPool into the FRR stack.

    If the table is empty, check if the Service has an IP assigned using the oc get svc command. Verify that a BGPAdvertisement exists and that its nodeSelector or labelSelector matches the service and the node you are running the command on. If the Next Hop is not 0.0.0.0, the node might be trying to forward the traffic elsewhere before it even reaches the pods, which can indicate a complex BGP peering issue or an issue with the node underlying routing table.

Verification

To confirm that BGP is functioning correctly, verify that all of the following conditions are met:

  • The show running-config output contains a router bgp section with the correct local ASN and at least one neighbor entry with the expected peer IP and remote ASN.

  • The show bgp summary output shows the BGP session as Established. The State/PfxRcd column displays a number, such as 0, rather than a state name such as Active, Connect, or Idle.

  • The PfxSnt column in the BGP summary shows at least 1, which confirms that MetalLB is advertising a LoadBalancer IP to the peer.

  • The show bgp ipv4 unicast output contains at least one route with the *> status code, which indicates that the route is valid and selected as the best path for advertisement.

  • If BFD is configured, the show running-config output includes neighbor <ip_address> bfd lines and a bfd profile section.

Troubleshooting BFD issues

To diagnose and resolve Bidirectional Forwarding Detection (BFD) issues, run commands directly within the FRRouting (FRR) container. By accessing the container, you can verify that BFD peers are correctly configured with established BGP sessions.

The BFD implementation that Red Hat supports uses FRRouting (FRR) in a container that exists in an frr-k8s pod in the openshift-frr-k8s namespace.

Prerequisites
  • You have access to the cluster as a user with the cluster-admin role.

  • You have installed the OpenShift CLI (oc).

Procedure
  1. Display the names of the FRR-K8s pods by running the following command:

    $ oc get pods -n openshift-frr-k8s -l app=frr-k8s
    Example output
    NAME            READY   STATUS    RESTARTS   AGE
    frr-k8s-bz2dn   7/7     Running   0          106m
  2. Run the following command against the frr container in the FRR-K8s pod to display the BFD peers:

    $ oc exec -n openshift-frr-k8s <frr_k8s_pod_name> -c frr -- vtysh -c "show bfd peers brief"
    Example output
    Session count: 1
    SessionId  LocalAddress              PeerAddress              Status
    =========  ============              ===========              ======
    3909139637 10.0.1.2                  10.0.2.3                 up

    where:

    up

    Specifies that the PeerAddress column includes each BFD peer. If the output does not list a BFD peer IP address that you expected the output to include, troubleshoot BGP connectivity with the peer. If the status field indicates down, check for connectivity on the links and equipment between the node and the peer. You can determine the node name for the FRR-K8s pod with a command such as oc get pods -n openshift-frr-k8s <frr_k8s_pod_name> -o jsonpath='{.spec.nodeName}'.

  3. Optional: To display detailed BFD peer information, run the following command:

    $ oc exec -n openshift-frr-k8s <frr_k8s_pod_name> -c frr -- vtysh -c "show bfd peers"
    Example output
    BFD Peers:
    	peer 10.0.2.3 local-address 10.0.1.2 vrf default interface br-ex
    		ID: 3909139637
    		Remote ID: 2819913327
    		Active mode
    		Status: up
    		Uptime: 1 hour(s), 12 minute(s), 30 second(s)
    		Diagnostics: ok
    		Remote diagnostics: ok
    		Peer Type: dynamic
    		RTT min/avg/max: 301/512/4191 usec
    		Local timers:
    			Detect-multiplier: 3
    			Receive interval: 300ms
    			Transmission interval: 300ms
    			Echo receive interval: 50ms
    			Echo transmission interval: disabled
    		Remote timers:
    			Detect-multiplier: 3
    			Receive interval: 300ms
    			Transmission interval: 300ms
    			Echo receive interval: disabled

    where:

    Status

    The current state of the BFD session. A value of up indicates that the session is established and the link is healthy. A value of down indicates that the session has failed.

    Uptime or Downtime

    The duration that the session has been in its current state.

    Remote ID

    The session ID assigned by the remote peer. A value of 0 indicates that the remote peer has not responded.

    Diagnostics and Remote diagnostics

    The diagnostic codes for the local and remote peers. A value of ok indicates no errors.

    Detect-multiplier

    The number of missed packets before the session is declared down. With a receive interval of 300ms and a detect-multiplier of 3, the session is declared down after 900ms of missed packets.

Verification

To confirm that BFD is functioning correctly, verify that all of the following conditions are met:

  • The show bfd peers brief output lists each expected BFD peer with a Status of up.

  • The show bfd peers detailed output shows a nonzero Remote ID, which confirms that the remote peer has responded.

  • The Diagnostics and Remote diagnostics fields both display ok.

  • If a session shows down, check the Downtime duration and Remote diagnostics field for error codes, and verify network connectivity between the node and the peer.

MetalLB metrics for BGP and BFD

To monitor network connectivity and diagnose routing states, refer to the Prometheus metrics for MetalLB. These metrics provide visibility into the status of BGP peers and BFD profiles so that you can ensure stable external communication.

Table 2. MetalLB BFD metrics
Name Description

frrk8s_bfd_control_packet_input

Counts the number of BFD control packets received from each BFD peer.

frrk8s_bfd_control_packet_output

Counts the number of BFD control packets sent to each BFD peer.

frrk8s_bfd_echo_packet_input

Counts the number of BFD echo packets received from each BFD peer.

frrk8s_bfd_echo_packet_output

Counts the number of BFD echo packets sent to each BFD.

frrk8s_bfd_session_down_events

Counts the number of times the BFD session with a peer entered the down state.

frrk8s_bfd_session_up

Indicates the connection state with a BFD peer. 1 indicates the session is up and 0 indicates the session is down.

frrk8s_bfd_session_up_events

Counts the number of times the BFD session with a peer entered the up state.

frrk8s_bfd_zebra_notifications

Counts the number of BFD Zebra notifications for each BFD peer.

Table 3. MetalLB BGP metrics
Name Description

frrk8s_bgp_announced_prefixes_total

Counts the number of load balancer IP address prefixes that are advertised to BGP peers. The terms prefix and aggregated route have the same meaning.

frrk8s_bgp_session_up

Indicates the connection state with a BGP peer. 1 indicates the session is up and 0 indicates the session is down.

frrk8s_bgp_updates_total

Counts the number of BGP update messages sent to each BGP peer.

frrk8s_bgp_opens_sent

Counts the number of BGP open messages sent to each BGP peer.

frrk8s_bgp_opens_received

Counts the number of BGP open messages received from each BGP peer.

frrk8s_bgp_notifications_sent

Counts the number of BGP notification messages sent to each BGP peer.

frrk8s_bgp_updates_total_received

Counts the number of BGP update messages received from each BGP peer.

frrk8s_bgp_keepalives_sent

Counts the number of BGP keepalive messages sent to each BGP peer.

frrk8s_bgp_keepalives_received

Counts the number of BGP keepalive messages received from each BGP peer.

frrk8s_bgp_route_refresh_sent

Counts the number of BGP route refresh messages sent to each BGP peer.

frrk8s_bgp_total_sent

Counts the number of total BGP messages sent to each BGP peer.

frrk8s_bgp_total_received

Counts the number of total BGP messages received from each BGP peer.

frrk8s_bgp_received_prefixes_total

Counts the number of load balancer IP address prefixes received from each BGP peer.

About collecting MetalLB data

To collect diagnostic data for debugging or support analysis, run the oc adm must-gather CLI command. This utility captures essential information regarding the cluster, the MetalLB configuration, and the MetalLB Operator state.

The following list details features and objects related to MetalLB and the MetalLB Operator:

  • The namespace and child objects where you deploy the MetalLB Operator

  • All MetalLB Operator custom resource definitions (CRDs)

The command collects the following information from FRRouting (FRR), which Red Hat uses to implement BGP and BFD:

  • /etc/frr/frr.conf

  • /etc/frr/frr.log

  • /etc/frr/daemons configuration file

  • /etc/frr/vtysh.conf

The command collects log and configuration files from the frr container that exists in each frr-k8s pod in the openshift-frr-k8s namespace. Additionally, the command collects the output from the following vtysh commands:

  • show running-config

  • show bgp ipv4

  • show bgp ipv6

  • show bgp neighbor

  • show bfd peer

No additional configuration is required when you run the command.