pod_network_name_info{interface="net0",namespace="namespacename",network_name="nadnamespace/firstNAD",pod="podname"} 0
Administrators can use the pod_network_name_info metric to classify and monitor secondary network interfaces. The metric does this by adding a label that identifies the interface type, typically based on the associated NetworkAttachmentDefinition resource.
Secondary devices, or interfaces, are used for different purposes. Metrics from secondary network interfaces need to be classified to allow for effective aggregation and monitoring.
Exposed metrics contain the interface name but do not specify where the interface originates. This is workable when there are no additional interfaces. However, relying on interface names alone becomes problematic when secondary interfaces are added, because it is difficult to identify their purpose and use their metrics effectively.
The names of secondary interfaces depend on the order in which the interfaces are added. Secondary interfaces can belong to distinct networks, each of which can serve a different purpose.
With pod_network_name_info, it is possible to extend the current metrics with additional information that identifies the interface type. In this way, it is possible to aggregate the metrics and to add specific alarms for specific interface types.
The network type is generated from the name of the NetworkAttachmentDefinition resource, which distinguishes different secondary network classes. For example, different interfaces belonging to different networks or using different CNIs use different network attachment definition names.
The Network Metrics Daemon is a daemon component that collects and publishes network-related metrics.
The kubelet already publishes network-related metrics that you can observe. These metrics are:
container_network_receive_bytes_total
container_network_receive_errors_total
container_network_receive_packets_total
container_network_receive_packets_dropped_total
container_network_transmit_bytes_total
container_network_transmit_errors_total
container_network_transmit_packets_total
container_network_transmit_packets_dropped_total
The labels in these metrics contain, among others:
Pod name
Pod namespace
Interface name (such as eth0)
These metrics work well until new interfaces are added to the pod, for example via Multus, because it is then no longer clear what the interface names refer to.
The interface label gives the interface name, but not what that interface is meant for. With many different interfaces, it becomes impossible to tell which network the metrics you are monitoring refer to.
This is addressed by introducing the new pod_network_name_info metric, described in the following section.
The Network Metrics daemonset publishes a pod_network_name_info gauge metric, with a fixed value of 0.
pod_network_name_info
pod_network_name_info{interface="net0",namespace="namespacename",network_name="nadnamespace/firstNAD",pod="podname"} 0
The network name label is produced using the annotation added by Multus. It is the concatenation of the namespace the network attachment definition belongs to, plus the name of the network attachment definition.
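As a rough illustration of how such a sample could be derived, the following Python sketch renders pod_network_name_info lines from a network-status annotation value. It is not the daemon's actual implementation. The annotation layout (a JSON list whose entries carry "name" and "interface" fields), the secondNAD entry, and the helper name info_lines are assumptions made for this example.

```python
import json

# Illustrative annotation value. The field layout ("name", "interface") is an
# assumption about the Multus network-status format, and secondNAD is made up.
NETWORK_STATUS = """
[
  {"name": "nadnamespace/firstNAD", "interface": "net0"},
  {"name": "nadnamespace/secondNAD", "interface": "net1"}
]
"""


def info_lines(pod, namespace, network_status_json):
    """Render one pod_network_name_info sample per annotated interface."""
    lines = []
    for entry in json.loads(network_status_json):
        labels = {
            "interface": entry["interface"],
            "namespace": namespace,
            "network_name": entry["name"],
            "pod": pod,
        }
        rendered = ",".join(f'{key}="{value}"' for key, value in labels.items())
        # The gauge value is always 0; only the labels carry information.
        lines.append(f"pod_network_name_info{{{rendered}}} 0")
    return lines


for line in info_lines("podname", "namespacename", NETWORK_STATUS):
    print(line)
```

The first line printed has the same shape as the sample shown above, with one further line for each additional interface listed in the annotation.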
The new metric alone does not provide much value, but combined with the network-related container_network_* metrics, it offers better support for monitoring secondary networks.
Using PromQL queries such as the following, it is possible to get a new metric containing the value and the network name retrieved from the k8s.v1.cni.cncf.io/network-status annotation:
(container_network_receive_bytes_total) + on(namespace,pod,interface) group_left(network_name) ( pod_network_name_info )
(container_network_receive_errors_total) + on(namespace,pod,interface) group_left(network_name) ( pod_network_name_info )
(container_network_receive_packets_total) + on(namespace,pod,interface) group_left(network_name) ( pod_network_name_info )
(container_network_receive_packets_dropped_total) + on(namespace,pod,interface) group_left(network_name) ( pod_network_name_info )
(container_network_transmit_bytes_total) + on(namespace,pod,interface) group_left(network_name) ( pod_network_name_info )
(container_network_transmit_errors_total) + on(namespace,pod,interface) group_left(network_name) ( pod_network_name_info )
(container_network_transmit_packets_total) + on(namespace,pod,interface) group_left(network_name) ( pod_network_name_info )
(container_network_transmit_packets_dropped_total) + on(namespace,pod,interface) group_left(network_name) ( pod_network_name_info )
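To make the effect of these queries more concrete, the following Python sketch mimics the join on a few in-memory samples. It is a simplified model, not how Prometheus evaluates the expression. The sample values and the helper names join_network_name and sum_by_network are hypothetical. Because the info metric always has the value 0, the addition leaves the traffic value unchanged and only copies the network_name label onto it.

```python
from collections import defaultdict

# Hypothetical samples as (labels, value) pairs.
traffic = [
    ({"namespace": "namespacename", "pod": "podname", "interface": "eth0"}, 1000.0),
    ({"namespace": "namespacename", "pod": "podname", "interface": "net0"}, 250.0),
]
info = [
    ({"namespace": "namespacename", "pod": "podname", "interface": "net0",
      "network_name": "nadnamespace/firstNAD"}, 0.0),
]

JOIN_KEYS = ("namespace", "pod", "interface")


def join_network_name(left, right):
    """Mimic `left + on(namespace,pod,interface) group_left(network_name) right`."""
    by_key = {tuple(labels[k] for k in JOIN_KEYS): (labels, value)
              for labels, value in right}
    joined = []
    for labels, value in left:
        match = by_key.get(tuple(labels[k] for k in JOIN_KEYS))
        if match is None:
            continue  # a left-hand sample without a matching info sample is dropped
        info_labels, info_value = match
        enriched = {**labels, "network_name": info_labels["network_name"]}
        joined.append((enriched, value + info_value))  # info_value is 0
    return joined


def sum_by_network(samples):
    """Aggregate enriched samples per network_name."""
    totals = defaultdict(float)
    for labels, value in samples:
        totals[labels["network_name"]] += value
    return dict(totals)


enriched = join_network_name(traffic, info)
print(sum_by_network(enriched))
```

In this made-up data set, eth0 has no info sample, so only the net0 series survives the join and is attributed to nadnamespace/firstNAD.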