In Linux, sysctl allows an administrator to modify kernel parameters at runtime. You can modify interface-level network sysctls using the tuning Container Network Interface (CNI) meta plugin. The tuning CNI meta plugin operates in a chain with a main CNI plugin as illustrated.
The main CNI plugin assigns the interface and passes this interface to the tuning CNI meta plugin at runtime. You can change some sysctls and several interface attributes such as promiscuous mode, all-multicast mode, MTU, and MAC address in the network namespace by using the tuning CNI meta plugin.
The following procedure configures the tuning CNI to change the interface-level network net.ipv4.conf.IFNAME.accept_redirects
sysctl. This example enables accepting and sending ICMP-redirected packets. In the tuning CNI meta plugin configuration, the interface name is represented by the IFNAME
token and is replaced with the actual name of the interface at runtime.
Create a network attachment definition, such as tuning-example.yaml
, with the following content:
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
name: <name> (1)
namespace: default (2)
spec:
config: '{
"cniVersion": "0.4.0", (3)
"name": "<name>", (4)
"plugins": [{
"type": "<main_CNI_plugin>" (5)
},
{
"type": "tuning", (6)
"sysctl": {
"net.ipv4.conf.IFNAME.accept_redirects": "1" (7)
}
}
]
}
1 | Specifies the name for the additional network attachment to create. The name must be unique within the specified namespace. |
2 | Specifies the namespace that the object is associated with. |
3 | Specifies the CNI specification version. |
4 | Specifies the name for the configuration. It is recommended to match the configuration name to the name value of the network attachment definition. |
5 | Specifies the name of the main CNI plugin to configure. |
6 | Specifies the name of the CNI meta plugin. |
7 | Specifies the sysctl to set. The interface name is represented by the IFNAME token and is replaced with the actual name of the interface at runtime. |
An example YAML file is shown here:
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
name: tuningnad
namespace: default
spec:
config: '{
"cniVersion": "0.4.0",
"name": "tuningnad",
"plugins": [{
"type": "bridge"
},
{
"type": "tuning",
"sysctl": {
"net.ipv4.conf.IFNAME.accept_redirects": "1"
}
}
]
}'
Apply the YAML by running the following command:
$ oc apply -f tuning-example.yaml
networkattachmentdefinition.k8.cni.cncf.io/tuningnad created
Create a pod such as examplepod.yaml
with the network attachment definition similar to the following:
apiVersion: v1
kind: Pod
metadata:
name: tunepod
namespace: default
annotations:
k8s.v1.cni.cncf.io/networks: tuningnad (1)
spec:
containers:
- name: podexample
image: centos
command: ["/bin/bash", "-c", "sleep INF"]
securityContext:
runAsUser: 2000 (2)
runAsGroup: 3000 (3)
allowPrivilegeEscalation: false (4)
capabilities: (5)
drop: ["ALL"]
securityContext:
runAsNonRoot: true (6)
seccompProfile: (7)
type: RuntimeDefault
1 | Specify the name of the configured NetworkAttachmentDefinition . |
2 | runAsUser controls which user ID the container is run with. |
3 | runAsGroup controls which primary group ID the containers is run with. |
4 | allowPrivilegeEscalation determines if a pod can request to allow privilege escalation. If unspecified, it defaults to true. This boolean directly controls whether the no_new_privs flag gets set on the container process. |
5 | capabilities permit privileged actions without giving full root access. This policy ensures all capabilities are dropped from the pod. |
6 | runAsNonRoot: true requires that the container will run with a user with any UID other than 0. |
7 | RuntimeDefault enables the default seccomp profile for a pod or container workload. |
Apply the yaml by running the following command:
$ oc apply -f examplepod.yaml
Verify that the pod is created by running the following command:
$ oc get pod
NAME READY STATUS RESTARTS AGE
tunepod 1/1 Running 0 47s
Log in to the pod by running the following command:
$ oc rsh tunepod
Verify the values of the configured sysctl flags. For example, find the value net.ipv4.conf.net1.accept_redirects
by running the following command:
sh-4.4# sysctl net.ipv4.conf.net1.accept_redirects
net.ipv4.conf.net1.accept_redirects = 1
You can enable all-multicast mode by using the tuning Container Network Interface (CNI) meta plugin.
The following procedure describes how to configure the tuning CNI to enable the all-multicast mode:
Create a network attachment definition, such as tuning-example.yaml
, with the following content:
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
name: <name> (1)
namespace: default (2)
spec:
config: '{
"cniVersion": "0.4.0", (3)
"name": "<name>", (4)
"plugins": [{
"type": "<main_CNI_plugin>" (5)
},
{
"type": "tuning", (6)
"allmulti": true (7)
}
}
]
}
1 | Specifies the name for the additional network attachment to create. The name must be unique within the specified namespace. |
2 | Specifies the namespace that the object is associated with. |
3 | Specifies the CNI specification version. |
4 | Specifies the name for the configuration. Match the configuration name to the name value of the network attachment definition. |
5 | Specifies the name of the main CNI plugin to configure. |
6 | Specifies the name of the CNI meta plugin. |
7 | Changes the all-multicast mode of interface. If enabled, all multicast packets on the network will be received by the interface. |
An example YAML file is shown here:
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
name: setallmulti
namespace: default
spec:
config: '{
"cniVersion": "0.4.0",
"name": "setallmulti",
"plugins": [
{
"type": "bridge"
},
{
"type": "tuning",
"allmulti": true
}
]
}'
Apply the settings specified in the YAML file by running the following command:
$ oc apply -f tuning-allmulti.yaml
networkattachmentdefinition.k8s.cni.cncf.io/setallmulti created
Create a pod with a network attachment definition similar to that specified in the following examplepod.yaml
sample file:
apiVersion: v1
kind: Pod
metadata:
name: allmultipod
namespace: default
annotations:
k8s.v1.cni.cncf.io/networks: setallmulti (1)
spec:
containers:
- name: podexample
image: centos
command: ["/bin/bash", "-c", "sleep INF"]
securityContext:
runAsUser: 2000 (2)
runAsGroup: 3000 (3)
allowPrivilegeEscalation: false (4)
capabilities: (5)
drop: ["ALL"]
securityContext:
runAsNonRoot: true (6)
seccompProfile: (7)
type: RuntimeDefault
1 | Specifies the name of the configured NetworkAttachmentDefinition . |
2 | Specifies the user ID the container is run with. |
3 | Specifies which primary group ID the containers is run with. |
4 | Specifies if a pod can request privilege escalation. If unspecified, it defaults to true . This boolean directly controls whether the no_new_privs flag gets set on the container process. |
5 | Specifies the container capabilities. The drop: ["ALL"] statement indicates that all Linux capabilities are dropped from the pod, providing a more restrictive security profile. |
6 | Specifies that the container will run with a user with any UID other than 0 . |
7 | Specifies the container’s seccomp profile. In this case, the type is set to RuntimeDefault . Seccomp is a Linux kernel feature that restricts the system calls available to a process, enhancing security by minimizing the attack surface. |
Apply the settings specified in the YAML file by running the following command:
$ oc apply -f examplepod.yaml
Verify that the pod is created by running the following command:
$ oc get pod
NAME READY STATUS RESTARTS AGE
allmultipod 1/1 Running 0 23s
Log in to the pod by running the following command:
$ oc rsh allmultipod
List all the interfaces associated with the pod by running the following command:
sh-4.4# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0@if22: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8901 qdisc noqueue state UP mode DEFAULT group default
link/ether 0a:58:0a:83:00:10 brd ff:ff:ff:ff:ff:ff link-netnsid 0 (1)
3: net1@if24: <BROADCAST,MULTICAST,ALLMULTI,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
link/ether ee:9b:66:a4:ec:1d brd ff:ff:ff:ff:ff:ff link-netnsid 0 (2)
1 | eth0@if22 is the primary interface. |
2 | net1@if24 is the secondary interface configured with the network-attachment-definition that supports the all-multicast mode (ALLMULTI flag). |