Follow the guidance in Red Hat best practices for Kubernetes when developing cloud-native network functions (CNFs) to ensure that the cluster can schedule pods during an update.
Always deploy pods in groups by using Deployment resources.
You can allow the CNF workload to run uninterrupted during an update by setting a pod disruption budget, which configures the minimum number of pods in a deployment that must remain available. You set the budget in a PodDisruptionBudget custom resource (CR) that you apply to the cluster.
Be careful when setting this value; setting it improperly can cause an update to fail.
For example, if you have 4 pods in a deployment and you set the pod disruption budget to 4, the cluster scheduler keeps 4 pods running at all times; no pod can be evicted, so the nodes that host them cannot be drained and the update stalls.
Instead, set the pod disruption budget to 2, which allows 2 of the 4 pods to be down at a time. The worker nodes where those pods are located can then be drained and rebooted.
Setting the pod disruption budget to 2 does not mean that your deployment runs on only 2 pods for a period of time, for example, during an update. The cluster scheduler creates 2 new pods to replace the 2 older pods. However, there is a short period of time between the new pods coming online and the old pods being deleted.
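For example, a PodDisruptionBudget CR for the 4-pod scenario above might look like the following sketch. The name and labels are placeholder values; the selector must match the labels on your deployment's pods.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: cnf-workload-pdb        # placeholder name
spec:
  minAvailable: 2               # keep at least 2 of the 4 pods running;
                                # the other 2 can be evicted during a node drain
  selector:
    matchLabels:
      app: cnf-workload         # placeholder; must match the pod labels in your deployment
```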
High availability in Kubernetes requires duplicate processes to be running on separate nodes in the cluster.
This ensures that the application continues to run even if one node becomes unavailable.
In OKD, processes can be automatically duplicated in separate pods in a deployment.
You configure anti-affinity in the Pod spec to ensure that the pods in a deployment do not run on the same cluster node.
During an update, setting pod anti-affinity ensures that pods are distributed evenly across nodes in the cluster, which allows node drains and reboots to complete faster. For example, if 4 pods from a single deployment run on one node, and the pod disruption budget allows only 1 pod to be deleted at a time, it takes 4 times as long to drain and reboot that node because the pods can only be evicted one at a time. Setting pod anti-affinity spreads the pods across the cluster to prevent such occurrences.
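The following is a minimal sketch of a Deployment that uses required pod anti-affinity to spread its 4 replicas across separate nodes; the names, labels, and image are placeholder values:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cnf-workload            # placeholder name
spec:
  replicas: 4
  selector:
    matchLabels:
      app: cnf-workload
  template:
    metadata:
      labels:
        app: cnf-workload
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: cnf-workload
            topologyKey: kubernetes.io/hostname   # schedule at most one of these pods per node
      containers:
      - name: cnf-workload
        image: registry.example.com/cnf-workload:latest   # placeholder image
```

Note that required anti-affinity leaves pods unschedulable if there are fewer nodes than replicas; preferredDuringSchedulingIgnoredDuringExecution is a softer alternative if that is a concern.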
You can use liveness, readiness, and startup probes to check the health of your live application containers before you schedule an update. These probes are especially useful with pods that depend on keeping state for their application containers.
Liveness probe: Determines if a container is running. If the liveness probe fails for a container, the pod responds based on the restart policy.

Readiness probe: Determines if a container is ready to accept service requests. If the readiness probe fails for a container, the kubelet removes the container from the list of available service endpoints.

Startup probe: Indicates whether the application within a container is started. All other probes are disabled until the startup probe succeeds. If the startup probe does not succeed, the kubelet kills the container, and the container is subject to the pod restartPolicy setting.
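The following container spec sketch shows how the three probes might be configured together. The endpoint paths, port, and timing values are illustrative assumptions, not recommendations; tune them to your application's startup and response characteristics.

```yaml
containers:
- name: cnf-workload
  image: registry.example.com/cnf-workload:latest   # placeholder image
  startupProbe:                 # all other probes are disabled until this succeeds
    httpGet:
      path: /healthz            # placeholder endpoint
      port: 8080
    failureThreshold: 30        # allow up to 30 x 10s = 300s for a slow start
    periodSeconds: 10
  livenessProbe:                # on failure, the container is restarted per restartPolicy
    httpGet:
      path: /healthz
      port: 8080
    periodSeconds: 10
  readinessProbe:               # on failure, the pod is removed from service endpoints
    httpGet:
      path: /ready              # placeholder endpoint
      port: 8080
    periodSeconds: 5
```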