A pod restart policy determines how OKD responds when Containers in that pod exit.
The policy applies to all Containers in that pod.
-
Always - Tries restarting a successfully exited Container on the pod continuously, with an exponential back-off delay (10s, 20s, 40s) capped at 5 minutes. The default is Always.
-
OnFailure - Tries restarting a failed Container on the pod with an exponential back-off delay (10s, 20s, 40s) capped at 5 minutes.
-
Never - Does not try to restart exited or failed Containers on the pod. Pods immediately fail and exit.
After the pod is bound to a node, the pod will never be bound to another node. This means that a controller is necessary in order for a pod to survive node failure:
| Condition |
Controller Type |
Restart Policy |
Pods that are expected to terminate (such as batch computations) |
Job |
OnFailure or Never
|
Pods that are expected to not terminate (such as web servers) |
Replication controller |
Always.
|
Pods that must run one-per-machine |
Daemon set |
Any |
If a Container on a pod fails and the restart policy is set to OnFailure, the pod stays on the node and the Container is restarted. If you do not want the Container to
restart, use a restart policy of Never.
If an entire pod fails, OKD starts a new pod. Developers must address the possibility that applications might be restarted in a new pod. In particular,
applications must handle temporary files, locks, incomplete output, and so forth caused by previous runs.
|
|
Kubernetes architecture expects reliable endpoints from cloud providers. When a cloud provider is down, the kubelet prevents OKD from restarting.
If the underlying cloud provider endpoints are not reliable, do not install a cluster using cloud provider integration. Install the cluster as if it was in a no-cloud environment. It is not recommended to toggle cloud provider integration on or off in an installed cluster.
|
For details on how OKD uses restart policy with failed Containers, see
the Example States in the Kubernetes documentation.