This topic serves as an alternative node upgrade method to the approach in Manual In-place Upgrades.
Blue-green deployments are a proven approach to reducing downtime while upgrading an environment. This is done by creating a parallel environment on which the new deployment can be installed. After the new deployment is verified, traffic can be switched over, with the option to roll back if a problem is detected.
While blue-green is a valid strategy for deploying just about any software, there are always trade-offs. Not all environments have the same uptime requirements or the resources to properly perform blue-green deployments. In an OKD environment, the most suitable candidates for blue-green deployments are the nodes. All user processes run on these systems, and even critical pieces of OKD infrastructure are self-hosted there. Uptime is most important for these workloads, so the additional complexity of blue-green deployments can be justified. The exact implementation of this approach varies based on your requirements. Often the main challenge is having the excess capacity to facilitate such an approach.
After the master and etcd servers have been upgraded, you must ensure that your current production nodes are labeled either blue or green. In this example, the current installation will be blue and the new environment will be green. To label every node in your current installation as blue:
$ oc label --all nodes color=blue
For nodes requiring the uptime guarantees of a blue-green deployment, the -l flag can be used to match a subset of the environment using a selector.
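As a minimal sketch of selector-based labeling, the helper below labels only the nodes that match a given selector and then lists the result. The function name and the region=primary selector in the usage line are hypothetical; substitute a label that already exists on the nodes you intend to replace.

```shell
# Label only the nodes matched by a selector as blue, then list them.
# Assumption: the selector passed in (for example region=primary) is a
# label that already exists on the nodes to be upgraded.
label_blue_subset() {
    selector="$1"
    oc label nodes -l "$selector" color=blue &&
        oc get nodes -l color=blue
}

# Example (hypothetical selector):
#   label_blue_subset region=primary
```

If a node already carries a color label with a different value, oc label refuses to change it unless --overwrite is added.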
Create the new green environment for any nodes that are to be replaced by adding an equal number of new nodes to the existing cluster. Ansible can apply the color=green label using the openshift_node_labels variable for each node. To delay workload scheduling until the nodes are ready, be sure to set the openshift_schedulable=false variable. After the green nodes are in Ready state, they can be made schedulable.
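Before making the green nodes schedulable, it can help to confirm that every one of them reports Ready. The sketch below is one possible check, not part of the documented procedure; the function name is hypothetical, and it assumes the STATUS value is the second column of the oc get nodes output.

```shell
# Succeeds only if at least one green node exists and every green node
# has a STATUS beginning with "Ready" (for example
# "Ready,SchedulingDisabled"). "NotReady" does not match.
# Assumption: STATUS is the second column of `oc get nodes` output.
green_nodes_ready() {
    oc get nodes -l color=green --no-headers |
        awk '{ total++ }
             $2 !~ /^Ready/ { bad++ }
             END { exit !(total > 0 && bad == 0) }'
}

# Example:
#   green_nodes_ready && echo "green nodes can be made schedulable"
```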
Blue nodes are disabled so that no new pods are run on them:
# oadm manage-node --schedulable=true --selector=color=green
# oadm manage-node --schedulable=false --selector=color=blue
A common practice is to scale up the registry and router pods until they are migrated to the green nodes. For these pods, a canary deployment approach is commonly used. Scaling them up makes them immediately active on the new nodes. Pointing the deployment configuration to the new image initiates a rolling update. However, because of node anti-affinity, and because the blue nodes are still unschedulable, the deployments to the old nodes will fail. At this point, the registry and router deployments can be scaled down to the original number of pods. At any given point, the original number of pods is still available, so no capacity is lost.
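The scale-up and scale-down steps might be sketched as follows. This assumes the router and registry run as deployment configurations named router and docker-registry in the default project, which may not match your cluster; the function name and replica counts are hypothetical.

```shell
# Scale the router and registry deployment configurations to the
# requested replica count.
# Assumptions: the deployment configs are named "router" and
# "docker-registry" and live in the "default" project.
scale_infra() {
    replicas="$1"
    for dc in router docker-registry; do
        oc scale "dc/$dc" --replicas="$replicas" -n default || return 1
    done
}

# Example (hypothetical counts): scale up from 2 to 4 so that new pods
# land on the green nodes, then return to 2 after the rolling update:
#   scale_infra 4
#   scale_infra 2
```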
In order for pods to be migrated from the blue environment to the green, the images must be pulled. Network latency and load on the registry can cause delays if there is not sufficient capacity built in to the environment. Often, the best way to minimize impact to the running system is to trigger new pod deployments that will land on the new nodes. Accomplish this by importing new image streams.
A major release of OKD is often the motivation for a blue-green deployment. At that time, new image streams become available for users of Source-to-Image (S2I). Upon import, any builds or deployments configured with ImageChangeTriggers are automatically created.
It is important to realize that this process can trigger a large number of builds. The good news is that the builds are performed on the green nodes and, therefore, do not impact any traffic on the blue deployment.
To monitor build progress across all namespaces (projects) in the cluster:
$ oc get events -w --all-namespaces
In large environments, builds rarely stop completely. However, you should see a large increase in build activity, followed by a decrease, caused by the administrative import.
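To watch that rise and fall as numbers rather than as an event stream, one option is to count builds by phase across all projects. This is a sketch only; the function name is hypothetical, and it assumes your client supports jsonpath output.

```shell
# Print a count of builds per phase (for example Running, Complete,
# Failed) across all namespaces.
# Assumption: the oc client supports -o jsonpath.
build_phase_counts() {
    oc get builds --all-namespaces \
        -o jsonpath='{range .items[*]}{.status.phase}{"\n"}{end}' |
        sort | uniq -c
}

# Example: run it periodically during the import.
#   build_phase_counts
```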
Another benefit of triggering the builds is that doing so pulls most of the ancillary images, such as the various build images, the pod infrastructure image, and deployers, onto the new nodes. Everything else can be moved over using node evacuation and will proceed more quickly as a result.
For larger deployments, it is possible to have other labels that help determine how evacuation can be coordinated. The most conservative approach for avoiding downtime is to evacuate one node at a time. If services are composed of pods using zone anti-affinity, then an entire zone can be evacuated at once. It is important to ensure that the storage volumes used are available in the new zone as this detail can vary among cloud providers.
In OpenShift Origin 1.2 and later, a node evacuation is triggered whenever the node service is stopped. To manually evacuate and delete all blue nodes at once:
# oadm manage-node --selector=color=blue --evacuate
# oc delete node --selector=color=blue
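The more conservative approach mentioned earlier, evacuating one node at a time, might be sketched as a loop over the blue nodes. The function name is hypothetical, and the sketch stops at the first failure. A real procedure would also confirm that the evacuated pods are running on green nodes before moving on.

```shell
# Evacuate blue nodes one at a time, stopping at the first failure.
# `oc get ... -o name` prints entries such as "node/<name>", so the
# prefix up to the slash is stripped before calling oadm.
evacuate_blue_one_at_a_time() {
    for node in $(oc get nodes -l color=blue -o name); do
        name="${node##*/}"
        oadm manage-node "$name" --evacuate || return 1
        # Assumption: verify here that pods have been rescheduled
        # before continuing to the next node.
    done
}

# Example:
#   evacuate_blue_one_at_a_time && oc delete node --selector=color=blue
```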