Running Installation Playbooks | Installing Clusters


To install an OKD cluster, you run a series of Ansible playbooks.

Running Ansible playbooks with the --tags or --check options is not supported by Red Hat.

Before Initiating Installation

Before installing OKD, complete the following preparation steps:

See the Prerequisites and Host Preparation topics to prepare your hosts. This includes verifying system and environment requirements per component type and properly installing and configuring the docker service. It also includes installing Ansible version 2.4 or later, as the installation method is based on Ansible playbooks and as such requires directly invoking Ansible.

See the Configuring Your Inventory File topic to define your environment and desired OKD cluster configuration. This inventory file will be used to initiate the installation, and should be saved and maintained for future cluster upgrades as well.

Starting in OKD 3.10, setting openshift_node_group_name per host to a node group is required for all cluster installations whether you are using the default node group definitions and ConfigMaps or are customizing your own. See Defining Node Groups and Host Mappings for more details if you have not set them yet.

If you are interested in installing OKD using the system container method (required for RHEL Atomic Host systems), see RPM Versus System Container Considerations to ensure that you understand the differences between these methods, then return to this topic to continue.

For large-scale installs, including suggestions for optimizing install time, see the Scaling and Performance Guide.

To alternatively install OKD solely as a stand-alone registry, see Installing a Stand-alone Registry.

Cloud installation

OKD VMs can be provisioned in a cloud environment. For the supported cloud providers, you can use Ansible playbooks to automate defining your cloud-hosted infrastructure and applying post-provision configuration.

OpenStack provider

As an alternative, you can install OKD using the OpenStack CLI. For more information, see the reference architecture for OKD 3.6 and Red Hat OpenStack Platform 10.

As a prerequisite to using the OpenStack CLI, first provision VMs and configure the cloud infrastructure, such as networking, storage, firewall, and security groups. For information on these configuration tasks using the reference architecture, see the cloud provider considerations and Ansible playbooks to automate it. See also Configuring for OpenStack and Configuring Your Inventory File.

Running the Installation Playbooks

The installer uses modularized playbooks so that administrators can install specific components as needed. Breaking up the roles and playbooks allows better targeting of ad hoc administration tasks, which gives more control during installations and saves time. The playbooks and their ordering are detailed below in Running Individual Component Playbooks.

While RHEL Atomic Host is supported for running OKD services as a system container, the installation method utilizes Ansible, which is not available in RHEL Atomic Host. The RPM-based installer must therefore be run from a supported version of Fedora, CentOS, or RHEL. The host initiating the installation does not need to be intended for inclusion in the OKD cluster, but it can be. Alternatively, a containerized version of the installer is available as a system container, which can be run from a RHEL Atomic Host system.

After you have configured Ansible by defining an inventory file in /etc/ansible/hosts, run the installation playbook via Ansible using either the RPM-based or containerized installer.

Due to a known issue, after running the installation, if NFS volumes are provisioned for any component, the following directories might be created whether their components are being deployed to NFS volumes or not:

/exports/logging-es
/exports/logging-es-ops/
/exports/metrics/
/exports/prometheus
/exports/prometheus-alertbuffer/
/exports/prometheus-alertmanager/

You can delete these directories after installation, as needed.

Running the RPM-based Installer

The RPM-based installer uses Ansible installed via RPM packages to run playbooks and configuration files available on the local host.

Do not run OpenShift Ansible playbooks under nohup. Using nohup with the playbooks causes file descriptors to be created and not closed. Therefore, the system can run out of files to open and the playbook will fail.

To run the RPM-based installer:

Run the prerequisites.yml playbook. This playbook installs required software packages, if any, and modifies the container runtimes. Unless you need to configure the container runtimes, run this playbook only once, before you deploy a cluster the first time:
```
# ansible-playbook [-i /path/to/inventory] \
    ~/openshift-ansible/playbooks/prerequisites.yml
```
If your inventory file is not at the default /etc/ansible/hosts location, specify -i and the path to the inventory file.

Run the deploy_cluster.yml playbook to initiate the cluster installation:

# ansible-playbook [-i /path/to/inventory] \
    ~/openshift-ansible/playbooks/deploy_cluster.yml
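
For a first deployment, the two commands above can be chained so that deploy_cluster.yml starts only if prerequisites.yml succeeds. The following is a minimal sketch rather than part of the installer: the function name `run_install` and the `ANSIBLE_PLAYBOOK` and `PLAYBOOK_DIR` variables are hypothetical names introduced here for illustration.

```shell
# Sketch only: run_install, ANSIBLE_PLAYBOOK and PLAYBOOK_DIR are
# illustrative names, not part of openshift-ansible.
PLAYBOOK_DIR="${PLAYBOOK_DIR:-$HOME/openshift-ansible/playbooks}"

# Usage: run_install [/path/to/inventory]
# With no argument, Ansible falls back to /etc/ansible/hosts.
run_install() (
    inventory="${1:-}"
    for playbook in prerequisites.yml deploy_cluster.yml; do
        if [ -n "$inventory" ]; then
            "${ANSIBLE_PLAYBOOK:-ansible-playbook}" -i "$inventory" \
                "$PLAYBOOK_DIR/$playbook" || exit 1
        else
            "${ANSIBLE_PLAYBOOK:-ansible-playbook}" \
                "$PLAYBOOK_DIR/$playbook" || exit 1
        fi
    done
)

# Example:
#   run_install /path/to/inventory
```

Setting `ANSIBLE_PLAYBOOK=echo` before calling the function prints the arguments that would be passed instead of running the playbooks, which can help when checking paths.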

If for any reason the installation fails, before re-running the installer, see Known Issues to check for any specific instructions or workarounds.

The installer caches playbook configuration values for 10 minutes, by default. If you change any system, network, or inventory configuration, and then re-run the installer within that 10-minute period, the new values are not used, and the previous values are used instead. To make the installer pick up the new values, delete the contents of the cache, whose location is defined by the fact_caching_connection value in the /etc/ansible/ansible.cfg file. An example of this file is shown in Recommended Installation Practices.
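
Clearing the cache can be scripted. The sketch below assumes that fact_caching_connection points at a directory of cached files, as it does with Ansible's jsonfile cache plugin; other cache back ends need a different approach. The function names `fact_cache_path` and `clear_fact_cache` are hypothetical helpers, not Ansible commands.

```shell
# Sketch only: fact_cache_path and clear_fact_cache are illustrative helpers.

# Print the value of fact_caching_connection from an Ansible config file.
# Usage: fact_cache_path [/path/to/ansible.cfg]
fact_cache_path() (
    cfg="${1:-/etc/ansible/ansible.cfg}"
    # Match "fact_caching_connection = value", skip commented-out lines,
    # and strip whitespace around the value.
    sed -n 's/^[[:space:]]*fact_caching_connection[[:space:]]*=[[:space:]]*\(.*[^[:space:]]\)[[:space:]]*$/\1/p' "$cfg" | head -n 1
)

# Delete everything inside the cache directory but keep the directory itself.
# Usage: clear_fact_cache [/path/to/ansible.cfg]
clear_fact_cache() (
    cache_dir="$(fact_cache_path "$1")"
    if [ -z "$cache_dir" ] || [ ! -d "$cache_dir" ]; then
        echo "no fact cache directory found" >&2
        exit 1
    fi
    find "$cache_dir" -mindepth 1 -delete
)

# Example:
#   clear_fact_cache /etc/ansible/ansible.cfg
```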

Running the Containerized Installer

The openshift/origin-ansible image is a containerized version of the OKD installer. This installer image provides the same functionality as the RPM-based installer, but it runs in a containerized environment that provides all of its dependencies rather than being installed directly on the host. The only requirement to use it is the ability to run a container.

Running the Installer as a System Container

The installer image can be used as a system container. System containers are stored and run outside of the traditional docker service. This enables running the installer image from one of the target hosts without concern for the install restarting docker on the host.

To use the Atomic CLI to run the installer as a run-once system container, perform the following steps as the root user:

Run the prerequisites.yml playbook:

# atomic install --system \
    --storage=ostree \
    --set INVENTORY_FILE=/path/to/inventory \ (1)
    --set PLAYBOOK_FILE=/usr/share/ansible/openshift-ansible/playbooks/prerequisites.yml \
    --set OPTS="-v" \
    docker.io/openshift/origin-ansible:v3.10

1	Specify the location on the local host for your inventory file.

This command runs a set of prerequisite tasks by using the inventory file specified and the root user’s SSH configuration.

Run the deploy_cluster.yml playbook:
```
# atomic install --system \
    --storage=ostree \
    --set INVENTORY_FILE=/path/to/inventory \ (1)
    --set PLAYBOOK_FILE=/usr/share/ansible/openshift-ansible/playbooks/deploy_cluster.yml \
    --set OPTS="-v" \
    docker.io/openshift/origin-ansible:v3.10
```
1 Specify the location on the local host for your inventory file.

This command initiates the cluster installation by using the inventory file specified and the root user’s SSH configuration. It logs the output on the terminal and also saves it in the /var/log/ansible.log file. The first time this command is run, the image is imported into OSTree storage (system containers use this rather than docker daemon storage). On subsequent runs, it reuses the stored image.

If for any reason the installation fails, before re-running the installer, see Known Issues to check for any specific instructions or workarounds.

Running Other Playbooks

You can use the PLAYBOOK_FILE environment variable to specify other playbooks you want to run by using the containerized installer. The default value of the PLAYBOOK_FILE is /usr/share/ansible/openshift-ansible/playbooks/deploy_cluster.yml, which is the main cluster installation playbook, but you can set it to the path of another playbook inside the container.

For example, to run the pre-install checks playbook before installation, use the following command:

# atomic install --system \
    --storage=ostree \
    --set INVENTORY_FILE=/path/to/inventory \
    --set PLAYBOOK_FILE=/usr/share/ansible/openshift-ansible/playbooks/openshift-checks/pre-install.yml \ (1)
    --set OPTS="-v" \ (2)
    docker.io/openshift/origin-ansible:v3.10

1	Set `PLAYBOOK_FILE` to the full path of the playbook starting at the *playbooks/* directory. Playbooks are located in the same locations as with the RPM-based installer.
2	Set `OPTS` to add command line options to `ansible-playbook`.

Running the Installer as a Docker Container

The installer image can also run as a docker container anywhere that docker can run.

This method must not be used to run the installer on one of the hosts being configured, as the install may restart docker on the host, disrupting the installer container execution.

Although this method and the system container method above use the same image, they run with different entry points and contexts, so runtime parameters are not the same.

At a minimum, when running the installer as a docker container you must provide:

SSH key(s), so that Ansible can reach your hosts.
An Ansible inventory file.
The location of the Ansible playbook to run against that inventory.

Here is an example of how to run an install via docker, which must be run by a non-root user with access to docker:

First, run the prerequisites.yml playbook:

$ docker run -t -u `id -u` \ (1)
    -v $HOME/.ssh/id_rsa:/opt/app-root/src/.ssh/id_rsa:Z \ (2)
    -v $HOME/ansible/hosts:/tmp/inventory:Z \ (3)
    -e INVENTORY_FILE=/tmp/inventory \ (3)
    -e PLAYBOOK_FILE=playbooks/prerequisites.yml \ (4)
    -e OPTS="-v" \ (5)
    docker.io/openshift/origin-ansible:v3.10

1	-u `id -u` makes the container run with the same UID as the current user, which allows that user to use the SSH key inside the container (SSH private keys are expected to be readable only by their owner).
2	`-v $HOME/.ssh/id_rsa:/opt/app-root/src/.ssh/id_rsa:Z` mounts your SSH key (`$HOME/.ssh/id_rsa`) under the container user’s `$HOME/.ssh` (*/opt/app-root/src* is the `$HOME` of the user in the container). If you mount the SSH key into a non-standard location you can add an environment variable with `-e ANSIBLE_PRIVATE_KEY_FILE=/the/mount/point` or set `ansible_ssh_private_key_file=/the/mount/point` as a variable in the inventory to point Ansible at it. Note that the SSH key is mounted with the `:Z` flag. This is required so that the container can read the SSH key under its restricted SELinux context. This also means that your original SSH key file will be re-labeled to something like `system_u:object_r:container_file_t:s0:c113,c247`. For more details about `:Z`, check the `docker-run(1)` man page. Keep this in mind when providing these volume mount specifications because this might have unexpected consequences: for example, if you mount (and therefore re-label) your whole `$HOME/.ssh` directory it will block the host’s sshd from accessing your public keys to login. For this reason you may want to use a separate copy of the SSH key (or directory), so that the original file labels remain untouched.
3	`-v $HOME/ansible/hosts:/tmp/inventory:Z` and `-e INVENTORY_FILE=/tmp/inventory` mount a static Ansible inventory file into the container as */tmp/inventory* and set the corresponding environment variable to point at it. As with the SSH key, the inventory file SELinux labels may need to be relabeled by using the `:Z` flag to allow reading in the container, depending on the existing label (for files in a user `$HOME` directory this is likely to be needed). So again you may prefer to copy the inventory to a dedicated location before mounting it. The inventory file can also be downloaded from a web server if you specify the `INVENTORY_URL` environment variable, or generated dynamically using `DYNAMIC_SCRIPT_URL` to specify an executable script that provides a dynamic inventory.
4	`-e PLAYBOOK_FILE=playbooks/prerequisites.yml` specifies the playbook to run (in this example, the prerequisites playbook) as a relative path from the top level directory of openshift-ansible content. The full path from the RPM can also be used, as well as the path to any other playbook file in the container.
5	`-e OPTS="-v"` supplies arbitrary command line options (in this case, `-v` to increase verbosity) to the `ansible-playbook` command that runs inside the container.
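
Because the `:Z` flag relabels the mounted files, callouts 2 and 3 suggest mounting copies rather than the originals. One way to prepare such copies is sketched below; the function name `stage_installer_files` and the staging directory layout are assumptions made for illustration and are not part of the installer image.

```shell
# Sketch only: stage_installer_files and the staging layout are illustrative.
# Copies the SSH key and inventory so that the :Z relabel applies to the
# copies and the original file labels remain untouched.
# Usage: stage_installer_files <ssh_key> <inventory> <staging_dir>
stage_installer_files() (
    key="$1"; inventory="$2"; staging="$3"
    mkdir -p "$staging" || exit 1
    chmod 700 "$staging"
    cp "$key" "$staging/id_rsa" || exit 1
    cp "$inventory" "$staging/inventory" || exit 1
    # SSH private keys are expected to be readable only by their owner.
    chmod 600 "$staging/id_rsa"
)

# Example:
#   stage_installer_files "$HOME/.ssh/id_rsa" "$HOME/ansible/hosts" "$HOME/okd-install"
# then mount the copies instead of the originals:
#   -v $HOME/okd-install/id_rsa:/opt/app-root/src/.ssh/id_rsa:Z
#   -v $HOME/okd-install/inventory:/tmp/inventory:Z
```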

Next, run the deploy_cluster.yml playbook to initiate the cluster installation:

$ docker run -t -u `id -u` \
    -v $HOME/.ssh/id_rsa:/opt/app-root/src/.ssh/id_rsa:Z \
    -v $HOME/ansible/hosts:/tmp/inventory:Z \
    -e INVENTORY_FILE=/tmp/inventory \
    -e PLAYBOOK_FILE=playbooks/deploy_cluster.yml \
    -e OPTS="-v" \
    docker.io/openshift/origin-ansible:v3.10

Running the Installation Playbook for OpenStack

To install OKD on an existing OpenStack installation, use the OpenStack playbook. For more information about the playbook, including detailed prerequisites, see the OpenStack Provisioning readme file.

To run the playbook, run the following command:

$ ansible-playbook --user openshift \
  -i openshift-ansible/playbooks/openstack/inventory.py \
  -i inventory \
  openshift-ansible/playbooks/openstack/openshift-cluster/provision_install.yml

Running Individual Component Playbooks

The main installation playbook ~/openshift-ansible/playbooks/deploy_cluster.yml runs a set of individual component playbooks in a specific order, and the installer reports at the end which phases were run. If the installation fails, you are notified which phase failed along with the errors from the Ansible run.

After you resolve the errors, you can continue installation:

You can run the remaining individual installation playbooks.
If you are installing in a new environment, you can run the deploy_cluster.yml playbook again.

If you want to run only the remaining playbooks, start by running the playbook for the phase that failed and then run each of the remaining playbooks in order:

# ansible-playbook [-i /path/to/inventory] <playbook_file_location>

The following table lists the playbooks in the order that they must run:

Table 1. Individual Component Playbook Run Order
Playbook Name	File Location
Health Check	*~/openshift-ansible/playbooks/openshift-checks/pre-install.yml*
Node Bootstrap	*~/openshift-ansible/playbooks/openshift-node/bootstrap.yml*
etcd Install	*~/openshift-ansible/playbooks/openshift-etcd/config.yml*
NFS Install	*~/openshift-ansible/playbooks/openshift-nfs/config.yml*
Load Balancer Install	*~/openshift-ansible/playbooks/openshift-loadbalancer/config.yml*
Master Install	*~/openshift-ansible/playbooks/openshift-master/config.yml*
Master Additional Install	*~/openshift-ansible/playbooks/openshift-master/additional_config.yml*
Node Join	*~/openshift-ansible/playbooks/openshift-node/join.yml*
GlusterFS Install	*~/openshift-ansible/playbooks/openshift-glusterfs/config.yml*
Hosted Install	*~/openshift-ansible/playbooks/openshift-hosted/config.yml*
Monitoring Install	*~/openshift-ansible/playbooks/openshift-monitoring/config.yml*
Web Console Install	*~/openshift-ansible/playbooks/openshift-web-console/config.yml*
Metrics Install	*~/openshift-ansible/playbooks/openshift-metrics/config.yml*
Logging Install	*~/openshift-ansible/playbooks/openshift-logging/config.yml*
Prometheus Install	*~/openshift-ansible/playbooks/openshift-prometheus/config.yml*
Availability Monitoring Install	*~/openshift-ansible/playbooks/openshift-monitor-availability/config.yml*
Service Catalog Install	*~/openshift-ansible/playbooks/openshift-service-catalog/config.yml*
Management Install	*~/openshift-ansible/playbooks/openshift-management/config.yml*
Descheduler Install	*~/openshift-ansible/playbooks/openshift-descheduler/config.yml*
Node Problem Detector Install	*~/openshift-ansible/playbooks/openshift-node-problem-detector/config.yml*
Autoheal Install	*~/openshift-ansible/playbooks/openshift-autoheal/config.yml*
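
Resuming from a failed phase means selecting the failed playbook and everything after it in Table 1. The sketch below encodes the table order and prints that tail of the list; the function names `component_playbooks` and `remaining_playbooks` are hypothetical helpers, and the paths are written relative to the playbooks directory.

```shell
# Sketch only: helper names are illustrative. Paths are relative to
# ~/openshift-ansible/playbooks and follow the order of Table 1.
component_playbooks() {
    cat <<'EOF'
openshift-checks/pre-install.yml
openshift-node/bootstrap.yml
openshift-etcd/config.yml
openshift-nfs/config.yml
openshift-loadbalancer/config.yml
openshift-master/config.yml
openshift-master/additional_config.yml
openshift-node/join.yml
openshift-glusterfs/config.yml
openshift-hosted/config.yml
openshift-monitoring/config.yml
openshift-web-console/config.yml
openshift-metrics/config.yml
openshift-logging/config.yml
openshift-prometheus/config.yml
openshift-monitor-availability/config.yml
openshift-service-catalog/config.yml
openshift-management/config.yml
openshift-descheduler/config.yml
openshift-node-problem-detector/config.yml
openshift-autoheal/config.yml
EOF
}

# Print the failed playbook and every playbook after it, in run order.
# Prints nothing if the name is not in the list.
# Usage: remaining_playbooks openshift-master/config.yml
remaining_playbooks() {
    component_playbooks |
        awk -v start="$1" '$0 == start { found = 1 } found { print }'
}

# Example: resume from the Hosted Install phase, stopping at the first failure.
#   for p in $(remaining_playbooks openshift-hosted/config.yml); do
#       ansible-playbook -i /path/to/inventory ~/openshift-ansible/playbooks/"$p" || break
#   done
```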

Verifying the Installation

After the installation completes:

Verify that the master is started and nodes are registered and reporting in Ready status. On the master host, run the following as root:

# oc get nodes
NAME                   STATUS    ROLES     AGE       VERSION
master.example.com     Ready     master    7h        v1.9.1+a0ce1bc657
node1.example.com      Ready     compute   7h        v1.9.1+a0ce1bc657
node2.example.com      Ready     compute   7h        v1.9.1+a0ce1bc657
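
For clusters with many nodes, the STATUS column can be checked with a small filter instead of by eye. This is a sketch that assumes the column layout shown above; `not_ready_nodes` is a hypothetical helper name. It compares the status exactly, so a node reported as anything other than plain Ready is listed.

```shell
# Sketch only: not_ready_nodes is an illustrative helper.
# Reads `oc get nodes` output on stdin, skips the header line, and prints the
# name of every node whose STATUS column is not exactly "Ready".
# Returns non-zero if at least one such node is found.
not_ready_nodes() {
    awk 'NR > 1 && $2 != "Ready" { print $1; bad = 1 } END { exit bad }'
}

# Example:
#   oc get nodes | not_ready_nodes || echo "some nodes are not Ready"
```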

To verify that the web console is installed correctly, use the master host name and the web console port number to access the web console with a web browser.

For example, for a master host with a host name of master.openshift.com and using the default port of 8443, the web console would be found at https://master.openshift.com:8443/console.

Verifying Multiple etcd Hosts

If you installed multiple etcd hosts:

First, verify that the etcd package, which provides the etcdctl command, is installed:
```
# yum install etcd
```

On a master host, verify the etcd cluster health, substituting the FQDNs of your etcd hosts in the following command:

# etcdctl -C \
    https://etcd1.example.com:2379,https://etcd2.example.com:2379,https://etcd3.example.com:2379 \
    --ca-file=/etc/origin/master/master.etcd-ca.crt \
    --cert-file=/etc/origin/master/master.etcd-client.crt \
    --key-file=/etc/origin/master/master.etcd-client.key cluster-health

Also verify the member list is correct:

# etcdctl -C \
    https://etcd1.example.com:2379,https://etcd2.example.com:2379,https://etcd3.example.com:2379 \
    --ca-file=/etc/origin/master/master.etcd-ca.crt \
    --cert-file=/etc/origin/master/master.etcd-client.crt \
    --key-file=/etc/origin/master/master.etcd-client.key member list
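
Both commands take the same comma-separated endpoint list, which is easy to mistype by hand. The sketch below builds that list from host names, assuming every etcd host listens on port 2379 over HTTPS as in the examples above; `etcd_endpoints` is a hypothetical helper name.

```shell
# Sketch only: etcd_endpoints is an illustrative helper.
# Builds the comma-separated endpoint list passed to etcdctl -C.
# Usage: etcd_endpoints etcd1.example.com etcd2.example.com ...
etcd_endpoints() (
    list=""
    for host in "$@"; do
        # Append a comma only when the list already has an entry.
        list="${list:+$list,}https://${host}:2379"
    done
    printf '%s\n' "$list"
)

# Example:
#   etcdctl -C "$(etcd_endpoints etcd1.example.com etcd2.example.com etcd3.example.com)" \
#       --ca-file=/etc/origin/master/master.etcd-ca.crt \
#       --cert-file=/etc/origin/master/master.etcd-client.crt \
#       --key-file=/etc/origin/master/master.etcd-client.key cluster-health
```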

Verifying Multiple Masters Using HAProxy

If you installed multiple masters using HAProxy as a load balancer, browse to the following URL according to your [lb] section definition and check HAProxy’s status:

http://<lb_hostname>:9000

You can verify your installation by consulting the HAProxy Configuration documentation.

Optionally Securing Builds

Running docker build is a privileged process, so the container has more access to the node than might be considered acceptable in some multi-tenant environments. If you do not trust your users, you can use a more secure option at the time of installation. Disable Docker builds on the cluster and require that users build images outside of the cluster. See Securing Builds by Strategy for more information on this optional process.

Uninstalling OKD

You can uninstall OKD hosts in your cluster by running the uninstall.yml playbook. This playbook deletes OKD content installed by Ansible, including:

Configuration
Containers
Default templates and image streams
Images
RPM packages

The playbook will delete content for any hosts defined in the inventory file that you specify when running the playbook.

Before you uninstall your cluster, review the following list of scenarios and make sure that uninstalling is the best option:

If your installation process failed and you want to continue the process, you can retry the installation. The installation playbooks are designed so that if they fail to install your cluster, you can run them again without needing to uninstall the cluster.
If you want to restart a failed installation from the beginning, you can uninstall the OKD hosts in your cluster by running the uninstall.yml playbook, as described in the following section. This playbook only uninstalls the OKD assets for the most recent version that you installed.
If you must change the host names or certificate names, you must recreate your certificates by running the uninstall.yml playbook before you retry the installation. Running the installation playbooks again will not recreate the certificates.
If you want to repurpose hosts on which you installed OKD earlier, such as with a proof-of-concept installation, or want to install a different minor or asynchronous version of OKD, you must reimage the hosts before you use them in a production cluster. After you run the uninstall.yml playbook, some host assets might remain in an altered state.

If you want to uninstall OKD across all hosts in your cluster, run the playbook using the inventory file you used when installing OKD initially or ran most recently:

# ansible-playbook [-i /path/to/file] \
    ~/openshift-ansible/playbooks/adhoc/uninstall.yml

Uninstalling Nodes

You can also uninstall node components from specific hosts using the uninstall.yml playbook while leaving the remaining hosts and cluster alone.

This method should only be used when attempting to uninstall specific node hosts and not for specific masters or etcd hosts, which would require further configuration changes within the cluster.

First follow the steps in Deleting Nodes to remove the node object from the cluster, then continue with the remaining steps in this procedure.
Create a different inventory file that only references those hosts. For example, to only delete content from one node:
```
[OSEv3:children]
nodes (1)

[OSEv3:vars]
ansible_ssh_user=root
openshift_deployment_type=origin

[nodes]
node3.example.com openshift_node_group_name='node-config-infra' (2)
```
1 Only include the sections that pertain to the hosts you are interested in uninstalling.

2 Only include hosts that you want to uninstall.

Specify that new inventory file using the -i option when running the uninstall.yml playbook:

# ansible-playbook -i /path/to/new/file \
    ~/openshift-ansible/playbooks/adhoc/uninstall.yml

When the playbook completes, all OKD content should be removed from any specified hosts.
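
The reduced inventory used in this procedure can be generated rather than edited by hand, which helps avoid leaving other hosts in the file by mistake. The sketch below reproduces the example inventory shown earlier for a given node group and list of hosts. The function name `node_uninstall_inventory` is hypothetical, and the `ansible_ssh_user` and `openshift_deployment_type` values are copied from that example, so adjust them to match your own inventory.

```shell
# Sketch only: node_uninstall_inventory is an illustrative helper.
# Prints an inventory that contains only the [nodes] group and the given hosts.
# Usage: node_uninstall_inventory <node_group_name> <host> [<host> ...]
node_uninstall_inventory() (
    group="$1"
    shift
    printf '[OSEv3:children]\nnodes\n\n'
    printf '[OSEv3:vars]\nansible_ssh_user=root\nopenshift_deployment_type=origin\n\n'
    printf '[nodes]\n'
    for host in "$@"; do
        printf "%s openshift_node_group_name='%s'\n" "$host" "$group"
    done
)

# Example:
#   node_uninstall_inventory node-config-infra node3.example.com > /path/to/new/file
```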

Known Issues

On failover in multiple master clusters, it is possible for the controller manager to overcorrect, which causes the system to run more pods than what was intended. However, this is a transient event and the system does correct itself over time. See https://github.com/kubernetes/kubernetes/issues/10030 for details.
If the Ansible installer fails, you can still install OKD:
- If you did not modify the SDN configuration or generate new certificates, run the deploy_cluster.yml playbook again.
- If you modified the SDN configuration, generated new certificates, or the installer fails again, you must either start over with a clean operating system installation or uninstall and install again.
- If you use virtual machines, start from a fresh image or uninstall and install again.
- If you use bare metal machines, uninstall and install again.

What’s Next?

Now that you have a working OKD instance, you can:

Deploy an integrated Docker registry.
Deploy a router.
Populate your OKD installation with a useful set of Red Hat-provided image streams and templates.
