×

To assist in troubleshooting a failed OKD installation, you can gather logs from the bootstrap and control plane machines. You can also get debug information from the installation program.

If you are unable to resolve the issue by using the logs and debug information, see "Determining where installation issues occur" in the Additional resources section.

If your OKD installation fails and the debug output or logs contain network timeouts or other connectivity errors, review the guidelines "Configuring your firewall" in the Additional resources section. By gathering logs from your firewall and load balancer, you can diagnose network-related errors.

Prerequisites

  • You attempted to install an OKD cluster and the installation failed.

Gathering logs from a failed installation

If you provided an SSH key to your installation program, you can gather data about your failed installation.

You use a different command to gather logs about an unsuccessful installation than to gather logs from a running cluster. If you must gather logs from a running cluster, use the oc adm must-gather command.

Prerequisites
  • Your OKD installation failed before the bootstrap process finished. The bootstrap node is running and accessible through SSH.

  • The ssh-agent process is active on your computer, and you provided the same SSH key to both the ssh-agent process and the installation program.

  • If you tried to install a cluster on infrastructure that you provisioned, you must have the fully qualified domain names of the bootstrap and control plane nodes.

Procedure
  1. Generate the commands that are required to obtain the installation logs from the bootstrap and control plane machines:

    • If you used installer-provisioned infrastructure, change to the directory that contains the installation program and run the following command:

      $ ./openshift-install gather bootstrap --dir <installation_directory>

      The installation_directory placeholder is for the directory you specified when you ran ./openshift-install create cluster. This directory contains the OKD definition files that the installation program creates.

      For installer-provisioned infrastructure, the installation program stores information about the cluster, so you do not specify the hostnames or IP addresses.

    • If you used infrastructure that you provisioned yourself, change to the directory that contains the installation program and run the following command:

      $ ./openshift-install gather bootstrap --dir <installation_directory> \
          --bootstrap <bootstrap_address> \
          --master <master_1_address> \
          --master <master_2_address> \
          --master <master_3_address>

      where:

      • installation_directory:: Specifies the same directory you specified when you ran ./openshift-install create cluster. This directory contains the OKD definition files that the installation program creates.

      • <bootstrap_address>:: Specifies the fully qualified domain name or IP address of the cluster’s bootstrap machine.

      • <master_*_address>:: For each control plane, or master, machine in your cluster, replace this placeholder with its fully qualified domain name or IP address.

        A default cluster contains three control plane machines. List all of your control plane machines as shown, no matter how many your cluster uses.

      Example output
      INFO Pulling debug logs from the bootstrap machine
      INFO Bootstrap gather logs captured here "<installation_directory>/log-bundle-<timestamp>.tar.gz"

      If you open a Red Hat support case about your installation failure, include the compressed logs when opening a Red Hat support case.

Manually gathering logs with SSH access to your hosts

Manually gather logs in situations where must-gather or automated collection methods do not work.

By default, SSH access to the OKD nodes is disabled on the OpenStack based installations.

Prerequisites
  • You must have SSH access to your hosts.

Procedure
  1. Collect the bootkube.service service logs from the bootstrap host by entering the journalctl command:

    $ journalctl -b -f -u bootkube.service
  2. Collect the container logs of the bootstrap host by using the podman logs. Podman logs are shown as a loop to get all of the container logs from the host.

    $ for pod in $(sudo podman ps -a -q); do sudo podman logs $pod; done
  3. Alternatively, collect the container logs of the host by entering the tail command:

    # tail -f /var/lib/containers/storage/overlay-containers/*/userdata/ctr.log
  4. Collect the kubelet.service and crio.service service logs from the control plane and compute hosts using the journalctl command by running:

    $ journalctl -b -f -u kubelet.service -u crio.service
  5. Collect the control plane and compute host container logs by entering the tail command:

    $ sudo tail -f /var/log/containers/*

Manually gathering logs without SSH access to your host(s)

Manually gather logs in situations where must-gather or automated collection methods do not work.

If you do not have SSH access to your node, you can access the systems journal to investigate what is happening on your host.

Prerequisites
  • Your OKD installation must be complete.

  • Your API service is still functional.

  • You have system administrator privileges.

Procedure
  1. Access journald unit logs under /var/log by running:

    $ oc adm node-logs --role=master -u kubelet
  2. Access host file paths under /var/log by running:

    $ oc adm node-logs --role=master --path=openshift-apiserver

Getting debug information from the installation program

You can choose between two methods to get debug information from the installation program.

Procedure
  • Look at debug messages from a past installation in the hidden .openshift_install.log file. To do this task, enter a command similar to the following example:

    $ cat ~/<installation_directory>/.openshift_install.log

    For <installation_directory>, specify the same directory you specified when you ran ./openshift-install create cluster.

  • Change to the directory that contains the installation program and re-run the command with the --log-level=debug argument:

    $ ./openshift-install create cluster --dir <installation_directory> --log-level debug

    For <installation_directory>, specify the same directory you specified when you ran ./openshift-install create cluster.

Reinstalling the OKD cluster

If you are unable to debug and resolve issues in the failed OKD installation, consider installing a new OKD cluster. Before starting the installation process again, you must complete thorough cleanup.

For a user-provisioned infrastructure installation, you must manually destroy the cluster and delete all associated resources. The following procedure is for an installer-provisioned infrastructure installation.

Procedure
  1. Destroy the cluster and remove all the resources associated with the cluster, including the hidden installer state files in the installation directory:

    $ ./openshift-install destroy cluster --dir <installation_directory>

    Where <installation_directory> is the directory you specified when you ran ./openshift-install create cluster. This directory contains the OKD definition files that the installation program creates.

  2. Before reinstalling the cluster, delete the installation directory by running a command similar to the following command:

    $ rm -rf <installation_directory>
  3. Follow the procedure for installing a new OKD cluster.