×

When you encounter a problem, the first step is to find the specific area where the issue is happening. To narrow down the potential problematic areas, complete one or more tasks:

  • Query your cluster

  • Check your pod logs

  • Debug a pod

  • Review events

Querying your cluster

Get information about your cluster so that you can more accurately find potential problems.

Procedure
  1. Switch into a project by running the following command:

    $ oc project <project_name>
  2. Query your cluster version, cluster Operator, and node within that namespace by running the following command:

    $ oc get clusterversion,clusteroperator,node
    Example output
    NAME                                         VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
    clusterversion.config.openshift.io/version   4.16.11   True        False         62d     Cluster version is 4.16.11
    
    NAME                                                                           VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
    clusteroperator.config.openshift.io/authentication                             4.16.11   True        False         False      62d
    clusteroperator.config.openshift.io/baremetal                                  4.16.11   True        False         False      62d
    clusteroperator.config.openshift.io/cloud-controller-manager                   4.16.11   True        False         False      62d
    clusteroperator.config.openshift.io/cloud-credential                           4.16.11   True        False         False      62d
    clusteroperator.config.openshift.io/cluster-autoscaler                         4.16.11   True        False         False      62d
    clusteroperator.config.openshift.io/config-operator                            4.16.11   True        False         False      62d
    clusteroperator.config.openshift.io/console                                    4.16.11   True        False         False      62d
    clusteroperator.config.openshift.io/control-plane-machine-set                  4.16.11   True        False         False      62d
    clusteroperator.config.openshift.io/csi-snapshot-controller                    4.16.11   True        False         False      62d
    clusteroperator.config.openshift.io/dns                                        4.16.11   True        False         False      62d
    clusteroperator.config.openshift.io/etcd                                       4.16.11   True        False         False      62d
    clusteroperator.config.openshift.io/image-registry                             4.16.11   True        False         False      55d
    clusteroperator.config.openshift.io/ingress                                    4.16.11   True        False         False      62d
    clusteroperator.config.openshift.io/insights                                   4.16.11   True        False         False      62d
    clusteroperator.config.openshift.io/kube-apiserver                             4.16.11   True        False         False      62d
    clusteroperator.config.openshift.io/kube-controller-manager                    4.16.11   True        False         False      62d
    clusteroperator.config.openshift.io/kube-scheduler                             4.16.11   True        False         False      62d
    clusteroperator.config.openshift.io/kube-storage-version-migrator              4.16.11   True        False         False      62d
    clusteroperator.config.openshift.io/machine-api                                4.16.11   True        False         False      62d
    clusteroperator.config.openshift.io/machine-approver                           4.16.11   True        False         False      62d
    clusteroperator.config.openshift.io/machine-config                             4.16.11   True        False         False      62d
    clusteroperator.config.openshift.io/marketplace                                4.16.11   True        False         False      62d
    clusteroperator.config.openshift.io/monitoring                                 4.16.11   True        False         False      62d
    clusteroperator.config.openshift.io/network                                    4.16.11   True        False         False      62d
    clusteroperator.config.openshift.io/node-tuning                                4.16.11   True        False         False      62d
    clusteroperator.config.openshift.io/openshift-apiserver                        4.16.11   True        False         False      62d
    clusteroperator.config.openshift.io/openshift-controller-manager               4.16.11   True        False         False      62d
    clusteroperator.config.openshift.io/openshift-samples                          4.16.11   True        False         False      35d
    clusteroperator.config.openshift.io/operator-lifecycle-manager                 4.16.11   True        False         False      62d
    clusteroperator.config.openshift.io/operator-lifecycle-manager-catalog         4.16.11   True        False         False      62d
    clusteroperator.config.openshift.io/operator-lifecycle-manager-packageserver   4.16.11   True        False         False      62d
    clusteroperator.config.openshift.io/service-ca                                 4.16.11   True        False         False      62d
    clusteroperator.config.openshift.io/storage                                    4.16.11   True        False         False      62d
    
    NAME                STATUS   ROLES                         AGE   VERSION
    node/ctrl-plane-0   Ready    control-plane,master,worker   62d   v1.29.7
    node/ctrl-plane-1   Ready    control-plane,master,worker   62d   v1.29.7
    node/ctrl-plane-2   Ready    control-plane,master,worker   62d   v1.29.7

For more information, see "oc get" and "Reviewing pod status".

Additional resources

Checking pod logs

Get logs from the pod so that you can review the logs for issues.

Procedure
  1. List the pods by running the following command:

    $ oc get pod
    Example output
    NAME        READY   STATUS    RESTARTS          AGE
    busybox-1   1/1     Running   168 (34m ago)     7d
    busybox-2   1/1     Running   119 (9m20s ago)   4d23h
    busybox-3   1/1     Running   168 (43m ago)     7d
    busybox-4   1/1     Running   168 (43m ago)     7d
  2. Check pod log files by running the following command:

    $ oc logs -n <namespace> busybox-1

For more information, see "oc logs", "Logging", and "Inspecting pod and container logs".

Describing a pod

Describing a pod gives you information about that pod to help with troubleshooting. The Events section provides detailed information about the pod and the containers inside of it.

Procedure
  • Describe a pod by running the following command:

    $ oc describe pod -n <namespace> busybox-1
    Example output
    Name:             busybox-1
    Namespace:        busy
    Priority:         0
    Service Account:  default
    Node:             worker-3/192.168.0.0
    Start Time:       Mon, 27 Nov 2023 14:41:25 -0500
    Labels:           app=busybox
                      pod-template-hash=<hash>
    Annotations:      k8s.ovn.org/pod-networks:
    …
    Events:
      Type    Reason   Age                   From     Message
      ----    ------   ----                  ----     -------
      Normal  Pulled   41m (x170 over 7d1h)  kubelet  Container image "quay.io/quay/busybox:latest" already present on machine
      Normal  Created  41m (x170 over 7d1h)  kubelet  Created container busybox
      Normal  Started  41m (x170 over 7d1h)  kubelet  Started container busybox

For more information, see "oc describe".

Additional resources

Reviewing events

You can review the events in a given namespace to find potential issues.

Procedure
  1. Check for events in your namespace by running the following command:

    $ oc get events -n <namespace> --sort-by=".metadata.creationTimestamp" (1)
    1 Adding the --sort-by=".metadata.creationTimestamp" flag places the most recent events at the end of the output.
  2. Optional: If the events within your specified namespace do not provide enough information, expand your query to all namespaces by running the following command:

    $ oc get events -A --sort-by=".metadata.creationTimestamp" (1)
    1 The --sort-by=".metadata.creationTimestamp" flag places the most recent events at the end of the output.

    To filter the results of all events from a cluster, you can use the grep command. For example, if you are looking for errors, the errors can appear in two different sections of the output: the TYPE or MESSAGE sections. With the grep command, you can search for keywords, such as error or failed.

  3. For example, search for a message that contains warning or error by running the following command:

    $ oc get events -A | grep -Ei "warning|error"
    Example output
    NAMESPACE    LAST SEEN   TYPE      REASON          OBJECT              MESSAGE
    openshift    59s         Warning   FailedMount     pod/openshift-1     MountVolume.SetUp failed for volume "v4-0-config-user-idp-0-file-data" : references non-existent secret key: test
  4. Optional: To clean up the events and see only recurring events, you can delete the events in the relevant namespace by running the following command:

    $ oc delete events -n <namespace> --all

For more information, see "Watching cluster events".

Additional resources

Connecting to a pod

You can directly connect to a currently running pod with the oc rsh command, which provides you with a shell on that pod.

In pods that run a low-latency application, latency issues can occur when you run the oc rsh command. Use the oc rsh command only if you cannot connect to the node by using the oc debug command.

Procedure
  • Connect to your pod by running the following command:

    $ oc rsh -n <namespace> busybox-1

For more information, see "oc rsh" and "Accessing running pods".

Additional resources

Debugging a pod

In certain cases, you do not want to directly interact with your pod that is in production.

To avoid interfering with running traffic, you can use a secondary pod that is a copy of your original pod. The secondary pod uses the same components as that of the original pod but does not have running traffic.

Procedure
  1. List the pods by running the following command:

    $ oc get pod
    Example output
    NAME        READY   STATUS    RESTARTS          AGE
    busybox-1   1/1     Running   168 (34m ago)     7d
    busybox-2   1/1     Running   119 (9m20s ago)   4d23h
    busybox-3   1/1     Running   168 (43m ago)     7d
    busybox-4   1/1     Running   168 (43m ago)     7d
  2. Debug a pod by running the following command:

    $ oc debug -n <namespace> busybox-1
    Example output
    Starting pod/busybox-1-debug, command was: sleep 3600
    Pod IP: 10.133.2.11

    If you do not see a shell prompt, press Enter.

For more information, see "oc debug" and "Starting debug pods with root access".

Running a command on a pod

If you want to run a command or set of commands on a pod without directly logging into it, you can use the oc exec -it command. You can interact with the pod quickly to get process or output information from the pod. A common use case is to run the oc exec -it command inside a script to run the same command on multiple pods in a replica set or deployment.

In pods that run a low-latency application, the oc exec command can cause latency issues.

Procedure
  • To run a command on a pod without logging into it, run the following command:

    $ oc exec -it <pod> -- <command>

For more information, see "oc exec" and "Executing remote commands in containers".