×

About OKD monitoring

OKD includes a preconfigured, preinstalled, and self-updating monitoring stack that provides monitoring for core platform components. OKD delivers monitoring best practices out of the box. A set of alerts are included by default that immediately notify cluster administrators about issues with a cluster. Default dashboards in the OKD web console include visual representations of cluster metrics to help you to quickly understand the state of your cluster.

After installing OKD 4.11, cluster administrators can optionally enable monitoring for user-defined projects. By using this feature, cluster administrators, developers, and other users can specify how services and pods are monitored in their own projects. You can then query metrics, review dashboards, and manage alerting rules and silences for your own projects in the OKD web console.

Cluster administrators can grant developers and other users permission to monitor their own projects. Privileges are granted by assigning one of the predefined monitoring roles.

Logging architecture

The major components of the logging are:

Collector

The collector is a daemonset that deploys pods to each OKD node. It collects log data from each node, transforms the data, and forwards it to configured outputs. You can use the Vector collector or the legacy Fluentd collector.

Fluentd is deprecated and is planned to be removed in a future release. Red Hat provides bug fixes and support for this feature during the current release lifecycle, but this feature no longer receives enhancements. As an alternative to Fluentd, you can use Vector instead.

Log store

The log store stores log data for analysis and is the default output for the log forwarder. You can use the default LokiStack log store, the legacy Elasticsearch log store, or forward logs to additional external log stores.

The OpenShift Elasticsearch Operator is deprecated and is planned to be removed in a future release. Red Hat provides bug fixes and support for this feature during the current release lifecycle, but this feature no longer receives enhancements. As an alternative to using the OpenShift Elasticsearch Operator to manage the default log storage, you can use the Loki Operator.

Visualization

You can use a UI component to view a visual representation of your log data. The UI provides a graphical interface to search, query, and view stored logs. The OKD web console UI is provided by enabling the OKD console plugin.

The Kibana web console is now deprecated is planned to be removed in a future logging release.

Logging collects container logs and node logs. These are categorized into types:

Application logs

Container logs generated by user applications running in the cluster, except infrastructure container applications.

Infrastructure logs

Container logs generated by infrastructure namespaces: openshift*, kube*, or default, as well as journald messages from nodes.

Audit logs

Logs generated by auditd, the node audit system, which are stored in the /var/log/audit/audit.log file, and logs from the auditd, kube-apiserver, openshift-apiserver services, as well as the ovn project if enabled.

For more information on OpenShift Logging, see the OpenShift Logging documentation.

About Telemetry

Telemetry sends a carefully chosen subset of the cluster monitoring metrics to Red Hat. The Telemeter Client fetches the metrics values every four minutes and thirty seconds and uploads the data to Red Hat. These metrics are described in this document.

This stream of data is used by Red Hat to monitor the clusters in real-time and to react as necessary to problems that impact our customers. It also allows Red Hat to roll out OKD upgrades to customers to minimize service impact and continuously improve the upgrade experience.

This debugging information is available to Red Hat Support and Engineering teams with the same restrictions as accessing data reported through support cases. All connected cluster information is used by Red Hat to help make OKD better and more intuitive to use.

Information collected by Telemetry

The following information is collected by Telemetry:

System information

  • Version information, including the OKD cluster version and installed update details that are used to determine update version availability

  • Update information, including the number of updates available per cluster, the channel and image repository used for an update, update progress information, and the number of errors that occur in an update

  • The unique random identifier that is generated during an installation

  • Configuration details that help Red Hat Support to provide beneficial support for customers, including node configuration at the cloud infrastructure level, hostnames, IP addresses, Kubernetes pod names, namespaces, and services

  • The OKD framework components installed in a cluster and their condition and status

  • Events for all namespaces listed as "related objects" for a degraded Operator

  • Information about degraded software

  • Information about the validity of certificates

  • The name of the provider platform that OKD is deployed on and the data center location

Sizing Information

  • Sizing information about clusters, machine types, and machines, including the number of CPU cores and the amount of RAM used for each

  • The number of running virtual machine instances in a cluster

  • The number of etcd members and the number of objects stored in the etcd cluster

  • Number of application builds by build strategy type

Usage information

  • Usage information about components, features, and extensions

  • Usage details about Technology Previews and unsupported configurations

Telemetry does not collect identifying information such as usernames or passwords. Red Hat does not intend to collect personal information. If Red Hat discovers that personal information has been inadvertently received, Red Hat will delete such information. To the extent that any telemetry data constitutes personal data, please refer to the Red Hat Privacy Statement for more information about Red Hat’s privacy practices.

CLI troubleshooting and debugging commands

For a list of the oc client troubleshooting and debugging commands, see the OKD CLI tools documentation.