×

Cluster Monitoring configuration reference

Parts of Cluster Monitoring are configurable. The API is accessible through parameters defined in various ConfigMaps.

Depending on which part of the stack you want to configure, edit the following:

  • The configuration of OpenShift Container Platform monitoring components in a ConfigMap called cluster-monitoring-config in the openshift-monitoring namespace. Defined by ClusterMonitoringConfiguration.

  • The configuration of components that monitor user-defined projects in a ConfigMap called user-workload-monitoring-config in the openshift-user-workload-monitoring namespace. Defined by UserWorkloadConfiguration.

The configuration file itself is always defined under the config.yaml key within the ConfigMap data.

Not all configuration parameters are exposed. Configuring Cluster Monitoring is optional. If the configuration does not exist or is empty or malformed, defaults are used.

AdditionalAlertmanagerConfig

Description

AdditionalAlertmanagerConfig defines configuration on how a component should communicate with aditional Alertmanager instances.

Required

  • apiVersion

Property Type Description

apiVersion

string

APIVersion defines the api version of Alertmanager.

bearerToken

v1.SecretKeySelector

BearerToken defines the bearer token to use when authenticating to Alertmanager.

pathPrefix

string

PathPrefix defines the path prefix to add in front of the push endpoint path.

scheme

string

Scheme the URL scheme to use when talking to Alertmanagers.

staticConfigs

array(string)

StaticConfigs a list of statically configured Alertmanagers.

timeout

string

Timeout defines the timeout used when sending alerts.

tlsConfig

TLSConfig

TLSConfig defines the TLS Config to use for alertmanager connection.

AlertmanagerMainConfig

Description

AlertmanagerMainConfig defines configuration related with the main Alertmanager instance.

Property Type Description

enabled

bool

Enabled a boolean flag to enable or disable the main Alertmanager instance under openshift-monitoring default: true

enableUserAlertmanagerConfig

bool

EnableUserAlertManagerConfig boolean flag to enable or disable user-defined namespaces to be selected for AlertmanagerConfig lookup, by default Alertmanager only looks for configuration in the namespace where it was deployed to. This will only work if the UWM Alertmanager instance is not enabled. default: false

logLevel

string

LogLevel defines the log level for Alertmanager. Possible values are: error, warn, info, debug. default: info

nodeSelector

map[string]string

NodeSelector defines which Nodes the Pods are scheduled on.

resources

v1.ResourceRequirements

Resources define resources requests and limits for single Pods.

tolerations

array(v1.Toleration)

Tolerations defines the Pods tolerations.

volumeClaimTemplate

monv1.EmbeddedPersistentVolumeClaim

VolumeClaimTemplate defines persistent storage for Alertmanager. It’s possible to configure storageClass and size of volume.

ClusterMonitoringConfiguration

Description

ClusterMonitoringConfiguration defines configuration that allows users to customise the platform monitoring stack through the cluster-monitoring-config ConfigMap in the openshift-monitoring namespace

Property Type Description

alertmanagerMain

AlertmanagerMainConfig

AlertmanagerMainConfig defines configuration related with the main Alertmanager instance.

enableUserWorkload

bool

UserWorkloadEnabled boolean flag to enable monitoring for user-defined projects.

k8sPrometheusAdapter

K8sPrometheusAdapter

K8sPrometheusAdapter defines configuration related with prometheus-adapter

kubeStateMetrics

KubeStateMetricsConfig

KubeStateMetricsConfig defines configuration related with kube-state-metrics agent

prometheusK8s

PrometheusK8sConfig

PrometheusK8sConfig defines configuration related with prometheus

prometheusOperator

PrometheusOperatorConfig

PrometheusOperatorConfig defines configuration related with prometheus-operator

openshiftStateMetrics

OpenShiftStateMetricsConfig

OpenShiftMetricsConfig defines configuration related with openshift-state-metrics agent

thanosQuerier

ThanosQuerierConfig

ThanosQuerierConfig defines configuration related with the Thanos Querier component

K8sPrometheusAdapter

Description

K8sPrometheusAdapter defines configuration related with Prometheus Adapater

Property Type Description

audit

Audit

Audit defines the audit configuration to be used by the prometheus adapter instance. Possible profile values are: "metadata, request, requestresponse, none". default: metadata

nodeSelector

map[string]string

NodeSelector defines which Nodes the Pods are scheduled on.

tolerations

array(v1.Toleration)

Tolerations defines the Pods tolerations.

KubeStateMetricsConfig

Description

KubeStateMetricsConfig defines configuration related with the kube-state-metrics agent.

Property Type Description

nodeSelector

map[string]string

NodeSelector defines which Nodes the Pods are scheduled on.

tolerations

array(v1.Toleration)

Tolerations defines the Pods tolerations.

OpenShiftStateMetricsConfig

Description

OpenShiftStateMetricsConfig holds configuration related to openshift-state-metrics agent.

Property Type Description

nodeSelector

map[string]string

NodeSelector defines which Nodes the Pods are scheduled on.

tolerations

array(v1.Toleration)

Tolerations defines the Pods tolerations.

PrometheusK8sConfig

Description

PrometheusK8sConfig holds configuration related to the Prometheus component.

Property Type Description

additionalAlertmanagerConfigs

array(AdditionalAlertmanagerConfig)

AlertmanagerConfigs holds configuration about how the Prometheus component should communicate with aditional Alertmanager instances. default: nil

externalLabels

map[string]string

ExternalLabels defines labels to be added to any time series or alerts when communicating with external systems (federation, remote storage, Alertmanager). default: nil

logLevel

string

LogLevel defines the log level for Prometheus. Possible values are: error, warn, info, debug. default: info

nodeSelector

map[string]string

NodeSelector defines which Nodes the Pods are scheduled on.

queryLogFile

string

QueryLogFile specifies the file to which PromQL queries are logged. Suports both just a filename in which case they will be saved to an emptyDir volume at /var/log/prometheus, if a full path is given an emptyDir volume will be mounted at that location. Relative paths not supported, also not supported writing to linux std streams. default: ""

remoteWrite

array(remotewritespec)

RemoteWrite Holds the remote write configuration, everything from url, authorization to relabeling

resources

v1.ResourceRequirements

Resources define resources requests and limits for single Pods.

retention

string

Retention defines the Time duration Prometheus shall retain data for. Must match the regular expression [0-9]+(ms|s|m|h|d|w|y) (milliseconds seconds minutes hours days weeks years). default: 15d

tolerations

array(v1.Toleration)

Tolerations defines the Pods tolerations.

volumeClaimTemplate

monv1.EmbeddedPersistentVolumeClaim

VolumeClaimTemplate defines persistent storage for Prometheus. It’s possible to configure storageClass and size of volume.

PrometheusOperatorConfig

Description

PrometheusOperatorConfig holds configuration related to Prometheus Operator.

Property Type Description

logLevel

string

LogLevel defines the log level for Prometheus Operator. Possible values are: error, warn, info, debug. default: info

nodeSelector

map[string]string

NodeSelector defines which Nodes the Pods are scheduled on.

tolerations

array(v1.Toleration)

Tolerations defines the Pods tolerations.

PrometheusRestrictedConfig

Description

PrometheusRestrictedConfig defines configuration related to the Prometheus component that will monitor user-defined projects.

Property Type Description

additionalAlertmanagerConfigs

array(additionalalertmanagerconfig)

AlertmanagerConfigs holds configuration about how the Prometheus component should communicate with aditional Alertmanager instances. default: nil

enforcedSampleLimit

uint64

EnforcedSampleLimit defines a global limit on the number of scraped samples that will be accepted. This overrides any SampleLimit set per ServiceMonitor or/and PodMonitor. It is meant to be used by admins to enforce the SampleLimit to keep the overall number of samples/series under the desired limit. Note that if SampleLimit is lower that value will be taken instead. default: 0

enforcedTargetLimit

uint64

EnforcedTargetLimit defines a global limit on the number of scraped targets. This overrides any TargetLimit set per ServiceMonitor or/and PodMonitor. It is meant to be used by admins to enforce the TargetLimit to keep the overall number of targets under the desired limit. Note that if TargetLimit is lower, that value will be taken instead, except if either value is zero, in which case the non-zero value will be used. If both values are zero, no limit is enforced. default: 0

externalLabels

map[string]string

ExternalLabels defines labels to be added to any time series or alerts when communicating with external systems (federation, remote storage, Alertmanager). default: nil

logLevel

string

LogLevel defines the log level for Prometheus. Possible values are: error, warn, info, debug. default: info

nodeSelector

map[string]string

NodeSelector defines which Nodes the Pods are scheduled on.

queryLogFile

string

QueryLogFile specifies the file to which PromQL queries are logged. Suports both just a filename in which case they will be saved to an emptyDir volume at /var/log/prometheus, if a full path is given an emptyDir volume will be mounted at that location. Relative paths not supported, also not supported writing to linux std streams. default: ""

remoteWrite

array(remotewritespec)

RemoteWrite Holds the remote write configuration, everything from url, authorization to relabeling

resources

v1.ResourceRequirements

Resources define resources requests and limits for single Pods.

retention

string

Retention defines the Time duration Prometheus shall retain data for. Must match the regular expression [0-9]+(ms|s|m|h|d|w|y) (milliseconds seconds minutes hours days weeks years). default: 15d

tolerations

array(v1.Toleration)

Tolerations defines the Pods tolerations.

volumeClaimTemplate

monv1.EmbeddedPersistentVolumeClaim

VolumeClaimTemplate defines persistent storage for Prometheus. It’s possible to configure storageClass and size of volume.

RemoteWriteSpec

Description

RemoteWriteSpec is an almost identical copy of monv1.RemoteWriteSpec but with the BearerToken field removed. In the future other fields might be added here.

Required

  • url

Property Type Description

authorization

monv1.SafeAuthorization

Authorization defines the authorization section for remote write

basicAuth

monv1.BasicAuth

BasicAuth defines configuration for basic authentication for the URL.

bearerTokenFile

string

BearerTokenFile defines the file where the bearer token for remote write resides.

headers

map[string]string

Headers custom HTTP headers to be sent along with each remote write request. Be aware that headers that are set by Prometheus itself can’t be overwritten.

metadataConfig

monv1.MetadataConfig

MetadataConfig configures the sending of series metadata to remote storage.

name

string

Name defines the name of the remote write queue, must be unique if specified. The name is used in metrics and logging in order to differentiate queues.

oauth2

monv1.OAuth2

OAuth2 configures OAuth2 authentication for remote write.

proxyUrl

string

ProxyURL defines an optional proxy URL

queueConfig

monv1.QueueConfig

QueueConfig allows tuning of the remote write queue parameters.

remoteTimeout

string

RemoteTimeout defines the timeout for requests to the remote write endpoint.

sigv4

monv1.Sigv4

Sigv4 allows to configures AWS’s Signature Verification 4

tlsConfig

monv1.SafeTLSConfig

TLSConfig defines the TLS configuration to use for remote write.

url

string

URL defines the URL of the endpoint to send samples to.

writeRelabelConfigs

array(monv1.RelabelConfig)

WriteRelabelConfigs defines the list of remote write relabel configurations.

TLSConfig

Description

TLSConfig configures the options for TLS connections.

Required

  • insecureSkipVerify

Property Type Description

ca

v1.SecretKeySelector

CA defines the CA cert in the Prometheus container to use for the targets.

cert

v1.SecretKeySelector

Cert defines the client cert in the Prometheus container to use for the targets.

key

v1.SecretKeySelector

Key defines the client key in the Prometheus container to use for the targets.

serverName

string

ServerName used to verify the hostname for the targets.

insecureSkipVerify

bool

InsecureSkipVerify disable target certificate validation.

ThanosQuerierConfig

Description

ThanosQuerierConfig holds configuration related to Thanos Querier component.

Property Type Description

enableRequestLogging

bool

EnableRequestLogging boolean flag to enable or disable request logging default: false

logLevel

string

LogLevel defines the log level for Thanos Querier. Possible values are: error, warn, info, debug. default: info

nodeSelector

map[string]string

NodeSelector defines which Nodes the Pods are scheduled on.

resources

v1.ResourceRequirements

Resources define resources requests and limits for single Pods.

tolerations

array(v1.Toleration)

Tolerations defines the Pods tolerations.

ThanosRulerConfig

Description

ThanosRulerConfig defines configuration for the Thanos Ruler instance for user-defined projects.

Property Type Description

additionalAlertmanagerConfigs

array(additionalalertmanagerconfig)

AlertmanagerConfigs holds configuration about how the Thanos Ruler component should communicate with aditional Alertmanager instances. default: nil

logLevel

string

LogLevel defines the log level for Thanos Ruler. Possible values are: error, warn, info, debug. default: info

nodeSelector

map[string]string

NodeSelector defines which Nodes the Pods are scheduled on.

resources

v1.ResourceRequirements

Resources define resources requests and limits for single Pods.

retention

string

Retention defines the time duration Thanos Ruler shall retain data for. Must match the regular expression [0-9]+(ms|s|m|h|d|w|y) (milliseconds seconds minutes hours days weeks years). default: 15d

tolerations

array(v1.Toleration)

Tolerations defines the Pods tolerations.

volumeClaimTemplate

monv1.EmbeddedPersistentVolumeClaim

VolumeClaimTemplate defines persistent storage for Thanos Ruler. It’s possible to configure storageClass and size of volume.

UserWorkloadConfiguration

Description

UserWorkloadConfiguration defines configuration that allows users to customise the monitoring stack responsible for user-defined projects through the user-workload-monitoring-config ConfigMap in the openshift-user-workload-monitoring namespace

Property Type Description

prometheus

PrometheusRestrictedConfig

Prometheus defines configuration for Prometheus component.

prometheusOperator

PrometheusOperatorConfig

PrometheusOperator defines configuration for prometheus-operator component.

thanosRuler

ThanosRulerConfig

ThanosRuler defines configuration for the Thanos Ruler component