Parts of OKD cluster monitoring are configurable. The API is accessible by setting parameters defined in various config maps.
To configure monitoring components, edit the ConfigMap object named cluster-monitoring-config in the openshift-monitoring namespace. These configurations are defined by ClusterMonitoringConfiguration.
To configure monitoring components that monitor user-defined projects, edit the ConfigMap object named user-workload-monitoring-config in the openshift-user-workload-monitoring namespace. These configurations are defined by UserWorkloadConfiguration.
The configuration file is always defined under the config.yaml key in the config map data.
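As a minimal sketch of the layout described above, the following manifest shows where the config.yaml key sits inside the cluster-monitoring-config object. The single setting under it is only a placeholder chosen for illustration.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  # Every monitoring setting is nested under this one key as a YAML string.
  config.yaml: |
    # Placeholder setting for illustration only.
    enableUserWorkload: true
```

The user-workload-monitoring-config object follows the same shape, with its own name and namespace.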
The AdditionalAlertmanagerConfig resource defines settings for how a component communicates with additional Alertmanager instances.
Required: apiVersion
Appears in: PrometheusK8sConfig, PrometheusRestrictedConfig, ThanosRulerConfig
Property | Type | Description |
---|---|---|
apiVersion | string | Defines the API version of Alertmanager. Possible values are |
bearerToken | *v1.SecretKeySelector | Defines the secret key reference containing the bearer token to use when authenticating to Alertmanager. |
pathPrefix | string | Defines the path prefix to add in front of the push endpoint path. |
scheme | string | Defines the URL scheme to use when communicating with Alertmanager instances. Possible values are |
staticConfigs | []string | A list of statically configured Alertmanager endpoints in the form of |
timeout | *string | Defines the timeout value used when sending alerts. |
tlsConfig | | Defines the TLS settings to use for Alertmanager connections. |
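The properties in this table might be combined as in the following sketch, which assumes the settings are placed under prometheusK8s in the cluster-monitoring-config config map. The endpoint, the secret name, and the apiVersion, scheme, and timeout values are hypothetical and chosen only for illustration.

```yaml
data:
  config.yaml: |
    prometheusK8s:
      additionalAlertmanagerConfigs:
      - apiVersion: v2                       # assumed value
        scheme: https                        # assumed value
        pathPrefix: /
        timeout: "30s"                       # assumed value
        bearerToken:
          name: alertmanager-bearer-token    # hypothetical secret name
          key: token                         # hypothetical key in that secret
        staticConfigs:
        - alertmanager.example.com:9093      # hypothetical endpoint
```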
The AlertmanagerMainConfig resource defines settings for the Alertmanager component in the openshift-monitoring namespace.
Appears in: ClusterMonitoringConfiguration
Property | Type | Description |
---|---|---|
enabled | *bool | A Boolean flag that enables or disables the main Alertmanager instance in the openshift-monitoring namespace. |
enableUserAlertmanagerConfig | bool | A Boolean flag that enables or disables user-defined namespaces to be selected for |
logLevel | string | Defines the log level setting for Alertmanager. The possible values are: |
nodeSelector | map[string]string | Defines the nodes on which the pods are scheduled. |
resources | *v1.ResourceRequirements | Defines resource requests and limits for the Alertmanager container. |
tolerations | []v1.Toleration | Defines tolerations for the pods. |
topologySpreadConstraints | []v1.TopologySpreadConstraint | Defines a pod’s topology spread constraints. |
volumeClaimTemplate | *monv1.EmbeddedPersistentVolumeClaim | Defines persistent storage for Alertmanager. Use this setting to configure the persistent volume claim, including storage class, volume size, and name. |
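A hedged sketch of how the resources and volumeClaimTemplate properties could look for the main Alertmanager instance. The claim name, storage class, and all sizes are assumptions for illustration, not recommended values.

```yaml
data:
  config.yaml: |
    alertmanagerMain:
      resources:
        requests:
          cpu: 50m            # illustrative request
          memory: 100Mi       # illustrative request
        limits:
          memory: 500Mi       # illustrative limit
      volumeClaimTemplate:
        metadata:
          name: alertmanager-data        # hypothetical claim name
        spec:
          storageClassName: fast-ssd     # hypothetical storage class
          resources:
            requests:
              storage: 10Gi              # illustrative volume size
```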
The AlertmanagerUserWorkloadConfig resource defines the settings for the Alertmanager instance used for user-defined projects.
Appears in: UserWorkloadConfiguration
Property | Type | Description |
---|---|---|
enabled | bool | A Boolean flag that enables or disables a dedicated instance of Alertmanager for user-defined alerts in the openshift-user-workload-monitoring namespace. |
enableAlertmanagerConfig | bool | A Boolean flag to enable or disable user-defined namespaces to be selected for |
logLevel | string | Defines the log level setting for Alertmanager for user workload monitoring. The possible values are |
resources | *v1.ResourceRequirements | Defines resource requests and limits for the Alertmanager container. |
nodeSelector | map[string]string | Defines the nodes on which the pods are scheduled. |
tolerations | []v1.Toleration | Defines tolerations for the pods. |
volumeClaimTemplate | *monv1.EmbeddedPersistentVolumeClaim | Defines persistent storage for Alertmanager. Use this setting to configure the persistent volume claim, including storage class, volume size, and name. |
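The following sketch shows one way the two Boolean flags might be set in the user-workload-monitoring-config config map. It assumes that monitoring for user-defined projects is already enabled.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: user-workload-monitoring-config
  namespace: openshift-user-workload-monitoring
data:
  config.yaml: |
    alertmanager:
      # Run a dedicated Alertmanager instance for user-defined alerts.
      enabled: true
      enableAlertmanagerConfig: true
```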
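The ClusterMonitoringConfiguration resource defines settings that customize the default platform monitoring stack through the cluster-monitoring-config config map in the openshift-monitoring namespace.
Property | Type | Description |
---|---|---|
alertmanagerMain | AlertmanagerMainConfig | Defines settings for the Alertmanager component in the openshift-monitoring namespace. |
enableUserWorkload | *bool | A Boolean flag that enables monitoring for user-defined projects. |
k8sPrometheusAdapter | K8sPrometheusAdapter | Defines settings for the Prometheus Adapter component. |
kubeStateMetrics | KubeStateMetricsConfig | Defines settings for the kube-state-metrics agent. |
prometheusK8s | PrometheusK8sConfig | Defines settings for the Prometheus component. |
prometheusOperator | PrometheusOperatorConfig | Defines settings for the Prometheus Operator component. |
openshiftStateMetrics | OpenShiftStateMetricsConfig | Defines settings for the openshift-state-metrics agent. |
telemeterClient | | Defines settings for the Telemeter Client component. |
thanosQuerier | ThanosQuerierConfig | Defines settings for the Thanos Querier component. |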
You can use the DedicatedServiceMonitors resource to configure dedicated Service Monitors for the Prometheus Adapter.
Appears in: K8sPrometheusAdapter
Property | Type | Description |
---|---|---|
enabled | bool | When |
The K8sPrometheusAdapter resource defines settings for the Prometheus Adapter component.
Appears in: ClusterMonitoringConfiguration
Property | Type | Description |
---|---|---|
audit | *Audit | Defines the audit configuration used by the Prometheus Adapter instance. Possible profile values are: |
nodeSelector | map[string]string | Defines the nodes on which the pods are scheduled. |
tolerations | []v1.Toleration | Defines tolerations for the pods. |
dedicatedServiceMonitors | DedicatedServiceMonitors | Defines dedicated service monitors. |
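As an illustration of the audit and dedicatedServiceMonitors properties, the sketch below assumes that the audit profile is set through a profile key and that request is an accepted profile value. Both are assumptions, because the table above does not list the values.

```yaml
data:
  config.yaml: |
    k8sPrometheusAdapter:
      audit:
        profile: request     # assumed profile value
      dedicatedServiceMonitors:
        enabled: true
```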
The KubeStateMetricsConfig resource defines settings for the kube-state-metrics agent.
Appears in: ClusterMonitoringConfiguration
Property | Type | Description |
---|---|---|
nodeSelector | map[string]string | Defines the nodes on which the pods are scheduled. |
tolerations | []v1.Toleration | Defines tolerations for the pods. |
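The nodeSelector and tolerations properties recur across most components and take the same form each time. A sketch for kube-state-metrics, assuming nodes that carry a hypothetical infra role label and a matching taint:

```yaml
data:
  config.yaml: |
    kubeStateMetrics:
      nodeSelector:
        node-role.kubernetes.io/infra: ""    # hypothetical node label
      tolerations:
      - key: node-role.kubernetes.io/infra   # hypothetical taint key
        operator: Exists
        effect: NoSchedule
```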
The OpenShiftStateMetricsConfig resource defines settings for the openshift-state-metrics agent.
Appears in: ClusterMonitoringConfiguration
Property | Type | Description |
---|---|---|
nodeSelector | map[string]string | Defines the nodes on which the pods are scheduled. |
tolerations | []v1.Toleration | Defines tolerations for the pods. |
The PrometheusK8sConfig resource defines settings for the Prometheus component.
Appears in: ClusterMonitoringConfiguration
Property | Type | Description |
---|---|---|
additionalAlertmanagerConfigs | []AdditionalAlertmanagerConfig | Configures additional Alertmanager instances that receive alerts from the Prometheus component. By default, no additional Alertmanager instances are configured. |
enforcedBodySizeLimit | string | Enforces a body size limit for Prometheus scraped metrics. If a scraped target’s body response is larger than the limit, the scrape will fail. The following values are valid: an empty value to specify no limit, a numeric value in Prometheus size format (such as |
externalLabels | map[string]string | Defines labels to be added to any time series or alerts when communicating with external systems such as federation, remote storage, and Alertmanager. By default, no labels are added. |
logLevel | string | Defines the log level setting for Prometheus. The possible values are: |
nodeSelector | map[string]string | Defines the nodes on which the pods are scheduled. |
queryLogFile | string | Specifies the file to which PromQL queries are logged. This setting can be either a filename, in which case the queries are saved to an |
remoteWrite | | Defines the remote write configuration, including URL, authentication, and relabeling settings. |
resources | *v1.ResourceRequirements | Defines resource requests and limits for the Prometheus container. |
retention | string | Defines the duration for which Prometheus retains data. This definition must be specified using the following regular expression pattern: |
retentionSize | string | Defines the maximum amount of disk space used by data blocks plus the write-ahead log (WAL). Supported values are |
tolerations | []v1.Toleration | Defines tolerations for the pods. |
topologySpreadConstraints | []v1.TopologySpreadConstraint | Defines the pod’s topology spread constraints. |
volumeClaimTemplate | *monv1.EmbeddedPersistentVolumeClaim | Defines persistent storage for Prometheus. Use this setting to configure the persistent volume claim, including storage class, volume size, and name. |
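A sketch of retention, external labels, and topology spread settings for the platform Prometheus. The retention values, the label, and the pod label selector are assumptions chosen for illustration.

```yaml
data:
  config.yaml: |
    prometheusK8s:
      retention: 7d              # illustrative duration
      retentionSize: 40GiB       # illustrative size
      externalLabels:
        cluster: prod-east       # hypothetical label
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app.kubernetes.io/name: prometheus   # assumed pod label
```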
The PrometheusOperatorConfig resource defines settings for the Prometheus Operator component.
Appears in: ClusterMonitoringConfiguration, UserWorkloadConfiguration
Property | Type | Description |
---|---|---|
logLevel | string | Defines the log level settings for Prometheus Operator. The possible values are |
nodeSelector | map[string]string | Defines the nodes on which the pods are scheduled. |
tolerations | []v1.Toleration | Defines tolerations for the pods. |
The PrometheusRestrictedConfig resource defines the settings for the Prometheus component that monitors user-defined projects.
Appears in: UserWorkloadConfiguration
Property | Type | Description |
---|---|---|
additionalAlertmanagerConfigs | []AdditionalAlertmanagerConfig | Configures additional Alertmanager instances that receive alerts from the Prometheus component. By default, no additional Alertmanager instances are configured. |
enforcedLabelLimit | *uint64 | Specifies a per-scrape limit on the number of labels accepted for a sample. If the number of labels exceeds this limit after metric relabeling, the entire scrape is treated as failed. The default value is |
enforcedLabelNameLengthLimit | *uint64 | Specifies a per-scrape limit on the length of a label name for a sample. If the length of a label name exceeds this limit after metric relabeling, the entire scrape is treated as failed. The default value is |
enforcedLabelValueLengthLimit | *uint64 | Specifies a per-scrape limit on the length of a label value for a sample. If the length of a label value exceeds this limit after metric relabeling, the entire scrape is treated as failed. The default value is |
enforcedSampleLimit | *uint64 | Specifies a global limit on the number of scraped samples that will be accepted. This setting overrides the |
enforcedTargetLimit | *uint64 | Specifies a global limit on the number of scraped targets. This setting overrides the |
externalLabels | map[string]string | Defines labels to be added to any time series or alerts when communicating with external systems such as federation, remote storage, and Alertmanager. By default, no labels are added. |
logLevel | string | Defines the log level setting for Prometheus. The possible values are |
nodeSelector | map[string]string | Defines the nodes on which the pods are scheduled. |
queryLogFile | string | Specifies the file to which PromQL queries are logged. This setting can be either a filename, in which case the queries are saved to an |
remoteWrite | | Defines the remote write configuration, including URL, authentication, and relabeling settings. |
resources | *v1.ResourceRequirements | Defines resource requests and limits for the Prometheus container. |
retention | string | Defines the duration for which Prometheus retains data. This definition must be specified using the following regular expression pattern: |
retentionSize | string | Defines the maximum amount of disk space used by data blocks plus the write-ahead log (WAL). Supported values are |
tolerations | []v1.Toleration | Defines tolerations for the pods. |
volumeClaimTemplate | *monv1.EmbeddedPersistentVolumeClaim | Defines persistent storage for Prometheus. Use this setting to configure the storage class and size of a volume. |
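The scrape limits in this table might be set together as in the following sketch for the user-workload-monitoring-config config map. Every number is an arbitrary illustration, not a recommended or default value.

```yaml
data:
  config.yaml: |
    prometheus:
      enforcedSampleLimit: 50000           # illustrative limit on scraped samples
      enforcedTargetLimit: 100             # illustrative limit on scraped targets
      enforcedLabelLimit: 500              # illustrative per-scrape label count
      enforcedLabelNameLengthLimit: 50     # illustrative label name length
      enforcedLabelValueLengthLimit: 600   # illustrative label value length
```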
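The resource used by the remoteWrite property defines the settings for remote write storage.
Required: url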
Appears in: PrometheusK8sConfig, PrometheusRestrictedConfig
Property | Type | Description |
---|---|---|
authorization | *monv1.SafeAuthorization | Defines the authorization settings for remote write storage. |
basicAuth | *monv1.BasicAuth | Defines basic authentication settings for the remote write endpoint URL. |
bearerTokenFile | string | Defines the file that contains the bearer token for the remote write endpoint. However, because you cannot mount secrets in a pod, in practice you can only reference the token of the service account. |
headers | map[string]string | Specifies the custom HTTP headers to be sent along with each remote write request. Headers set by Prometheus cannot be overwritten. |
metadataConfig | *monv1.MetadataConfig | Defines settings for sending series metadata to remote write storage. |
name | string | Defines the name of the remote write queue. This name is used in metrics and logging to differentiate queues. If specified, this name must be unique. |
oauth2 | *monv1.OAuth2 | Defines OAuth2 authentication settings for the remote write endpoint. |
proxyUrl | string | Defines an optional proxy URL. |
queueConfig | *monv1.QueueConfig | Allows tuning configuration for remote write queue parameters. |
remoteTimeout | string | Defines the timeout value for requests to the remote write endpoint. |
sigv4 | *monv1.Sigv4 | Defines AWS Signature Version 4 authentication settings. |
tlsConfig | *monv1.SafeTLSConfig | Defines TLS authentication settings for the remote write endpoint. |
url | string | Defines the URL of the remote write endpoint to which samples will be sent. |
writeRelabelConfigs | []monv1.RelabelConfig | Defines the list of remote write relabel configurations. |
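A sketch of a single remote write entry that uses basic authentication and one relabel rule, assuming it is placed under prometheusK8s. The endpoint URL, queue name, secret name, and metric names are hypothetical.

```yaml
data:
  config.yaml: |
    prometheusK8s:
      remoteWrite:
      - url: https://remote-write.example.com/api/v1/write   # hypothetical endpoint
        name: example-queue                                   # hypothetical queue name
        remoteTimeout: 30s                                    # illustrative timeout
        basicAuth:
          username:
            name: remote-write-credentials   # hypothetical secret name
            key: username
          password:
            name: remote-write-credentials
            key: password
        writeRelabelConfigs:
        # Forward only the listed series to the remote endpoint.
        - sourceLabels: [__name__]
          regex: 'up|node_cpu_seconds_total'
          action: keep
```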
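The resource used by the telemeterClient property defines settings for the Telemeter Client component.
Required: nodeSelector, tolerations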
Appears in: ClusterMonitoringConfiguration
Property | Type | Description |
---|---|---|
nodeSelector | map[string]string | Defines the nodes on which the pods are scheduled. |
tolerations | []v1.Toleration | Defines tolerations for the pods. |
The ThanosQuerierConfig resource defines settings for the Thanos Querier component.
Appears in: ClusterMonitoringConfiguration
Property | Type | Description |
---|---|---|
enableRequestLogging | bool | A Boolean flag that enables or disables request logging. The default value is |
logLevel | string | Defines the log level setting for Thanos Querier. The possible values are |
nodeSelector | map[string]string | Defines the nodes on which the pods are scheduled. |
resources | *v1.ResourceRequirements | Defines resource requests and limits for the Thanos Querier container. |
tolerations | []v1.Toleration | Defines tolerations for the pods. |
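For example, request logging for Thanos Querier might be turned on as in the sketch below. The debug log level is an assumed value, because the table above does not list the accepted levels.

```yaml
data:
  config.yaml: |
    thanosQuerier:
      enableRequestLogging: true
      logLevel: debug            # assumed log level value
```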
The ThanosRulerConfig resource defines configuration for the Thanos Ruler instance for user-defined projects.
Appears in: UserWorkloadConfiguration
Property | Type | Description |
---|---|---|
additionalAlertmanagerConfigs | []AdditionalAlertmanagerConfig | Configures how the Thanos Ruler component communicates with additional Alertmanager instances. The default value is |
logLevel | string | Defines the log level setting for Thanos Ruler. The possible values are |
nodeSelector | map[string]string | Defines the nodes on which the pods are scheduled. |
resources | *v1.ResourceRequirements | Defines resource requests and limits for the Thanos Ruler container. |
retention | string | Defines the duration for which Prometheus retains data. This definition must be specified using the following regular expression pattern: |
tolerations | []v1.Toleration | Defines tolerations for the pods. |
topologySpreadConstraints | []v1.TopologySpreadConstraint | Defines topology spread constraints for the pods. |
volumeClaimTemplate | *monv1.EmbeddedPersistentVolumeClaim | Defines persistent storage for Thanos Ruler. Use this setting to configure the storage class and size of a volume. |
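The resource used by the tlsConfig property defines the TLS settings for Alertmanager connections.
Required: insecureSkipVerify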
Appears in: AdditionalAlertmanagerConfig
Property | Type | Description |
---|---|---|
ca | *v1.SecretKeySelector | Defines the secret key reference containing the Certificate Authority (CA) to use for the remote host. |
cert | *v1.SecretKeySelector | Defines the secret key reference containing the public certificate to use for the remote host. |
key | *v1.SecretKeySelector | Defines the secret key reference containing the private key to use for the remote host. |
serverName | string | Used to verify the hostname on the returned certificate. |
insecureSkipVerify | bool | When set to |
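The TLS properties above could be combined as in this sketch, shown under thanosRuler in the user-workload-monitoring-config config map. The secret name, its keys, the endpoint, and the scheme value are hypothetical.

```yaml
data:
  config.yaml: |
    thanosRuler:
      additionalAlertmanagerConfigs:
      - scheme: https                        # assumed value
        staticConfigs:
        - alertmanager.example.com:9093      # hypothetical endpoint
        tlsConfig:
          ca:
            name: alertmanager-tls           # hypothetical secret name
            key: ca.crt
          cert:
            name: alertmanager-tls
            key: tls.crt
          key:
            name: alertmanager-tls
            key: tls.key
          serverName: alertmanager.example.com
          insecureSkipVerify: false
```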
The UserWorkloadConfiguration resource defines the settings responsible for user-defined projects in the user-workload-monitoring-config config map in the openshift-user-workload-monitoring namespace. You can only enable UserWorkloadConfiguration after you have set enableUserWorkload to true in the cluster-monitoring-config config map under the openshift-monitoring namespace.
Property | Type | Description |
---|---|---|
alertmanager | AlertmanagerUserWorkloadConfig | Defines the settings for the Alertmanager component in user workload monitoring. |
prometheus | PrometheusRestrictedConfig | Defines the settings for the Prometheus component in user workload monitoring. |
prometheusOperator | PrometheusOperatorConfig | Defines the settings for the Prometheus Operator component in user workload monitoring. |
thanosRuler | ThanosRulerConfig | Defines the settings for the Thanos Ruler component in user workload monitoring. |