
Understanding metrics

In OKD 4.12, cluster components are monitored by scraping metrics exposed through service endpoints. You can also configure metrics collection for user-defined projects.

You can define the metrics that you want to provide for your own workloads by using Prometheus client libraries at the application level.

In OKD, metrics are exposed through an HTTP service endpoint under the /metrics canonical name. You can list all available metrics for a service by running a curl query against http://<endpoint>/metrics. For instance, you can expose a route to the prometheus-example-app example service and then run the following to view all of its available metrics:

$ curl http://<example_app_endpoint>/metrics
Example output
# HELP http_requests_total Count of all HTTP requests
# TYPE http_requests_total counter
http_requests_total{code="200",method="get"} 4
http_requests_total{code="404",method="get"} 2
# HELP version Version information about this binary
# TYPE version gauge
version{version="v0.1.0"} 1
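
In this output, lines that start with # carry HELP and TYPE metadata, and every other line is a sample. As a rough illustration, the following sketch lists the distinct metric names in such output. The helper name metric_names is hypothetical, and the input is the example output shown above rather than a live endpoint.

```shell
#!/bin/sh
# Hypothetical helper: print the distinct metric names found in
# Prometheus text-format output read from standard input.
metric_names() {
  # Drop "# HELP" / "# TYPE" comment lines and blank lines, then cut each
  # sample at the first "{" or space so that only the metric name remains.
  grep -v -e '^#' -e '^$' | sed 's/[{ ].*$//' | sort -u
}

# Sample input copied from the example output above.
metric_names <<'EOF'
# HELP http_requests_total Count of all HTTP requests
# TYPE http_requests_total counter
http_requests_total{code="200",method="get"} 4
http_requests_total{code="404",method="get"} 2
# HELP version Version information about this binary
# TYPE version gauge
version{version="v0.1.0"} 1
EOF
```

Against a running service, the same helper could be fed from curl, for example curl -s http://<example_app_endpoint>/metrics | metric_names.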

Setting up metrics collection for user-defined projects

You can create a ServiceMonitor resource to scrape metrics from a service endpoint in a user-defined project. This assumes that your application uses a Prometheus client library to expose metrics to the /metrics canonical name.

This section describes how to deploy a sample service in a user-defined project and then create a ServiceMonitor resource that defines how that service should be monitored.

Deploying a sample service

To test monitoring of a service in a user-defined project, you can deploy a sample service.

  1. Create a YAML file for the service configuration. In this example, it is called prometheus-example-app.yaml.

  2. Add the following deployment and service configuration details to the file:

    apiVersion: v1
    kind: Namespace
    metadata:
      name: ns1
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        app: prometheus-example-app
      name: prometheus-example-app
      namespace: ns1
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: prometheus-example-app
      template:
        metadata:
          labels:
            app: prometheus-example-app
        spec:
          containers:
          - image: ghcr.io/rhobs/prometheus-example-app:0.4.2
            imagePullPolicy: IfNotPresent
            name: prometheus-example-app
    ---
    apiVersion: v1
    kind: Service
    metadata:
      labels:
        app: prometheus-example-app
      name: prometheus-example-app
      namespace: ns1
    spec:
      ports:
      - port: 8080
        protocol: TCP
        targetPort: 8080
        name: web
      selector:
        app: prometheus-example-app
      type: ClusterIP

    This configuration deploys a service named prometheus-example-app in the user-defined ns1 project. This service exposes the custom version metric.

  3. Apply the configuration to the cluster:

    $ oc apply -f prometheus-example-app.yaml

    It takes some time to deploy the service.

  4. You can check that the pod is running:

    $ oc -n ns1 get pod
    Example output
    NAME                                      READY     STATUS    RESTARTS   AGE
    prometheus-example-app-7857545cb7-sbgwq   1/1       Running   0          81m
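
Instead of polling the pod list by hand, you could wait for the rollout to finish. The following is a minimal sketch under stated assumptions: the helper name wait_for_app and the 120-second timeout are illustrative choices, not part of the documented procedure.

```shell
#!/bin/sh
# Hypothetical helper: block until the sample deployment has rolled out,
# then list its pods. Arguments: namespace, deployment name.
wait_for_app() {
  ns="$1"
  name="$2"
  # "oc rollout status" returns non-zero if the rollout does not finish
  # within the timeout; the 120s value is an arbitrary assumption.
  oc -n "$ns" rollout status "deployment/$name" --timeout=120s || return 1
  # The sample deployment labels its pods with app=<deployment name>.
  oc -n "$ns" get pod -l "app=$name"
}

# Usage (requires a logged-in oc session):
#   wait_for_app ns1 prometheus-example-app
```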

Specifying how a service is monitored

To use the metrics exposed by your service, you must configure OKD monitoring to scrape metrics from the /metrics endpoint. You can do this using a ServiceMonitor custom resource definition (CRD) that specifies how a service should be monitored, or a PodMonitor CRD that specifies how a pod should be monitored. The former requires a Service object, while the latter does not, allowing Prometheus to directly scrape metrics from the metrics endpoint exposed by a pod.

This procedure shows you how to create a ServiceMonitor resource for a service in a user-defined project.

  • You have access to the cluster as a user with the cluster-admin cluster role or the monitoring-edit cluster role.

  • You have enabled monitoring for user-defined projects.

  • For this example, you have deployed the prometheus-example-app sample service in the ns1 project.

    The prometheus-example-app sample service does not support TLS authentication.

  1. Create a new YAML configuration file named example-app-service-monitor.yaml.

  2. Add a ServiceMonitor resource to the YAML file. The following example creates a service monitor named prometheus-example-monitor to scrape metrics exposed by the prometheus-example-app service in the ns1 namespace:

    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      name: prometheus-example-monitor
      namespace: ns1 (1)
    spec:
      endpoints:
      - interval: 30s
        port: web (2)
        scheme: http
      selector: (3)
        matchLabels:
          app: prometheus-example-app
    1 Specify a user-defined namespace where your service runs.
    2 Specify endpoint ports to be scraped by Prometheus.
    3 Configure a selector to match your service based on its metadata labels.

    A ServiceMonitor resource in a user-defined namespace can only discover services in the same namespace. That is, the namespaceSelector field of the ServiceMonitor resource is always ignored.

  3. Apply the configuration to the cluster:

    $ oc apply -f example-app-service-monitor.yaml

    It takes some time to deploy the ServiceMonitor resource.

  4. Verify that the ServiceMonitor resource exists in the ns1 project:

    $ oc -n ns1 get servicemonitor
    Example output
    NAME                         AGE
    prometheus-example-monitor   81m
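
The presence of the ServiceMonitor resource does not by itself show that samples are being scraped. One way to check, sketched below, is to run an instant query for the sample metric through the Thanos Querier route. This assumes that your user is permitted to query that route and that scraped samples carry a namespace label. The helper name query_metric is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: run an instant PromQL query against the Thanos
# Querier route and print the raw JSON response.
query_metric() {
  promql="$1"
  host="$(oc get routes -n openshift-monitoring thanos-querier \
    -o jsonpath='{.status.ingress[0].host}')" || return 1
  # -G sends the URL-encoded query as a GET parameter; -k skips TLS
  # certificate verification, as in the metadata example in this document.
  curl -k -G -s \
    -H "Authorization: Bearer $(oc whoami -t)" \
    --data-urlencode "query=$promql" \
    "https://$host/api/v1/query"
}

# Usage (requires a logged-in oc session):
#   query_metric 'version{namespace="ns1"}'
```

An empty result list in the response would suggest that the target has not been scraped yet or that the selector does not match the service.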

Example service endpoint authentication settings

You can configure authentication for service endpoints for user-defined project monitoring by using ServiceMonitor and PodMonitor custom resource definitions (CRDs).

The following samples show different authentication settings for a ServiceMonitor resource. Each sample shows how to configure a corresponding Secret object that contains authentication credentials and other relevant settings.

Sample YAML authentication with a bearer token

The following sample shows bearer token settings for a Secret object named example-bearer-auth in the ns1 namespace:

Example bearer token secret
apiVersion: v1
kind: Secret
metadata:
  name: example-bearer-auth
  namespace: ns1
stringData:
  token: <authentication_token> (1)
1 Specify an authentication token.
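
As an alternative to writing the manifest by hand, a Secret with the same name and key could be generated from the command line. The following is a sketch only, and the helper name render_bearer_secret is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: render a Secret manifest that holds a bearer token
# under the "token" key. Arguments: namespace, secret name, token value.
render_bearer_secret() {
  # --dry-run=client prints the manifest instead of creating the object;
  # drop it (and -o yaml) to create the Secret directly.
  oc -n "$1" create secret generic "$2" \
    --from-literal=token="$3" \
    --dry-run=client -o yaml
}

# Usage:
#   render_bearer_secret ns1 example-bearer-auth "<authentication_token>" | oc apply -f -
```

The generated manifest stores the value base64-encoded under data rather than as plain text under stringData. Both forms produce the same Secret once applied.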

The following sample shows bearer token authentication settings for a ServiceMonitor CRD. The example uses a Secret object named example-bearer-auth:

Example bearer token authentication settings
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: prometheus-example-monitor
  namespace: ns1
spec:
  endpoints:
  - authorization:
      credentials:
        key: token (1)
        name: example-bearer-auth (2)
    port: web
  selector:
    matchLabels:
      app: prometheus-example-app
1 The key that contains the authentication token in the specified Secret object.
2 The name of the Secret object that contains the authentication credentials.

Do not use bearerTokenFile to configure a bearer token. If you use the bearerTokenFile configuration, the ServiceMonitor resource is rejected.

Sample YAML for Basic authentication

The following sample shows Basic authentication settings for a Secret object named example-basic-auth in the ns1 namespace:

Example Basic authentication secret
apiVersion: v1
kind: Secret
metadata:
  name: example-basic-auth
  namespace: ns1
stringData:
  user: <basic_username> (1)
  password: <basic_password> (2)
1 Specify a username for authentication.
2 Specify a password for authentication.

The following sample shows Basic authentication settings for a ServiceMonitor CRD. The example uses a Secret object named example-basic-auth:

Example Basic authentication settings
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: prometheus-example-monitor
  namespace: ns1
spec:
  endpoints:
  - basicAuth:
      username:
        key: user (1)
        name: example-basic-auth (2)
      password:
        key: password (3)
        name: example-basic-auth (2)
    port: web
  selector:
    matchLabels:
      app: prometheus-example-app
1 The key that contains the username in the specified Secret object.
2 The name of the Secret object that contains the Basic authentication credentials.
3 The key that contains the password in the specified Secret object.

Sample YAML authentication with OAuth 2.0

The following sample shows OAuth 2.0 settings for a Secret object named example-oauth2 in the ns1 namespace:

Example OAuth 2.0 secret
apiVersion: v1
kind: Secret
metadata:
  name: example-oauth2
  namespace: ns1
stringData:
  id: <oauth2_id> (1)
  secret: <oauth2_secret> (2)
1 Specify an OAuth 2.0 ID.
2 Specify an OAuth 2.0 secret.

The following sample shows OAuth 2.0 authentication settings for a ServiceMonitor CRD. The example uses a Secret object named example-oauth2:

Example OAuth 2.0 authentication settings
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: prometheus-example-monitor
  namespace: ns1
spec:
  endpoints:
  - oauth2:
      clientId:
        secret:
          key: id (1)
          name: example-oauth2 (2)
      clientSecret:
        key: secret (3)
        name: example-oauth2 (2)
      tokenUrl: https://example.com/oauth2/token (4)
    port: web
  selector:
    matchLabels:
      app: prometheus-example-app
1 The key that contains the OAuth 2.0 ID in the specified Secret object.
2 The name of the Secret object that contains the OAuth 2.0 credentials.
3 The key that contains the OAuth 2.0 secret in the specified Secret object.
4 The URL used to fetch a token with the specified clientId and clientSecret.

Viewing a list of available metrics

As a cluster administrator or as a user with view permissions for all projects, you can view a list of metrics available in a cluster and output the list in JSON format.

  • You are a cluster administrator, or you have access to the cluster as a user with the cluster-monitoring-view cluster role.

  • You have installed the OKD CLI (oc).

  • You have obtained the OKD API route for Thanos Querier.

  • You are able to get a bearer token by using the oc whoami -t command.

    You can only use bearer token authentication to access the Thanos Querier API route.

  1. If you have not obtained the OKD API route for Thanos Querier, run the following command:

    $ oc get routes -n openshift-monitoring thanos-querier -o jsonpath='{.status.ingress[0].host}'

  2. Retrieve a list of metrics in JSON format from the Thanos Querier API route by running the following command. This command uses oc to authenticate with a bearer token.

    $ curl -k -H "Authorization: Bearer $(oc whoami -t)" https://<thanos_querier_route>/api/v1/metadata (1)
    1 Replace <thanos_querier_route> with the OKD API route for Thanos Querier.
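
The metadata response is a JSON object whose data field maps each metric name to a list of entries with type, help, and unit fields. To condense it into a readable list, the output could be piped through a JSON processor. The following sketch assumes that jq is installed, which is not part of the prerequisites above. The helper name list_metric_types is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read a /api/v1/metadata JSON response on standard
# input and print one "name<TAB>type" line per metric.
# Assumes jq is installed; it is not part of the oc CLI.
list_metric_types() {
  # Each key under .data is a metric name; its value is a list of
  # metadata entries, of which only the first entry's type is printed.
  jq -r '.data | to_entries[] | "\(.key)\t\(.value[0].type)"'
}

# Usage (requires a logged-in oc session and the route host name):
#   curl -k -s -H "Authorization: Bearer $(oc whoami -t)" \
#     "https://<thanos_querier_route>/api/v1/metadata" | list_metric_types
```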