×

You can collect metrics to monitor how cluster components and your own workloads are performing.

Understanding metrics

In OKD 4, cluster components are monitored by scraping metrics exposed through service endpoints. You can also configure metrics collection for user-defined projects.

You can define the metrics that you want to provide for your own workloads by using Prometheus client libraries at the application level.

In OKD, metrics are exposed through an HTTP service endpoint under the /metrics canonical name. You can list all available metrics for a service by running a curl query against http://<endpoint>/metrics. For instance, you can expose a route to the prometheus-example-app example service and then run the following to view all of its available metrics:

$ curl http://<example_app_endpoint>/metrics
Example output
# HELP http_requests_total Count of all HTTP requests
# TYPE http_requests_total counter
http_requests_total{code="200",method="get"} 4
http_requests_total{code="404",method="get"} 2
# HELP version Version information about this binary
# TYPE version gauge
version{version="v0.1.0"} 1

Setting up metrics collection for user-defined projects

You can create a ServiceMonitor resource to scrape metrics from a service endpoint in a user-defined project. This assumes that your application uses a Prometheus client library to expose metrics to the /metrics canonical name.

This section describes how to deploy a sample service in a user-defined project and then create a ServiceMonitor resource that defines how that service should be monitored.

Deploying a sample service

To test monitoring of a service in a user-defined project, you can deploy a sample service.

Procedure
  1. Create a YAML file for the service configuration. In this example, it is called prometheus-example-app.yaml.

  2. Add the following deployment and service configuration details to the file:

    apiVersion: v1
    kind: Namespace
    metadata:
      name: ns1
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        app: prometheus-example-app
      name: prometheus-example-app
      namespace: ns1
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: prometheus-example-app
      template:
        metadata:
          labels:
            app: prometheus-example-app
        spec:
          containers:
          - image: ghcr.io/rhobs/prometheus-example-app:0.4.1
            imagePullPolicy: IfNotPresent
            name: prometheus-example-app
    ---
    apiVersion: v1
    kind: Service
    metadata:
      labels:
        app: prometheus-example-app
      name: prometheus-example-app
      namespace: ns1
    spec:
      ports:
      - port: 8080
        protocol: TCP
        targetPort: 8080
        name: web
      selector:
        app: prometheus-example-app
      type: ClusterIP

    This configuration deploys a service named prometheus-example-app in the user-defined ns1 project. This service exposes the custom version metric.

  3. Apply the configuration to the cluster:

    $ oc apply -f prometheus-example-app.yaml

    It takes some time to deploy the service.

  4. You can check that the pod is running:

    $ oc -n ns1 get pod
    Example output
    NAME                                      READY     STATUS    RESTARTS   AGE
    prometheus-example-app-7857545cb7-sbgwq   1/1       Running   0          81m

Specifying how a service is monitored

To use the metrics exposed by your service, you must configure OKD monitoring to scrape metrics from the /metrics endpoint. You can do this using a ServiceMonitor custom resource definition (CRD) that specifies how a service should be monitored, or a PodMonitor CRD that specifies how a pod should be monitored. The former requires a Service object, while the latter does not, allowing Prometheus to directly scrape metrics from the metrics endpoint exposed by a pod.

This procedure shows you how to create a ServiceMonitor resource for a service in a user-defined project.

Prerequisites
  • You have access to the cluster as a user with the cluster-admin role or the monitoring-edit role.

  • You have enabled monitoring for user-defined projects.

  • For this example, you have deployed the prometheus-example-app sample service in the ns1 project.

    The prometheus-example-app sample service does not support TLS authentication.

Procedure
  1. Create a YAML file for the ServiceMonitor resource configuration. In this example, the file is called example-app-service-monitor.yaml.

  2. Add the following ServiceMonitor resource configuration details:

    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      labels:
        k8s-app: prometheus-example-monitor
      name: prometheus-example-monitor
      namespace: ns1
    spec:
      endpoints:
      - interval: 30s
        port: web
        scheme: http
      selector:
        matchLabels:
          app: prometheus-example-app

    This defines a ServiceMonitor resource that scrapes the metrics exposed by the prometheus-example-app sample service, which includes the version metric.

    A ServiceMonitor resource in a user-defined namespace can only discover services in the same namespace. That is, the namespaceSelector field of the ServiceMonitor resource is always ignored.

  3. Apply the configuration to the cluster:

    $ oc apply -f example-app-service-monitor.yaml

    It takes some time to deploy the ServiceMonitor resource.

  4. You can check that the ServiceMonitor resource is running:

    $ oc -n ns1 get servicemonitor
    Example output
    NAME                         AGE
    prometheus-example-monitor   81m

Next steps