How To Set Up Prometheus Metrics

Prometheus is an open-source monitoring solution that collects and stores time series data. Aleph has built-in support to expose metrics about itself in the Prometheus format. This guide describes how to enable Prometheus metrics in your Aleph instance.

Prerequisites

Prometheus uses a pull-based model for metrics collection: services such as Aleph expose their current metric values on an HTTP endpoint, and the Prometheus server “scrapes” these values at a regular interval by sending a request to that endpoint.

Aleph consists of multiple different components, including the Aleph API, background workers, ingest-file workers, and a separate Prometheus exporter. Each of these four components exposes metrics in the Prometheus format.

In order to store the metrics data exposed by Aleph, you will need to run your own Prometheus server or use a managed Prometheus service. Please refer to the Prometheus documentation for installation instructions. If you’re hosting Aleph with a cloud provider, note that many cloud providers offer a managed Prometheus service or services compatible with the Prometheus format.

Enabling metrics

Docker Compose

If you have deployed Aleph using Docker Compose, follow these steps to expose Prometheus metrics:

In order to start exposing metrics in the Prometheus format, set the PROMETHEUS_ENABLED configuration option to true for api, worker, and ingest-file containers.
Additionally, Aleph ships with a Prometheus exporter that serves additional metrics about the Aleph instance such as number of users and collections. In order to expose these metrics, you need to add an additional service to your docker-compose.yml configuration file.
The configuration might look like the following. However, make sure to adjust this to your setup and requirements. Also, make sure to replace the version number with the Aleph version you’re using.
```
exporter:
  image: ghcr.io/alephdata/aleph:${ALEPH_TAG:-3.15.4}
  command: "gunicorn --bind 0.0.0.0:9100 --log-file - aleph.metrics.exporter:app"
  depends_on:
    - postgres
    - elasticsearch
    - redis
  tmpfs:
    - /tmp
  env_file:
    - aleph.env
```
The Aleph API uses the Gunicorn WSGI application server, running multiple worker processes. In order to provide complete metrics, data from all processes has to be combined. This is done by storing metrics data in files on the local file system. Set the PROMETHEUS_MULTIPROC_DIR configuration option to /run/prometheus.
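
As a rough illustration of the two configuration options mentioned in the steps above, the following Compose fragment sets them through the `environment` key. This is only a sketch: it assumes your services are named api, worker, and ingest-file, and it shows PROMETHEUS_MULTIPROC_DIR only on the api service, which is the component the step above describes. If you keep your configuration in aleph.env (as the exporter example does), setting the same options there should work equally well.

```yaml
# docker-compose.yml (fragment): only the keys relevant to metrics are shown.
# Assumption: the service names below match the ones in your own setup.
services:
  api:
    environment:
      PROMETHEUS_ENABLED: "true"
      # Directory in which the Gunicorn worker processes store metrics files.
      PROMETHEUS_MULTIPROC_DIR: /run/prometheus
  worker:
    environment:
      PROMETHEUS_ENABLED: "true"
  ingest-file:
    environment:
      PROMETHEUS_ENABLED: "true"
```

The values are quoted so that YAML passes them to the containers as strings rather than interpreting `true` as a boolean.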

Restart Aleph. The api, worker, ingest-file, and exporter containers now expose metrics on port 9100. Run the following commands to verify that everything works as expected:

docker compose exec api curl http://localhost:9100/metrics
docker compose exec worker curl http://localhost:9100/metrics
docker compose exec ingest-file curl http://localhost:9100/metrics
docker compose exec exporter curl http://localhost:9100/metrics

You shouldn’t publicly expose the Prometheus metrics port. Do not map the port to your host system.

Finally, you need to configure the Prometheus server to scrape data from the respective metrics endpoints. The exact steps depend on your setup and requirements. However, you might want to use Prometheus’s built-in DNS service discovery or Docker service discovery.
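
For example, a scrape configuration based on DNS service discovery might look roughly like the following. Treat it as a sketch under a few assumptions rather than a recommended setup: it assumes that the Prometheus server runs in the same Docker network as Aleph (so that the Compose service names resolve via Docker's internal DNS), that the services are named as in this guide, and that a single job called `aleph` (a name chosen here for illustration) is sufficient.

```yaml
# prometheus.yml (fragment)
# Assumption: Prometheus can resolve the Compose service names listed below.
scrape_configs:
  - job_name: aleph            # illustrative job name, not mandated by Aleph
    metrics_path: /metrics
    dns_sd_configs:
      - names:
          - api
          - worker
          - ingest-file
          - exporter
        type: A                # look up A records; they carry no port information
        port: 9100             # metrics port used throughout this guide
        refresh_interval: 30s  # how often the DNS names are re-resolved
```

With a single job, all targets share the same `job` label. If you want to distinguish the components in queries, one option is to define a separate job per service instead.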

Kubernetes

If you have deployed Aleph to Kubernetes using the Aleph Helm chart, follow these steps to expose Prometheus metrics: