Prometheus & Grafana


About 

Lenses monitors your Streaming Data Platform and your Kafka cluster in real time and raises alerts for any significant metric degradation, such as consumer lag, offline or under-replicated partitions, and producer SLAs. However, it does not store the metrics for historical analysis.

Lenses integrates with Prometheus and Grafana to export, store and visualize metrics for your cluster and applications.

Lenses ships with a set of pre-defined templates that use:

  • A Time Series database (Prometheus)
  • Custom JMX exporters
  • A Data Visualization application (Grafana)
  • Built-in domain intelligence about operating Kafka with confidence in production.

Setup 

1. Log in to secure.lenses.io 

Log in with your client credentials.

These are the credentials for the secure area account that you receive when you sign up for Lenses. Contact your account representative if you don’t have them yet.

2. Download and extract the suite 

Download the suite package:

lenses-monitoring-suite-vX.Y.Z.tar.gz

Download the suite from the secure area

Extract the package:

tar -xvf lenses-monitoring-suite-vX.Y.Z.tar.gz

The directory structure is:

lenses_monitoring
├── grafana/      <- Grafana dashboards here
├── jmx_exporter/ <- fastdata_server.jar exporter here 
├── LICENCE.txt
├── prometheus/   <- Prometheus configuration here
└── README.md     <- Installation instructions here
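
As a quick sanity check after extraction, a small helper along these lines can confirm that the expected entries from the listing above are present. This is only a sketch: `check_suite` is a hypothetical name, not something shipped in the package.

```shell
# Hypothetical helper: report any top-level entry from the listing above
# that is missing in the extracted directory.
check_suite() {
  dir="$1"
  missing=0
  for entry in grafana jmx_exporter prometheus README.md; do
    if [ ! -e "$dir/$entry" ]; then
      echo "missing: $entry"
      missing=1
    fi
  done
  # Exit status 0 when everything is present, 1 otherwise.
  return "$missing"
}

# Usage: check_suite lenses_monitoring
```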

3. Install the suite components 

Find the detailed instructions in the README.md.

There are 3 components:

  1. grafana/ - Grafana dashboards
  2. jmx_exporter/ - JMX exporter jar for Prometheus. Use it with your Kafka brokers, Connect, Schema Registry or your JVM apps.
  3. prometheus/ - Prometheus configuration

Prometheus metrics exporter 

To use the Lenses Grafana dashboards, configure the Prometheus JMX exporter with the configuration files located in your client area, along with the packaged exporter jars. The following configuration files are available, one per service:

  1. broker.yml
  2. client.yml
  3. connect-worker.yml
  4. kafka-rest.yml
  5. schema-registry.yml

Start the exporter in server mode:

java -jar /path/to/fastdata_server.jar [PORT] [CONF_FILE]

Where [PORT] is the port on which the exporter listens for scrape requests from Prometheus and [CONF_FILE] is the configuration file for the service.

To run as a Java agent, add the following to the environment variables controlling JMX for each service:

Kafka Broker

export KAFKA_OPTS="$KAFKA_OPTS -javaagent:/path/to/fastdata_agent.jar=[PORT]:/path/to/broker.yml"

Kafka Connect

export KAFKA_OPTS="$KAFKA_OPTS -javaagent:/path/to/fastdata_agent.jar=[PORT]:/path/to/connect-worker.yml"

Lenses

export LENSES_OPTS="$LENSES_OPTS -javaagent:/path/to/fastdata_agent.jar=[PORT]:/path/to/client.yml"

Schema Registry

export SCHEMA_REGISTRY_OPTS="$SCHEMA_REGISTRY_OPTS -javaagent:/path/to/fastdata_agent.jar=[PORT]:/path/to/schema-registry.yml"
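
Since the agent option has the same shape for every service, it may be convenient to build it with a small helper. The sketch below assumes a hypothetical function name (`javaagent_opt`) and arbitrary example ports (9102, 9103). Neither comes from the suite, so adjust them to your environment.

```shell
# Location of the agent jar; /path/to is the same placeholder used above.
AGENT_JAR="${AGENT_JAR:-/path/to/fastdata_agent.jar}"

# Hypothetical helper: print the -javaagent option for a port and a
# JMX exporter configuration file.
javaagent_opt() {
  printf '%s' "-javaagent:${AGENT_JAR}=$1:$2"
}

# Ports 9102 and 9103 are arbitrary examples; use any free port per service.
export KAFKA_OPTS="${KAFKA_OPTS:-} $(javaagent_opt 9102 /path/to/broker.yml)"
export SCHEMA_REGISTRY_OPTS="${SCHEMA_REGISTRY_OPTS:-} $(javaagent_opt 9103 /path/to/schema-registry.yml)"

echo "$KAFKA_OPTS"
```

Each service needs its own port when several of them run on the same host, because every agent opens its own listener for Prometheus.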

Metrics API 

Prometheus can be used to poll metric information from infrastructure services.

Some important metrics, such as consumer lag, are not exposed by Kafka itself. Lenses provides a metrics API at http://lenseshost:port/metrics that you can add to the Prometheus targets to bring in this additional monitoring information. The endpoint does not require authentication, so Prometheus can poll it freely.
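
Adding the endpoint to the Prometheus targets could look roughly like the sketch below, which prints a scrape job to merge into your existing Prometheus configuration. The job name `lenses`, the helper name `lenses_scrape_job` and the target `lenseshost:9991` are assumptions for illustration. Use the host and port of your own Lenses instance.

```shell
# Assumed target; replace with the host and port your Lenses instance listens on.
LENSES_TARGET="${LENSES_TARGET:-lenseshost:9991}"

# Hypothetical helper: print a scrape job for the Lenses metrics API.
# Merge the job into the scrape_configs section of your prometheus.yml.
lenses_scrape_job() {
  cat <<EOF
scrape_configs:
  - job_name: lenses
    metrics_path: /metrics
    static_configs:
      - targets: ['$1']
EOF
}

lenses_scrape_job "$LENSES_TARGET"
```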

GET /metrics

The response is a list of Prometheus entries for the consumer lag per partition and the aggregated lag per topic.

lenses_partition_consumer_lag{topic="iot_data",partition="3",consumerGroup="my.group.a"} 537
lenses_topic_consumer_lag{topic="iot_data",consumerGroup="my.group.a"} 4176
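
Because the response is plain text, it can also be inspected without Prometheus. The following sketch filters the topic-level entries whose lag exceeds a threshold. `lag_over` is a hypothetical helper, and it assumes the value is the last field on each line, as in the sample above.

```shell
# Sample entries copied from the response shown above.
sample='lenses_partition_consumer_lag{topic="iot_data",partition="3",consumerGroup="my.group.a"} 537
lenses_topic_consumer_lag{topic="iot_data",consumerGroup="my.group.a"} 4176'

# Hypothetical helper: read entries on stdin and print the topic-level
# ones whose lag (the last field) is greater than the given limit.
lag_over() {
  awk -v limit="$1" \
    'index($0, "lenses_topic_consumer_lag{") == 1 && $NF + 0 > limit + 0 { print }'
}

# Prints the lenses_topic_consumer_lag entry, since 4176 > 1000.
printf '%s\n' "$sample" | lag_over 1000
```

Against a live instance, the same filter could be fed from the endpoint, for example with `curl -s http://lenseshost:port/metrics | lag_over 1000`.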

Grafana dashboards 

Kafka cluster metrics 

A 360-degree view of the key metrics of your Kafka cluster is curated into a single template. It lets you travel back through the past 60 days (by default) of key metrics and proactively receive alerts and notifications when your streaming platform is under pressure or shows signs of partial failure.

Grafana Cluster Metrics

Consumer producer metrics 

Consumer Producer Metrics

Client application metrics 

These are operational metrics from your JVM-based Kafka applications. You can use them to monitor the performance and usage of system resources and to detect issues at an early stage. They show how your JVM apps and the Garbage Collector behave, and cover open file descriptors and other critical aspects of your applications.

Client Application Metrics

Consumer lag 

Consumer Lag metrics