Monitoring the health of your infrastructure.
Lenses monitors the health of your infrastructure via JMX.
It also ships with a number of built-in alerts for these services.
By default, Lenses checks your entire streaming data platform infrastructure every 10 seconds and applies the following built-in alert rules:
Rule | This rule fires when |
---|---|
Lenses License | The Lenses license is invalid |
Kafka broker is down | A Kafka broker from the cluster is not healthy |
Zookeeper node is down | A ZooKeeper node is not healthy |
Connect Worker is down | A Kafka Connect worker node is not healthy |
Schema Registry is down | A Schema Registry instance is not healthy |
Under replicated partitions | The Kafka cluster has 1 or more under-replicated partitions |
Partitions offline | The Kafka cluster has 1 or more partitions offline (partitions without an active leader) |
Active Controller | The Kafka cluster has 0 or more than 1 active controllers |
Multiple Broker versions | The Kafka cluster is undergoing a version upgrade, and not all brokers have been upgraded |
File-open descriptors on Brokers | A Kafka broker has an alarming number of open file descriptors. The alert fires when the operating system is using more than 90% of the available file descriptors |
Average % the request handler is idle | The average fraction of time the request handler threads are idle is dangerously low. The alert is HIGH when the value is smaller than 10%, and CRITICAL when it is smaller than 2% |
Fetch requests failure | Fetch requests are failing. If the rate of failures per second exceeds 10%, the alert level is set to CRITICAL; otherwise it is set to HIGH |
Produce requests failure | Produce requests are failing. If the value exceeds 10%, the alert level is set to CRITICAL; otherwise it is set to HIGH |
Broker disk usage | A Kafka broker's disk usage is greater than the cluster average. The built-in threshold is 1 GB |
Leader imbalance | A Kafka broker has more leader replicas than the average broker in the cluster |

If you change your Kafka cluster size or replace an existing Kafka broker with another, Lenses raises an active alert because it detects that a broker of your Kafka cluster is no longer available. If the Kafka broker has been removed intentionally, decommission it:

1. Navigate to Services.
2. Select the broker, open its options menu, and click the Decommission option.
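
The threshold-based rules listed above can be read as small predicates over metric values. The following is a minimal Python sketch of that reading, not Lenses's actual implementation. All function names are hypothetical, and two points are assumptions: that any non-zero failure rate raises at least HIGH, and that the 1 GB disk threshold means "more than 1 GB above the cluster average" (taken here as 1024^3 bytes).

```python
from enum import Enum
from statistics import mean
from typing import Dict, List, Optional

GIB = 1024 ** 3  # assumption: "1 GB" is interpreted as 1024^3 bytes


class Severity(Enum):
    HIGH = "HIGH"
    CRITICAL = "CRITICAL"


def request_handler_idle_alert(idle_percent: float) -> Optional[Severity]:
    """HIGH below 10% idle, CRITICAL below 2% idle, otherwise no alert."""
    if idle_percent < 2.0:
        return Severity.CRITICAL
    if idle_percent < 10.0:
        return Severity.HIGH
    return None


def request_failure_alert(failure_percent: float) -> Optional[Severity]:
    """Fetch/produce failures: CRITICAL above 10%, otherwise HIGH.

    Assumption: a rate of zero means nothing is failing, so no alert.
    """
    if failure_percent <= 0.0:
        return None
    return Severity.CRITICAL if failure_percent > 10.0 else Severity.HIGH


def file_descriptor_alert(open_fds: int, max_fds: int) -> bool:
    """Fires when more than 90% of the available descriptors are in use."""
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return open_fds * 10 > max_fds * 9


def active_controller_alert(active_controllers: int) -> bool:
    """Fires unless exactly one controller is active."""
    return active_controllers != 1


def brokers_above_average_disk(
    usage_bytes: Dict[int, int], threshold_bytes: int = GIB
) -> List[int]:
    """Broker ids whose usage exceeds the cluster average by more than the threshold.

    Assumption: the threshold is applied to the difference from the average.
    """
    if not usage_bytes:
        return []
    average = mean(usage_bytes.values())
    return sorted(
        broker_id
        for broker_id, used in usage_bytes.items()
        if used - average > threshold_bytes
    )
```

For example, `request_handler_idle_alert(5.0)` returns `Severity.HIGH`, and `active_controller_alert(1)` returns `False`. Keeping each rule as a separate pure function makes the thresholds easy to check against the table.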