Alert Reference
This page lists the alerts that Lenses can raise, with their identifiers, categories, and severity levels.
Alert | Alert Identifier | Description | Category | Instance | Severity |
---|---|---|---|---|---|
Kafka Broker is down | 1000 | Raised when the Kafka broker is not part of the cluster for at least 1 minute, e.g. host-1, host-2. | Infrastructure | brokerID | INFO, CRITICAL |
Zookeeper Node is down | 1001 | Raised when the Zookeeper node is not reachable. This information is based on the Zookeeper JMX: if the node responds to JMX queries, it is considered to be running. | Infrastructure | service name | INFO, CRITICAL |
Connect Worker is down | 1002 | Raised when the Kafka Connect worker is not responding to the API call for /connectors for more than 1 minute. | Infrastructure | worker URL | MEDIUM |
Schema Registry is down | 1003 | Raised when the Schema Registry node is not responding to the root API call for more than 1 minute. | Infrastructure | service URL | HIGH, INFO |
Under replicated partitions | 1005 | Raised when there are (topic, partition) pairs that do not meet the configured replication factor. | Infrastructure | partitions | HIGH, INFO |
Partitions offline | 1006 | Raised when there are partitions which do not have an active leader. These partitions are not writable or readable. | Infrastructure | brokers | HIGH, INFO |
Active Controllers | 1007 | Raised when the number of active controllers is not 1. Each cluster should have exactly one controller. | Infrastructure | brokers | HIGH, INFO |
Multiple Broker Versions | 1008 | Raised when brokers in the cluster are running different Kafka versions. | Infrastructure | brokers versions | HIGH, INFO |
File-open descriptors high capacity on Brokers | 1009 | A broker has too many open file descriptors | Infrastructure | brokerID | HIGH, INFO, CRITICAL |
Average % the request handler is idle | 1010 | Raised when the average fraction of time the request handler threads are idle drops too low. When the value is smaller than 0.02 the alert level is CRITICAL. When the value is smaller than 0.1 the alert level is HIGH. | Infrastructure | brokerID | HIGH, INFO, CRITICAL |
Fetch requests failure | 1011 | Raised when the rate of failed Fetch requests (per second) is greater than a threshold. If the value is greater than 0.1 the alert level is set to CRITICAL; otherwise it is set to HIGH. | Infrastructure | brokerID | HIGH, INFO, CRITICAL |
Produce requests failure | 1012 | Raised when the rate of failed Produce requests (per second) is greater than a threshold. If the value is greater than 0.1 the alert level is set to CRITICAL; otherwise it is set to HIGH. | Infrastructure | brokerID | HIGH, INFO, CRITICAL |
Broker disk usage is greater than the cluster average | 1013 | Raised when the Kafka Broker disk usage is greater than the cluster average. The default threshold is 1 GB of disk usage. | Infrastructure | brokerID | MEDIUM, INFO |
Leader Imbalance | 1014 | Raised when the Kafka Broker has more leader replicas than the cluster average. | Infrastructure | brokerID | INFO |
Consumer Lag exceeded | 2000 | Raises an alert when the consumer lag exceeds the threshold on any partition. | Consumers | topic | HIGH, INFO |
Connector deleted | 3000 | Connector was deleted | Kafka Connect | connector name | INFO |
Topic has been created | 4000 | New topic was added | Topics | topic | INFO |
Topic has been deleted | 4001 | Topic was deleted | Topics | topic | INFO |
Topic data has been deleted | 4002 | Records from topic were deleted | Topics | topic | INFO |
Data Produced | 5000 | Raises an alert when the data produced on a topic does not match the expected threshold. | Data Produced | topic | LOW, INFO |
Connector Failed | 6000 | Raises an alert when a connector, or any worker in a connector, is down. | Apps | connector | LOW, INFO |
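
The severity rules given for alerts 1010, 1011 and 1012 can be read as simple threshold checks. The sketch below is only an illustration of that reading, not how Lenses implements it: the function names are hypothetical, and the `threshold` parameter stands in for the raising threshold that the table mentions without giving a value.

```python
from typing import Optional


def request_handler_idle_severity(idle_fraction: float) -> Optional[str]:
    """Alert 1010: map the average idle fraction of the request handler
    threads to a severity, using the cut-offs stated in the table."""
    if idle_fraction < 0.02:
        return "CRITICAL"
    if idle_fraction < 0.1:
        return "HIGH"
    # The table gives no rule for values of 0.1 or more; assume no alert.
    return None


def failed_request_severity(failed_per_sec: float, threshold: float) -> Optional[str]:
    """Alerts 1011 and 1012: map the per-second rate of failed Fetch or
    Produce requests to a severity.

    `threshold` is an assumed parameter: the table says the alert is raised
    above "a threshold" but does not state its value.
    """
    if failed_per_sec <= threshold:
        return None
    return "CRITICAL" if failed_per_sec > 0.1 else "HIGH"


if __name__ == "__main__":
    print(request_handler_idle_severity(0.05))
    print(failed_request_severity(0.5, threshold=0.01))
```

The INFO level listed for these alerts is not covered by the sketch, since the table does not say when it applies.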