Alert reference


| Alert | Alert Identifier | Description | Category | Instance | Severity |
| --- | --- | --- | --- | --- | --- |
| Kafka Broker is down | 1000 | Raised when the Kafka broker is not part of the cluster for at least 1 minute, e.g. host-1, host-2. | Infrastructure | brokerID | INFO, CRITICAL |
| Zookeeper Node is down | 1001 | Raised when the Zookeeper node is not reachable. This information is based on the Zookeeper JMX: if the node responds to JMX queries, it is considered to be running. | Infrastructure | service name | INFO, CRITICAL |
| Connect Worker is down | 1002 | Raised when the Kafka Connect worker has not responded to the API call for /connectors for more than 1 minute. | Infrastructure | worker URL | MEDIUM |
| Schema Registry is down | 1003 | Raised when the Schema Registry node has not responded to the root API call for more than 1 minute. | Infrastructure | service URL | HIGH, INFO |
| Under replicated partitions | 1005 | Raised when there are (topic, partition) pairs that do not meet the configured replication factor. | Infrastructure | partitions | HIGH, INFO |
| Partitions offline | 1006 | Raised when there are partitions which do not have an active leader. These partitions are neither writable nor readable. | Infrastructure | brokers | HIGH, INFO |
| Active Controllers | 1007 | Raised when the number of active controllers is not 1. Each cluster should have exactly one controller. | Infrastructure | brokers | HIGH, INFO |
| Multiple Broker Versions | 1008 | Raised when brokers in the cluster are running different Kafka versions. | Infrastructure | brokers versions | HIGH, INFO |
| File-open descriptors high capacity on Brokers | 1009 | Raised when a broker has too many open file descriptors. | Infrastructure | brokerID | HIGH, INFO, CRITICAL |
| Average % the request handler is idle | 1010 | Raised based on the average fraction of time the request handler threads are idle. When the value is smaller than 0.02 the alert level is CRITICAL. When the value is smaller than 0.1 the alert level is HIGH. | Infrastructure | brokerID | HIGH, INFO, CRITICAL |
| Fetch requests failure | 1011 | Raised when the rate of failed Fetch requests (per second) is greater than a threshold. If the value is greater than 0.1 the alert level is set to CRITICAL, otherwise it is set to HIGH. | Infrastructure | brokerID | HIGH, INFO, CRITICAL |
| Produce requests failure | 1012 | Raised when the rate of failed Produce requests (per second) is greater than a threshold. If the value is greater than 0.1 the alert level is set to CRITICAL, otherwise it is set to HIGH. | Infrastructure | brokerID | HIGH, INFO, CRITICAL |
| Broker disk usage is greater than the cluster average | 1013 | Raised when the Kafka broker disk usage is greater than the cluster average. The default threshold is 1 GB of disk usage. | Infrastructure | brokerID | MEDIUM, INFO |
| Leader Imbalance | 1014 | Raised when the Kafka broker has more leader replicas than the cluster average. | Infrastructure | brokerID | INFO |
| Consumer Lag exceeded | 2000 | Raised when the consumer lag exceeds the threshold on any partition. | Consumers | topic | HIGH, INFO |
| Connector deleted | 3000 | Raised when a connector was deleted. | Kafka Connect | connector name | INFO |
| Topic has been created | 4000 | Raised when a new topic was added. | Topics | topic | INFO |
| Topic has been deleted | 4001 | Raised when a topic was deleted. | Topics | topic | INFO |
| Topic data has been deleted | 4002 | Raised when records were deleted from a topic. | Topics | topic | INFO |
| Data Produced | 5000 | Raised when the data produced on a topic doesn’t match the expected threshold. | Data Produced | topic | LOW, INFO |
| Connector Failed | 6000 | Raised when a connector, or any worker in a connector, is down. | Apps | connector | LOW, INFO |
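
The severity rules described for alerts 1010, 1011 and 1012 can be read as simple threshold checks. The following is a minimal sketch of that reading, not the product's implementation: the function names and the `alert_threshold` parameter are hypothetical, and returning `None` when no alert condition holds is an assumption (the reference does not say what is reported in that case).

```python
from typing import Optional


def idle_ratio_severity(idle_fraction: float) -> Optional[str]:
    """Severity for alert 1010, given the average idle fraction (0.0 to 1.0).

    Hypothetical helper: below 0.02 is CRITICAL, below 0.1 is HIGH.
    Returning None otherwise is an assumption, not documented behaviour.
    """
    if idle_fraction < 0.02:
        return "CRITICAL"
    if idle_fraction < 0.1:
        return "HIGH"
    return None


def failed_request_severity(failed_per_sec: float,
                            alert_threshold: float) -> Optional[str]:
    """Severity for alerts 1011 and 1012, given the failed request rate.

    `alert_threshold` is a hypothetical parameter standing in for the
    unspecified threshold in the description. An alert is raised only when
    the rate exceeds it; above 0.1 it is CRITICAL, otherwise HIGH.
    """
    if failed_per_sec <= alert_threshold:
        return None  # assumption: no alert below the threshold
    if failed_per_sec > 0.1:
        return "CRITICAL"
    return "HIGH"
```

For example, an idle fraction of 0.05 falls between the two limits and maps to HIGH, while a failed Fetch rate of 0.2 per second maps to CRITICAL for any threshold below that rate.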
--
Last modified: April 24, 2024