Kafka monitoring integration
The New Relic Kafka on-host integration reports metrics and configuration data from your Kafka service. It instruments all the key elements of your cluster: brokers (discovered through either ZooKeeper or a bootstrap broker), producers, consumers, and topics.
Gain deep insight into Kafka performance by streaming this data into New Relic: monitor key metrics for clusters, producers, consumers, and topics; create alerts to stay ahead of spikes; and build custom dashboards for tailored views.
Configuration settings
The following collapsers contain all the available configuration settings:
Configure KafkaBrokerSample and KafkaTopicSample collection
The Kafka integration collects both Metrics (M) and Inventory (I) information. Check the Applies To column below to see the settings available to each collection:

Setting | Description | Default | Applies To |
---|---|---|---|
`CLUSTER_NAME` | User-defined name to uniquely identify the cluster being monitored. Required. | N/A | M/I |
`KAFKA_VERSION` | The version of the Kafka broker you're connecting to, used for setting optimum API versions. It must match, or be lower than, the broker's version. Versions older than 1.0.0 may be missing some features. Note that if the broker binary is named kafka_2.12-2.7.0, the Kafka API version to use is 2.7.0; the preceding 2.12 is the Scala language version. | 1.0.0 | M/I |
`AUTODISCOVER_STRATEGY` | The method of discovering brokers. Options are `zookeeper` or `bootstrap`. | zookeeper | M/I |
`METRICS` | Set to `true` to enable metrics-only collection. | false | |
`INVENTORY` | Set to `true` to enable inventory-only collection. | false | |
Zookeeper autodiscovery arguments
These are only relevant when the `autodiscover_strategy` option is set to `zookeeper`.

Setting | Description | Default | Applies To |
---|---|---|---|
`ZOOKEEPER_HOSTS` | The list of Apache ZooKeeper hosts (in JSON format) that need to be connected. | [] | M/I |
`ZOOKEEPER_AUTH_SCHEME` | The ZooKeeper authentication scheme that is used to connect. Currently, the only supported value is `digest`. If omitted, no ZooKeeper authentication is used. | N/A | M/I |
`ZOOKEEPER_AUTH_SECRET` | The ZooKeeper authentication secret that is used to connect. Should be of the form `username:password`. Only required if an authentication scheme is specified. | N/A | M/I |
`ZOOKEEPER_PATH` | The ZooKeeper node under which the Kafka configuration resides. | / | M/I |
`PREFERRED_LISTENER` | Use a specific listener to connect to a broker. If unset, the first listener that passes a successful test connection is used. Supported values are `PLAINTEXT`, `SASL_PLAINTEXT`, `SSL`, and `SASL_SSL`. | N/A | M/I |
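As a minimal sketch of ZooKeeper-based discovery (the host name, secret, and path are illustrative; complete samples appear in the kafka-config.yml section below):

```yaml
integrations:
  - name: nri-kafka
    env:
      CLUSTER_NAME: testcluster1
      AUTODISCOVER_STRATEGY: zookeeper
      # JSON list of ZooKeeper hosts to connect to
      ZOOKEEPER_HOSTS: '[{"host": "localhost", "port": 2181}]'
      # Only needed when ZooKeeper authentication is enabled
      ZOOKEEPER_AUTH_SECRET: "username:password"
      # ZooKeeper node under which the Kafka configuration resides
      ZOOKEEPER_PATH: "/kafka-root"
```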
Bootstrap broker discovery arguments
These are only relevant when the `autodiscover_strategy` option is set to `bootstrap`.

Setting | Description | Default | Applies To |
---|---|---|---|
`BOOTSTRAP_BROKER_HOST` | The host for the bootstrap broker. | N/A | M/I |
`BOOTSTRAP_BROKER_KAFKA_PORT` | The Kafka port for the bootstrap broker. | N/A | M/I |
`BOOTSTRAP_BROKER_KAFKA_PROTOCOL` | The protocol to use to connect to the bootstrap broker. Supported values are `PLAINTEXT`, `SASL_PLAINTEXT`, `SSL`, and `SASL_SSL`. Note the `SASL_*` protocols only support Kerberos (GSSAPI) authentication. | PLAINTEXT | M/I |
`BOOTSTRAP_BROKER_JMX_PORT` | The JMX port to use for collection on each broker in the cluster. Note that all discovered brokers should have JMX active on this port. | N/A | M/I |
`BOOTSTRAP_BROKER_JMX_USER` | The JMX user to use for collection on each broker in the cluster. | N/A | M/I |
`BOOTSTRAP_BROKER_JMX_PASSWORD` | The JMX password to use for collection on each broker in the cluster. | N/A | M/I |
JMX options
These options apply to all JMX connections on the instance.

Setting | Description | Default | Applies To |
---|---|---|---|
`KEY_STORE` | The filepath of the keystore containing the JMX client's SSL certificate. | N/A | M/I |
`KEY_STORE_PASSWORD` | The password for the JMX SSL keystore. | N/A | M/I |
`TRUST_STORE` | The filepath of the trust keystore containing the JMX server's SSL certificate. | N/A | M/I |
`TRUST_STORE_PASSWORD` | The password for the JMX trust store. | N/A | M/I |
`DEFAULT_JMX_USER` | The default user that connects to the JMX host to collect metrics. If the username field is omitted for a JMX host, this value is used. | admin | M/I |
`DEFAULT_JMX_PASSWORD` | The default password to connect to the JMX host. If the password field is omitted for a JMX host, this value is used. | admin | M/I |
`TIMEOUT` | The timeout for individual JMX queries, in milliseconds. | 10000 | M/I |
Broker TLS connection options
You need these options if the broker protocol is `SSL` or `SASL_SSL`.

Setting | Description | Default | Applies To |
---|---|---|---|
`TLS_CA_FILE` | The certificate authority file for SSL and SASL_SSL listeners, in PEM format. | N/A | M/I |
`TLS_CERT_FILE` | The client certificate file for SSL and SASL_SSL listeners, in PEM format. | N/A | M/I |
`TLS_KEY_FILE` | The client key file for SSL and SASL_SSL listeners, in PEM format. | N/A | M/I |
`TLS_INSECURE_SKIP_VERIFY` | Skip verifying the server's certificate chain and host name. | false | M/I |
Broker SASL and Kerberos connection options
You need these options if the broker protocol is `SASL_PLAINTEXT` or `SASL_SSL`.

Setting | Description | Default | Applies To |
---|---|---|---|
`SASL_MECHANISM` | The type of SASL authentication to use. Supported options are `SCRAM-SHA-512`, `SCRAM-SHA-256`, `PLAIN`, and `GSSAPI`. | N/A | M/I |
`SASL_USERNAME` | SASL username, required with the PLAIN and SCRAM mechanisms. | N/A | M/I |
`SASL_PASSWORD` | SASL password, required with the PLAIN and SCRAM mechanisms. | N/A | M/I |
`SASL_GSSAPI_REALM` | Kerberos realm, required with the GSSAPI mechanism. | N/A | M/I |
`SASL_GSSAPI_SERVICE_NAME` | Kerberos service name, required with the GSSAPI mechanism. | N/A | M/I |
`SASL_GSSAPI_USERNAME` | Kerberos username, required with the GSSAPI mechanism. | N/A | M/I |
`SASL_GSSAPI_KEY_TAB_PATH` | Kerberos keytab path, required with the GSSAPI mechanism. | N/A | M/I |
`SASL_GSSAPI_KERBEROS_CONFIG_PATH` | Kerberos config path, required with the GSSAPI mechanism. | /etc/krb5.conf | M/I |
`SASL_GSSAPI_DISABLE_FAST_NEGOTIATION` | Disable FAST negotiation. | false | M/I |
Broker collection filtering

Setting | Description | Default | Applies To |
---|---|---|---|
`LOCAL_ONLY_COLLECTION` | Collect only the metrics related to the configured bootstrap broker. Only used if `autodiscover_strategy` is `bootstrap`. You must set this to `true` in environments that use discovery (such as Kubernetes); otherwise, brokers will be discovered twice. Note that activating this flag skips `KafkaTopicSample` collection. | false | M/I |
`TOPIC_MODE` | Determines how many topics we collect. Options are `all`, `none`, `list`, or `regex`. | none | M/I |
`TOPIC_LIST` | JSON array of topic names to monitor. Only in effect if `topic_mode` is set to `list`. | [] | M/I |
`TOPIC_REGEX` | Regex pattern that matches the topic names to monitor. Only in effect if `topic_mode` is set to `regex`. | N/A | M/I |
`TOPIC_BUCKET` | Used to split topic collection across multiple instances. Should use the form `<bucket number>/<number of buckets>`. | 1/1 | M/I |
`COLLECT_TOPIC_SIZE` | Collect the metric Topic size. Options are `true` or `false`. This is a resource-intensive metric to collect, especially against many topics. | false | M/I |
`COLLECT_TOPIC_OFFSET` | Collect the metric Topic offset. Options are `true` or `false`. This is a resource-intensive metric to collect, especially against many topics. | false | M/I |
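For example, a sketch of collecting an explicit list of topics with `TOPIC_MODE: list` (the topic names are hypothetical; none of the samples below use `list` mode):

```yaml
integrations:
  - name: nri-kafka
    env:
      CLUSTER_NAME: testcluster1
      AUTODISCOVER_STRATEGY: bootstrap
      BOOTSTRAP_BROKER_HOST: localhost
      BOOTSTRAP_BROKER_KAFKA_PORT: 9092
      # Collect only the topics named in TOPIC_LIST
      TOPIC_MODE: list
      TOPIC_LIST: '["orders", "payments"]'
```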
Configure KafkaConsumerSample and KafkaProducerSample collection
The Kafka integration collects both Metrics (M) and Inventory (I) information. Check the Applies To column below to find which settings can be used for each specific collection:

Setting | Description | Default | Applies To |
---|---|---|---|
`CLUSTER_NAME` | User-defined name to uniquely identify the cluster being monitored. Required. | N/A | M/I |
`PRODUCERS` | Producers to collect, as a JSON array. For each producer, you can specify a `name`, `host`, `port`, `username`, and `password`; `host`, `port`, `username`, and `password` are optional and fall back to the defaults below. For example: `[{"host": "localhost", "port": 24, "username": "me", "password": "secret"}]` | [] | M/I |
`CONSUMERS` | Consumers to collect, as a JSON array, with the same fields as `PRODUCERS`. For example: `[{"host": "localhost", "port": 24, "username": "me", "password": "secret"}]` | [] | M/I |
`DEFAULT_JMX_HOST` | The default host to collect JMX metrics. If you omit the host field from a producer or consumer configuration, this value is used. | localhost | M/I |
`DEFAULT_JMX_PORT` | The default port to collect JMX metrics. If you omit the port field from a producer or consumer configuration, this value is used. | 9999 | M/I |
`DEFAULT_JMX_USER` | The default user that connects to the JMX host to collect metrics. If you omit the username field from a producer or consumer configuration, this value is used. | admin | M/I |
`DEFAULT_JMX_PASSWORD` | The default password to connect to the JMX host. If you omit the password field from a producer or consumer configuration, this value is used. | admin | M/I |
`METRICS` | Set to `true` to enable metrics-only collection. | false | |
`INVENTORY` | Set to `true` to enable inventory-only collection. | false | |
JMX SSL and timeout options
These options apply to all JMX connections on the instance:

Setting | Description | Default | Applies To |
---|---|---|---|
`KEY_STORE` | The filepath of the keystore containing the JMX client's SSL certificate. | N/A | M/I |
`KEY_STORE_PASSWORD` | The password for the JMX SSL keystore. | N/A | M/I |
`TRUST_STORE` | The filepath of the trust keystore containing the JMX server's SSL certificate. | N/A | M/I |
`TRUST_STORE_PASSWORD` | The password for the JMX trust store. | N/A | M/I |
`TIMEOUT` | The timeout for individual JMX queries, in milliseconds. | 10000 | M/I |
Configure KafkaOffsetSample collection
The Kafka integration collects both Metrics (M) and Inventory (I) information. Check the Applies To column below to find which settings can be used for each specific collection:

Setting | Description | Default | Applies To |
---|---|---|---|
`CLUSTER_NAME` | User-defined name to uniquely identify the cluster being monitored. Required. | N/A | M/I |
`KAFKA_VERSION` | The version of the Kafka broker you're connecting to, used for setting optimum API versions. It must match, or be lower than, the broker's version. Versions older than 1.0.0 may be missing some features. Note that if the broker binary is named kafka_2.12-2.7.0, the Kafka API version to use is 2.7.0; the preceding 2.12 is the Scala language version. | 1.0.0 | M/I |
`AUTODISCOVER_STRATEGY` | The method of discovering brokers. Options are `zookeeper` or `bootstrap`. | zookeeper | M/I |
`CONSUMER_OFFSET` | Set to `true` to populate consumer offset data in `KafkaOffsetSample`. Note that this option will skip Broker/Consumer/Producer collection and only collect `KafkaOffsetSample`. | false | M/I |
`CONSUMER_GROUP_REGEX` | Regex pattern that matches the consumer groups to collect offset statistics for. This is limited to collecting statistics for 300 consumer groups. Note: This option must be set when `consumer_offset` is `true`. | N/A | M/I |
`INACTIVE_CONSUMER_GROUP_OFFSET` | Collects offset metrics from consumer groups without any active consumer. Requires `consumer_offset` set to `true`. | false | M/I |
`CONSUMER_GROUP_OFFSET_BY_TOPIC` | Activates an extra metric aggregation for consumer groups by topic. Requires `consumer_offset` set to `true`. | N/A | M/I |
`METRICS` | Set to `true` to enable metrics-only collection. | false | |
`INVENTORY` | Set to `true` to enable inventory-only collection. | false | |
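As a sketch, an offset-collection setup that also reports inactive consumer groups and aggregates lag per topic might look like this (broker details mirror the bootstrap samples below; the flags beyond `CONSUMER_OFFSET` and `CONSUMER_GROUP_REGEX` are optional):

```yaml
integrations:
  - name: nri-kafka
    env:
      CLUSTER_NAME: testcluster1
      AUTODISCOVER_STRATEGY: bootstrap
      BOOTSTRAP_BROKER_HOST: localhost
      BOOTSTRAP_BROKER_KAFKA_PORT: 9092
      # Only KafkaOffsetSample is collected when CONSUMER_OFFSET is true
      CONSUMER_OFFSET: true
      CONSUMER_GROUP_REGEX: '.*'
      # Also report offsets for groups with no active consumers
      INACTIVE_CONSUMER_GROUP_OFFSET: true
      # Aggregate consumer group lag per topic
      CONSUMER_GROUP_OFFSET_BY_TOPIC: true
```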
Zookeeper autodiscovery arguments
These are only relevant when the `autodiscover_strategy` option is set to `zookeeper`.

Setting | Description | Default | Applies To |
---|---|---|---|
`ZOOKEEPER_HOSTS` | The list of Apache ZooKeeper hosts (in JSON format) that need to be connected. | [] | M/I |
`ZOOKEEPER_AUTH_SCHEME` | The ZooKeeper authentication scheme that is used to connect. Currently, the only supported value is `digest`. If omitted, no ZooKeeper authentication is used. | N/A | M/I |
`ZOOKEEPER_AUTH_SECRET` | The ZooKeeper authentication secret that is used to connect. Should be of the form `username:password`. Only required if an authentication scheme is specified. | N/A | M/I |
`ZOOKEEPER_PATH` | The ZooKeeper node under which the Kafka configuration resides. | / | M/I |
`PREFERRED_LISTENER` | Use a specific listener to connect to a broker. If unset, the first listener that passes a successful test connection is used. Supported values are `PLAINTEXT`, `SASL_PLAINTEXT`, `SSL`, and `SASL_SSL`. | N/A | M/I |
Bootstrap broker discovery arguments
These are only relevant when the `autodiscover_strategy` option is set to `bootstrap`.

Setting | Description | Default | Applies To |
---|---|---|---|
`BOOTSTRAP_BROKER_HOST` | The host for the bootstrap broker. | N/A | M/I |
`BOOTSTRAP_BROKER_KAFKA_PORT` | The Kafka port for the bootstrap broker. | N/A | M/I |
`BOOTSTRAP_BROKER_KAFKA_PROTOCOL` | The protocol to use to connect to the bootstrap broker. Supported values are `PLAINTEXT`, `SASL_PLAINTEXT`, `SSL`, and `SASL_SSL`. Note the `SASL_*` protocols only support Kerberos (GSSAPI) authentication. | PLAINTEXT | M/I |
`BOOTSTRAP_BROKER_JMX_PORT` | The JMX port to use for collection on each broker in the cluster. Note that all discovered brokers should have JMX active on this port. | N/A | M/I |
`BOOTSTRAP_BROKER_JMX_USER` | The JMX user to use for collection on each broker in the cluster. | N/A | M/I |
`BOOTSTRAP_BROKER_JMX_PASSWORD` | The JMX password to use for collection on each broker in the cluster. | N/A | M/I |
JMX SSL and timeout options
These apply to all JMX connections on an instance.

Setting | Description | Default | Applies To |
---|---|---|---|
`KEY_STORE` | The filepath of the keystore containing the JMX client's SSL certificate. | N/A | M/I |
`KEY_STORE_PASSWORD` | The password for the JMX SSL keystore. | N/A | M/I |
`TRUST_STORE` | The filepath of the trust keystore containing the JMX server's SSL certificate. | N/A | M/I |
`TRUST_STORE_PASSWORD` | The password for the JMX trust store. | N/A | M/I |
`DEFAULT_JMX_USER` | The default user that connects to the JMX host to collect metrics. If the username field is omitted for a JMX host, this value is used. | admin | M/I |
`DEFAULT_JMX_PASSWORD` | The default password to connect to the JMX host. If the password field is omitted for a JMX host, this value is used. | admin | M/I |
`TIMEOUT` | The timeout for individual JMX queries, in milliseconds. | 10000 | M/I |
Troubleshooting
For agents monitoring producers and/or consumers that have topic mode set to `all`, duplicate data may be reported. To stop the duplicate data, ensure that the configuration option `COLLECT_TOPIC_SIZE` is set to `false`.

Also ensure that `zookeeper_path` is set correctly in the configuration file.
The Kafka integration uses a JMX helper tool called `nrjmx` to retrieve JMX metrics from brokers, consumers, and producers. JMX needs to be enabled and configured on all brokers in the cluster, and firewalls need to allow connections from the host running the integration to the brokers over the JMX port.

To check whether JMX is correctly configured, run the following command for each broker from the machine running the Kafka integration. Replace MY_HOSTNAME, MY_PORT, MY_USERNAME, and MY_PASSWORD with the corresponding JMX settings for the broker:

```shell
echo "*:*" | nrjmx -hostname MY_HOSTNAME -port MY_PORT -v -username MY_USERNAME -password MY_PASSWORD
```

The command should output a long series of metrics without any errors.
The integration might show an error like the following:

```
KRB Error: (6) KDC_ERR_C_PRINCIPAL_UNKNOWN Client not found in Kerberos database
```

Check the keytab with the `kinit` command, replacing KEY_TAB_PATH and USERNAME with your values:

```shell
kinit -k -t KEY_TAB_PATH USERNAME
```

If the username/keytab combination is correct, the command above finishes without printing any errors.

Check the realm using the `klist` command:

```shell
klist | grep "Default principal:"
```

You should see something like this:

```
Default principal: johndoe@a_realm_name
```

Check that the printed user name and realm match the `sasl_gssapi_realm` and `sasl_gssapi_username` parameters in the integration configuration.
kafka-config.yml sample files
This configuration collects Metrics and Inventory, including all topics, discovering the brokers through two different ZooKeeper hosts:
```yaml
integrations:
  - name: nri-kafka
    env:
      CLUSTER_NAME: testcluster1
      KAFKA_VERSION: "1.0.0"
      AUTODISCOVER_STRATEGY: zookeeper
      ZOOKEEPER_HOSTS: '[{"host": "localhost", "port": 2181}, {"host": "localhost2", "port": 2181}]'
      ZOOKEEPER_PATH: "/kafka-root"
      DEFAULT_JMX_USER: username
      DEFAULT_JMX_PASSWORD: password
      TOPIC_MODE: all
    interval: 15s
    labels:
      env: production
      role: kafka
    inventory_source: config/kafka
```
This configuration collects Metrics and Inventory, connecting to the brokers' JMX hosts with SSL:
```yaml
integrations:
  - name: nri-kafka
    env:
      CLUSTER_NAME: testcluster1
      KAFKA_VERSION: "1.0.0"
      AUTODISCOVER_STRATEGY: zookeeper
      ZOOKEEPER_HOSTS: '[{"host": "localhost", "port": 2181}]'
      ZOOKEEPER_PATH: "/kafka-root"
      DEFAULT_JMX_USER: username
      DEFAULT_JMX_PASSWORD: password
      KEY_STORE: "/path/to/your/keystore"
      KEY_STORE_PASSWORD: keystore_password
      TRUST_STORE: "/path/to/your/truststore"
      TRUST_STORE_PASSWORD: truststore_password
      TIMEOUT: 10000 # The timeout for individual JMX queries, in milliseconds.
    interval: 15s
    labels:
      env: production
      role: kafka
    inventory_source: config/kafka
```
This configuration collects Metrics and Inventory, including all topics, discovering the brokers from one bootstrap broker:
```yaml
integrations:
  - name: nri-kafka
    env:
      CLUSTER_NAME: testcluster1
      AUTODISCOVER_STRATEGY: bootstrap
      BOOTSTRAP_BROKER_HOST: localhost
      BOOTSTRAP_BROKER_KAFKA_PORT: 9092
      BOOTSTRAP_BROKER_KAFKA_PROTOCOL: PLAINTEXT
      BOOTSTRAP_BROKER_JMX_PORT: 9999 # This same port is used to connect to the JMX of all discovered brokers.
      BOOTSTRAP_BROKER_JMX_USER: admin
      BOOTSTRAP_BROKER_JMX_PASSWORD: password
      LOCAL_ONLY_COLLECTION: false
      COLLECT_BROKER_TOPIC_DATA: true
      TOPIC_MODE: "all"
      COLLECT_TOPIC_SIZE: false
    interval: 15s
    labels:
      env: production
      role: kafka
    inventory_source: config/kafka
```
This configuration collects only Metrics, discovering the brokers from one bootstrap broker listening over TLS:
```yaml
integrations:
  - name: nri-kafka
    env:
      METRICS: true
      CLUSTER_NAME: testcluster1
      AUTODISCOVER_STRATEGY: bootstrap
      BOOTSTRAP_BROKER_HOST: localhost
      BOOTSTRAP_BROKER_KAFKA_PORT: 9092
      BOOTSTRAP_BROKER_KAFKA_PROTOCOL: SSL
      BOOTSTRAP_BROKER_JMX_PORT: 9999
      BOOTSTRAP_BROKER_JMX_USER: admin
      BOOTSTRAP_BROKER_JMX_PASSWORD: password
      # TLS connection arguments
      TLS_CA_FILE: "/path/to/CA.pem"
      TLS_CERT_FILE: "/path/to/cert.pem"
      TLS_KEY_FILE: "/path/to/key.pem"
      TLS_INSECURE_SKIP_VERIFY: false
    interval: 15s
    labels:
      env: production
      role: kafka
    inventory_source: config/kafka
```
This configuration collects only Metrics, discovering the brokers from one bootstrap broker in a Kerberos-authenticated cluster:
```yaml
integrations:
  - name: nri-kafka
    env:
      METRICS: true
      CLUSTER_NAME: testcluster1
      AUTODISCOVER_STRATEGY: bootstrap
      BOOTSTRAP_BROKER_HOST: localhost
      BOOTSTRAP_BROKER_KAFKA_PORT: 9092
      BOOTSTRAP_BROKER_KAFKA_PROTOCOL: PLAINTEXT # Currently supports PLAINTEXT and SSL
      BOOTSTRAP_BROKER_JMX_PORT: 9999
      BOOTSTRAP_BROKER_JMX_USER: admin
      BOOTSTRAP_BROKER_JMX_PASSWORD: password
      # Kerberos authentication arguments
      SASL_MECHANISM: GSSAPI
      SASL_GSSAPI_REALM: SOMECORP.COM
      SASL_GSSAPI_SERVICE_NAME: Kafka
      SASL_GSSAPI_USERNAME: kafka
      SASL_GSSAPI_KEY_TAB_PATH: /etc/newrelic-infra/kafka.keytab
      SASL_GSSAPI_KERBEROS_CONFIG_PATH: /etc/krb5.conf
      SASL_GSSAPI_DISABLE_FAST_NEGOTIATION: false
    interval: 15s
    labels:
      env: production
      role: kafka
    inventory_source: config/kafka
```
This configuration collects Metrics, splitting topic collection across three different instances:
```yaml
integrations:
  - name: nri-kafka
    env:
      METRICS: true
      CLUSTER_NAME: testcluster1
      KAFKA_VERSION: "1.0.0"
      AUTODISCOVER_STRATEGY: zookeeper
      ZOOKEEPER_HOSTS: '[{"host": "host1", "port": 2181}]'
      ZOOKEEPER_AUTH_SECRET: "username:password"
      ZOOKEEPER_PATH: "/kafka-root"
      DEFAULT_JMX_USER: username
      DEFAULT_JMX_PASSWORD: password
      TOPIC_MODE: regex
      TOPIC_REGEX: 'topic\d+'
      TOPIC_BUCKET: '1/3'
    interval: 15s
    labels:
      env: production
      role: kafka
    inventory_source: config/kafka
  - name: nri-kafka
    env:
      METRICS: true
      CLUSTER_NAME: testcluster2
      KAFKA_VERSION: "1.0.0"
      AUTODISCOVER_STRATEGY: zookeeper
      ZOOKEEPER_HOSTS: '[{"host": "host2", "port": 2181}]'
      ZOOKEEPER_AUTH_SECRET: "username:password"
      ZOOKEEPER_PATH: "/kafka-root"
      DEFAULT_JMX_USER: username
      DEFAULT_JMX_PASSWORD: password
      TOPIC_MODE: regex
      TOPIC_REGEX: 'topic\d+'
      TOPIC_BUCKET: '2/3'
    interval: 15s
    labels:
      env: production
      role: kafka
    inventory_source: config/kafka
  - name: nri-kafka
    env:
      METRICS: true
      CLUSTER_NAME: testcluster3
      KAFKA_VERSION: "1.0.0"
      AUTODISCOVER_STRATEGY: zookeeper
      ZOOKEEPER_HOSTS: '[{"host": "host3", "port": 2181}]'
      ZOOKEEPER_AUTH_SECRET: "username:password"
      ZOOKEEPER_PATH: "/kafka-root"
      DEFAULT_JMX_USER: username
      DEFAULT_JMX_PASSWORD: password
      TOPIC_MODE: regex
      TOPIC_REGEX: 'topic\d+'
      TOPIC_BUCKET: '3/3'
    interval: 15s
    labels:
      env: production
      role: kafka
    inventory_source: config/kafka
```
This example collects JMX metrics from Java consumers and producers:
```yaml
integrations:
  - name: nri-kafka
    env:
      METRICS: "true"
      CLUSTER_NAME: "testcluster3"
      PRODUCERS: '[{"host": "localhost", "port": 24, "username": "me", "password": "secret"}]'
      CONSUMERS: '[{"host": "localhost", "port": 24, "username": "me", "password": "secret"}]'
      DEFAULT_JMX_HOST: "localhost"
      DEFAULT_JMX_PORT: "9999"
    interval: 15s
    labels:
      env: production
      role: kafka
    inventory_source: config/kafka
```
This configuration collects consumer offset Metrics and Inventory for the cluster:
```yaml
integrations:
  - name: nri-kafka
    env:
      CONSUMER_OFFSET: true
      CLUSTER_NAME: testcluster3
      AUTODISCOVER_STRATEGY: bootstrap
      BOOTSTRAP_BROKER_HOST: localhost
      BOOTSTRAP_BROKER_KAFKA_PORT: 9092
      BOOTSTRAP_BROKER_KAFKA_PROTOCOL: PLAINTEXT
      # A regex pattern that matches the consumer groups to collect metrics from
      CONSUMER_GROUP_REGEX: '.*'
    interval: 15s
    labels:
      env: production
      role: kafka
    inventory_source: config/kafka
```
Metrics collected by the integration
The Kafka integration collects the following metrics. Each metric name is prefixed with a category indicator and a period, such as `broker.` or `consumer.`.
These metrics are attached to the `KafkaBrokerSample` event type:

Metric | Description |
---|---|
| Number of bytes written to a topic by the broker per second. |
| Network IO into brokers in the cluster in bytes per second. |
| Network IO out of brokers in the cluster in bytes per second. |
| Log flush rate. |
| Incoming messages per second. |
| Rate of request expiration on followers in evictions per second. |
| Rejected bytes per second. |
| Rate of replicas joining the ISR pool. |
| Rate of replicas leaving the ISR pool. |
| Leader election rate. |
| Unclean leader election rate. |
| Number of unreplicated partitions. |
| Average time per fetch request in milliseconds. |
| Average time for metadata request in milliseconds. |
| Time for metadata requests for 99th percentile in milliseconds. |
| Average time for an offset request in milliseconds. |
| Time for offset requests for 99th percentile in milliseconds. |
| Average time for a produce request in milliseconds. |
| Average time for a request to update metadata in milliseconds. |
| Time for update metadata requests for 99th percentile in milliseconds. |
| Client fetch request failures per second. |
| Time for fetch requests for 99th percentile in milliseconds. |
| Average fraction of time the request handler threads are idle. |
| Failed produce requests per second. |
| Time for produce requests for 99th percentile. |
| Topic disk size per broker and per topic. Only present if `COLLECT_TOPIC_SIZE` is enabled. |
| Topic offset per broker and per topic. Only present if `COLLECT_TOPIC_OFFSET` is enabled. |
These metrics are attached to the `KafkaConsumerSample` event type:

Metric | Description |
---|---|
| Average number of bytes fetched per request for a specific topic. |
| Average number of records in each request for a specific topic. |
| Average number of records consumed per second for a specific topic in records per second. |
| Consumer bytes per second. |
| The minimum rate at which the consumer sends fetch requests to a broker, in requests per second. |
| Maximum number of bytes fetched per request for a specific topic. |
| Maximum consumer lag. |
| Rate of consumer message consumption in messages per second. |
| Rate of offset commits to Kafka in commits per second. |
| Rate of offset commits to ZooKeeper in writes per second. |
| Rate of delayed consumer request expiration in evictions per second. |
These metrics are attached to the `KafkaProducerSample` event type:

Metric | Description |
---|---|
| Age in seconds of the current producer metadata being used. |
| Total amount of buffer memory that is not being used in bytes. |
| Average number of bytes sent per partition per-request. |
| Average compression rate of record batches. |
| Average time in ms record batches spent in the record accumulator. |
| Average record size in bytes. |
| Average number of records sent per second. |
| Average number of records sent per second for a topic. |
| Producer average request latency. |
| Average time that a request was throttled by a broker in milliseconds. |
| Maximum amount of buffer memory the client can use in bytes. |
| Fraction of time an appender waits for space allocation. |
| Producer bytes per second out. |
| Average compression rate of record batches for a topic. |
| Producer I/O wait time in milliseconds. |
| Max number of bytes sent per partition per-request. |
| Maximum record size in bytes. |
| Maximum request latency in milliseconds. |
| Maximum time a request was throttled by a broker in milliseconds. |
| Producer messages per second. |
| Number of producer responses per second. |
| Number of producer requests per second. |
| Current number of in-flight requests awaiting a response. |
| Number of user threads blocked waiting for buffer memory to enqueue their records. |
These metrics are attached to the `KafkaTopicSample` event type:

Metric | Description |
---|---|
| Number of partitions per topic that are not being led by their preferred replica. |
| Number of topics responding to metadata requests. |
| Whether a partition is retained by size or both size and time. A value of 0 = time and a value of 1 = both size and time. |
| Number of partitions per topic that are under-replicated. |
These metrics are attached to the `KafkaOffsetSample` event type:

Metric | Description |
---|---|
| The last consumed offset on a partition by the consumer group. |
| The difference between a broker's high water mark and the consumer's offset (`hwm - offset`). |
| The offset of the last message written to a partition (high water mark). |
| The sum of lags across partitions consumed by a consumer. |
| The sum of lags across all partitions consumed by a consumer group. |
| The maximum lag across all partitions consumed by a consumer group. |
| The number of active consumers in this consumer group. |