Kafka monitoring integration

The New Relic Kafka on-host integration reports metrics and configuration data from your Kafka service. We instrument all the key elements of your cluster, including brokers (both ZooKeeper and Bootstrap), producers, consumers, and topics.

To install the Kafka monitoring integration, you must run through the following steps:

Prepare for the installation.
Install and activate the integration.
Configure the integration.
Find and use data.
Optionally, see Kafka's configuration settings.

Dica

To read about best practices when monitoring Kafka, check this blogpost.

Compatibility and requirements

Kafka versions

Our integration is compatible with Kafka version 3 or lower.

Please note Apache Kafka EOL Policy as you may experience unexpected results if you use an End of Life Kafka version.

Supported operating systems

Windows
Linux

For a comprehensive list of specific Windows and Linux versions, check the table of compatible operating systems.

System requirements

A New Relic account. Don't have one? Sign up for free! No credit card required.
If Kafka is not running on Kubernetes or Amazon ECS, you can install the infrastructure agent on a Linux or Windows OS host or on a host capable of remotely accessing where Kafka is installed. Otherwise:
- If running on Kubernetes, see these requirements.
- If running on Amazon ECS, see these requirements.
Java version 8 or higher.
JMX enabled on all brokers.
Java-based consumers and producers only, and with JMX enabled.
Total number of monitored topics must be fewer than 10000.

Connectivity requirements

The integration needs to be configured and allowed to connect to:

Hosts listed in zookeeper_hosts over the Zookeeper protocol, using the Zookeeper authentication mechanism, if autodiscover_strategy is set to zookeeper.
Hosts defined in bootstrap_broker_host over the Kafka protocol, using the Kafka broker's authentication/transport mechanisms, if autodiscover_strategy is set to bootstrap.
All brokers in the cluster over the Kafka protocol and port, using the Kafka brokers' authentication/transport mechanisms.
All brokers in the cluster over the JMX protocol and port, using the authentication/transport mechanisms specified in the JMX configuration of the brokers.
All producers/consumers specified in producers and consumers over the JMX protocol and port, if you want producer/consumer monitoring. JMX settings for the consumer must be the same as for the brokers.

Importante

By default, security groups and their equivalents in other cloud providers, in AWS don't have the required ports open by default. JMX requires two ports in order to work: the JMX port and the RMI port. These can be set to the same value when configuring the JVM to enable JMX and must be open for the integration to be able to connect to and collect metrics from brokers.

Prepare for the installation

Kafka is a complex piece of software that is built as a distributed system. For this reason, you need to ensure that the integration can contact all the required hosts and services so the data is collected correctly.

Given the distributed nature of Kafka, the actual number and list of brokers is usually not fixed by the configuration, and it is instead quite dynamic. For this reason, the Kafka integration offers two mechanisms to perform automatic discovery of the list of brokers in the cluster: Bootstrap and Zookeeper. The mechanism you use depends on the setup of the Kafka cluster being monitored.

Bootstrap

With the bootstrap mechanism, the integration uses a bootstrap broker to perform the autodiscovery. This is a broker whose address is well known and that will be asked for any other brokers it is aware of. The integration needs to be able to contact this broker in the address provided in the bootstrap_broker_host parameter for bootstrap discovery to work.

Zookeeper

Alternatively, the Kafka integration can also talk to a Zookeeper server in order to obtain the list of brokers. To do this, the integration needs to be provided with the following:

The list of Zookeeper hosts, zookeeper_hosts, to contact.
The proper authentication secrets to connect with the hosts.
Together with the list of brokers it knows about, Zookeeper will also advertise which connection mechanisms are supported by each broker.
You can configure the Kafka integration to try directly with one of these mechanisms with the preferred_listener parameter. If this parameter is not provided, the integration will try to contact the brokers with all the advertised configurations until one of them succeeds.
Dica
The integration will use Zookeeper only for discovering brokers and will not retrieve metrics from it.

The Kafka integration queries JMX, a standard Java extension for exchanging metrics in Java applications. JMX is not enabled by default in Kafka brokers, and you need to enable it for metrics collection to work properly. JMX requires RMI to be enabled, and the RMI port needs to be set to the same port as JMX.

You can configure JMX to use username/password authentication, as well as SSL. If such features have been enabled in the broker's JMX settings, you need to configure the integration accordingly.

If autodiscovery is set to bootstrap, the JMX settings defined for the bootstrap broker will be applied for all other discovered brokers, so the Port and other settings should be the same on all the brokers.

Importante

We don't recommend enabling anonymous and/or unencrypted JMX/RMI access on public or untrusted network segments because this poses a big security risk.

Install and activate the integration

To install the Kafka integration, follow the instructions for your environment:

Linux installation

Follow the instructions for installing an integration, and replace the INTEGRATION_FILE_NAME variable with nri-kafka.
Change the directory to the integrations configuration folder by running:
bash
```
$cd /etc/newrelic-infra/integrations.d
```
Copy the sample configuration file by running:
bash
```
$sudo cp kafka-config.yml.sample kafka-config.yml
```
Edit the kafka-config.yml configuration file with your favorite editor. Check out some configuration file examples..

Other environments

Download the .exe installer for New Relic's Kafka integration.
To install from the Windows command prompt, run:
bash
```
$PATH\TO\nri-kafka-amd64-installer.exe
```
In the Integrations directory, C:\Program Files\New Relic\newrelic-infra\integrations.d\, create a copy of the sample configuration file by running:
bash
```
$cp kafka-config.yml.sample kafka-config.yml
```
Edit the kafka-config.yml file by using one of the kafka-config.yml sample files.

Additional notes:

Advanced: Integrations are also available in tarball format to allow for install outside of a package manager.
On-host integrations do not automatically update. For best results, regularly update the integration package and the infrastructure agent.

Este documento ajudou você na instalação?

Configure the integration

There are several ways to configure the integration, depending on how it was installed:

If enabled via Kubernetes, see Monitor services running on Kubernetes.
If enabled via Amazon ECS, see Monitor services running on ECS.
If installed on-host, edit the config in the integration's YAML config file, kafka-config.yml. An integration's YAML-format configuration is where you can place required login credentials and configure how data is collected. Which options you change depend on your setup and preference. The configuration file has common settings applicable to all integrations like interval, timeout, inventory_source. To read all about these common settings refer to our Configuration Format document.

Importante

If you are still using our Legacy configuration and definition files, refer to this document for help.

As with other integrations, one kafka-config.yml configuration file can have many instances of the integration collecting different brokers, consumers and producers metrics. You can see config examples with one or multiple instances in the kafka-config.yml sample files

Specific settings related to Kafka are defined using the env section of each instance in the kafka-config.yml configuration file. These settings control the connection to your Brokers, Zookeeper, and JMX as well as other security settings and features. The list of valid settings is described in Kafka's configuration settings.

The integration has two modes of operation on each instance, which are mutually exclusive, that you can set up with the CONSUMER_OFFSET parameter:

Consumer offset collection: set CONSUMER_OFFSET = true to collect KafkaOffsetSample.
Core collection mode: set CONSUMER_OFFSET = false to collect the rest of the samples: KafkaBrokerSample, KafkaTopicSample, KafkaProducerSample, KafkaConsumerSample.

Importante

These modes are are mutually exclusive because consumer offset collection takes a long time to run and has high performance requirements, in order to collect both groups of Samples, set two instances, one with each mode.

The values for these settings can be defined in several ways:

Adding the value directly in the config file. This is the most common way.
Replacing the values from environment variables using the {{ }} notation. Read more about using environment variable passthroughs with on-host integrations or see the example for environment variables replacement.
Using secrets management. Use this to protect sensitive information, such as passwords that would be exposed in plain text on the configuration file. For more information, see secrets management.

Offset monitoring

When setting CONSUMER_OFFSET = true, by default, only the metrics from consumer groups with active consumers (and consumer metrics) will be collected. To also collect the metrics from consumer groups with inactive consumers you must set INACTIVE_CONSUMER_GROUP_OFFSET to true.

When a consumer group is monitoring more than one topic, it's valuable to have consumer group metrics separated by topics, specially if one of the topics have inactive consumers, because then it's possible to spot in which topic the consumer group is having lag and if there are active consumers for that consumer group and topic.

To get consumer group metrics separated by topic, you must set CONSUMER_GROUP_OFFSET_BY_TOPIC to true (it defaults to false)

For more on how to set up offset monitoring, see Configure KafkaOffsetSample collection.

kafka-config.yml sample files

This configuration collects Metrics and Inventory including all topics discovering the brokers from two different JMX hosts :

integrations:
  - name: nri-kafka
    env:
      CLUSTER_NAME: testcluster1
      KAFKA_VERSION: "1.0.0"
      AUTODISCOVER_STRATEGY: zookeeper
      ZOOKEEPER_HOSTS: '[{"host": "localhost", "port": 2181}, {"host": "localhost2", "port": 2181}]'
      ZOOKEEPER_PATH: "/kafka-root"
      DEFAULT_JMX_USER: username
      DEFAULT_JMX_PASSWORD: password
      TOPIC_MODE: all
    interval: 15s
    labels:
      env: production
      role: kafka
    inventory_source: config/kafka

This configuration collects Metrics and Inventory discovering the brokers from a JMX host with SSL :

integrations:
  - name: nri-kafka
    env:
      CLUSTER_NAME: testcluster1
      KAFKA_VERSION: "1.0.0"
      AUTODISCOVER_STRATEGY: zookeeper
      ZOOKEEPER_HOSTS: '[{"host": "localhost", "port": 2181}]'
      ZOOKEEPER_PATH: "/kafka-root"
      DEFAULT_JMX_USER: username
      DEFAULT_JMX_PASSWORD: password

      KEY_STORE: "/path/to/your/keystore"
      KEY_STORE_PASSWORD: keystore_password
      TRUST_STORE: "/path/to/your/truststore"
      TRUST_STORE_PASSWORD: truststore_password

      TIMEOUT: 10000  #The timeout for individual JMX queries in milliseconds.
    interval: 15s
    labels:
      env: production
      role: kafka
    inventory_source: config/kafka

This configuration collects Metrics and Inventory including all topics discovering the brokers from one bootstrap broker :

integrations:
  - name: nri-kafka
    env:
      CLUSTER_NAME: testcluster1
      AUTODISCOVER_STRATEGY: bootstrap
      BOOTSTRAP_BROKER_HOST: localhost
      BOOTSTRAP_BROKER_KAFKA_PORT: 9092
      BOOTSTRAP_BROKER_KAFKA_PROTOCOL: PLAINTEXT
      BOOTSTRAP_BROKER_JMX_PORT: 9999  # This same port will be used to connect to all discover broker JMX
      BOOTSTRAP_BROKER_JMX_USER: admin
      BOOTSTRAP_BROKER_JMX_PASSWORD: password

      LOCAL_ONLY_COLLECTION: false

      COLLECT_BROKER_TOPIC_DATA: true
      TOPIC_MODE: "all"
      COLLECT_TOPIC_SIZE: false
    interval: 15s
    labels:
      env: production
      role: kafka
    inventory_source: config/kafka

This configuration collects only Metrics discovering the brokers from one bootstrap broker listening with TLS protocol :

integrations:
  - name: nri-kafka
    env:
      METRICS: true
      CLUSTER_NAME: testcluster1
      AUTODISCOVER_STRATEGY: bootstrap
      BOOTSTRAP_BROKER_HOST: localhost
      BOOTSTRAP_BROKER_KAFKA_PORT: 9092
      BOOTSTRAP_BROKER_KAFKA_PROTOCOL: SSL
      BOOTSTRAP_BROKER_JMX_PORT: 9999
      BOOTSTRAP_BROKER_JMX_USER: admin
      BOOTSTRAP_BROKER_JMX_PASSWORD: password

      # Kerberos authentication arguments
      TLS_CA_FILE: "/path/to/CA.pem"
      TLS_CERT_FILE: "/path/to/cert.pem"
      TLS_KEY_FILE: "/path/to/key.pem"
      TLS_INSECURE_SKIP_VERIFY: false
    interval: 15s
    labels:
      env: production
      role: kafka
    inventory_source: config/kafka

This configuration collects only Metrics discovering the brokers from one bootstrap broker in a Kerberos Auth Cluster :

integrations:
  - name: nri-kafka
    env:
      METRICS: true
      CLUSTER_NAME: testcluster1
      AUTODISCOVER_STRATEGY: bootstrap
      BOOTSTRAP_BROKER_HOST: localhost
      BOOTSTRAP_BROKER_KAFKA_PORT: 9092
      BOOTSTRAP_BROKER_KAFKA_PROTOCOL: PLAINTEXT # Currently support PLAINTEXT and SSL
      BOOTSTRAP_BROKER_JMX_PORT: 9999
      BOOTSTRAP_BROKER_JMX_USER: admin
      BOOTSTRAP_BROKER_JMX_PASSWORD: password

      # Kerberos authentication arguments
      SASL_MECHANISM: GSSAPI
      SASL_GSSAPI_REALM: SOMECORP.COM
      SASL_GSSAPI_SERVICE_NAME: Kafka
      SASL_GSSAPI_USERNAME: kafka
      SASL_GSSAPI_KEY_TAB_PATH: /etc/newrelic-infra/kafka.keytab
      SASL_GSSAPI_KERBEROS_CONFIG_PATH: /etc/krb5.conf
      SASL_GSSAPI_DISABLE_FAST_NEGOTIATION: false
    interval: 15s
    labels:
      env: production
      role: kafka
    inventory_source: config/kafka

This configuration collects Metrics splitting topic collection between 3 different instances:

integrations:
  - name: nri-kafka
    env:
      METRICS: true
      CLUSTER_NAME: testcluster1
      KAFKA_VERSION: "1.0.0"
      AUTODISCOVER_STRATEGY: zookeeper
      ZOOKEEPER_HOSTS: '[{"host": "host1", "port": 2181}]'
      ZOOKEEPER_AUTH_SECRET: "username:password"
      ZOOKEEPER_PATH: "/kafka-root"
      DEFAULT_JMX_USER: username
      DEFAULT_JMX_PASSWORD: password
      TOPIC_MODE: regex
      TOPIC_REGEX: 'topic\d+'
      TOPIC_BUCKET: '1/3'
    interval: 15s
    labels:
      env: production
      role: kafka
    inventory_source: config/kafka
  - name: nri-kafka
    env:
      METRICS: true
      CLUSTER_NAME: testcluster2
      KAFKA_VERSION: "1.0.0"
      AUTODISCOVER_STRATEGY: zookeeper
      ZOOKEEPER_HOSTS: '[{"host": "host2", "port": 2181}]'
      ZOOKEEPER_AUTH_SECRET: "username:password"
      ZOOKEEPER_PATH: "/kafka-root"
      DEFAULT_JMX_USER: username
      DEFAULT_JMX_PASSWORD: password
      TOPIC_MODE: regex
      TOPIC_REGEX: 'topic\d+'
      TOPIC_BUCKET: '2/3'
    interval: 15s
    labels:
      env: production
      role: kafka
    inventory_source: config/kafka
  - name: nri-kafka
    env:
      METRICS: true
      CLUSTER_NAME: testcluster3
      KAFKA_VERSION: "1.0.0"
      AUTODISCOVER_STRATEGY: zookeeper
      ZOOKEEPER_HOSTS: '[{"host": "host3", "port": 2181}]'
      ZOOKEEPER_AUTH_SECRET: "username:password"
      ZOOKEEPER_PATH: "/kafka-root"
      DEFAULT_JMX_USER: username
      DEFAULT_JMX_PASSWORD: password
      TOPIC_MODE: regex
      TOPIC_REGEX: 'topic\d+'
      TOPIC_BUCKET: '3/3'
    interval: 15s
    labels:
      env: production
      role: kafka
    inventory_source: config/kafka

This gives an example for collecting JMX metrics from Java consumers and producers:

integrations:
  - name: nri-kafka
    env:
      METRICS: "true"
      CLUSTER_NAME: "testcluster3"
      PRODUCERS: '[{"host": "localhost", "port": 24, "username": "me", "password": "secret"}]'
      CONSUMERS: '[{"host": "localhost", "port": 24, "username": "me", "password": "secret"}]'
      DEFAULT_JMX_HOST: "localhost"
      DEFAULT_JMX_PORT: "9999"
    interval: 15s
    labels:
      env: production
      role: kafka
    inventory_source: config/kafka

This configuration collects consumer offset Metrics and Inventory for the cluster:

integrations:
  - name: nri-kafka
    env:
      CONSUMER_OFFSET: true
      CLUSTER_NAME: testcluster3
      AUTODISCOVER_STRATEGY: bootstrap
      BOOTSTRAP_BROKER_HOST: localhost
      BOOTSTRAP_BROKER_KAFKA_PORT: 9092
      BOOTSTRAP_BROKER_KAFKA_PROTOCOL: PLAINTEXT
      # A regex pattern that matches the consumer groups to collect metrics from
      CONSUMER_GROUP_REGEX: '.*'
    interval: 15s
    labels:
      env: production
      role: kafka
    inventory_source: config/kafka

Configuration options for the integration

For more on how to find and use your data, see Kafka's configuration settings.

Find and use data

Data from this service is reported to an integration dashboard.

Kafka data is attached to the following event types:

KafkaBrokerSample
KafkaTopicSample
KafkaProducerSample
KafkaConsumerSample
KafkaOffsetSample

You can query this data for troubleshooting purposes or to create charts and dashboards.

For more on how to find and use your data, see how to understand integration data.

Metrics collected by the integration

The Kafka integration collects the following metrics. Each metric name is prefixed with a category indicator and a period, such as broker. or consumer..

Metric	Description
`broker.bytesWrittenToTopicPerSecond`	Number of bytes written to a topic by the broker per second.
`broker.IOInPerSecond`	Network IO into brokers in the cluster in bytes per second.
`broker.IOOutPerSecond`	Network IO out of brokers in the cluster in bytes per second.
`broker.logFlushPerSecond`	Log flush rate.
`broker.messagesInPerSecond`	Incoming messages per second.
`follower.requestExpirationPerSecond`	Rate of request expiration on followers in evictions per second.
`net.bytesRejectedPerSecond`	Rejected bytes per second.
`replication.isrExpandsPerSecond`	Rate of replicas joining the ISR pool.
`replication.isrShrinksPerSecond`	Rate of replicas leaving the ISR pool.
`replication.leaderElectionPerSecond`	Leader election rate.
`replication.uncleanLeaderElectionPerSecond`	Unclean leader election rate.
`replication.unreplicatedPartitions`	Number of unreplicated partitions.
`request.avgTimeFetch`	Average time per fetch request in milliseconds.
`request.avgTimeMetadata`	Average time for metadata request in milliseconds.
`request.avgTimeMetadata99Percentile`	Time for metadata requests for 99th percentile in milliseconds.
`request.avgTimeOffset`	Average time for an offset request in milliseconds.
`request.avgTimeOffset99Percentile`	Time for offset requests for 99th percentile in milliseconds.
`request.avgTimeProduceRequest`	Average time for a produce request in milliseconds.
`request.avgTimeUpdateMetadata`	Average time for a request to update metadata in milliseconds.
`request.avgTimeUpdateMetadata99Percentile`	Time for update metadata requests for 99th percentile in milliseconds.
`request.clientFetchesFailedPerSecond`	Client fetch request failures per second.
`request.fetchTime99Percentile`	Time for fetch requests for 99th percentile in milliseconds.
`request.handlerIdle`	Average fraction of time the request handler threads are idle.
`request.produceRequestsFailedPerSecond`	Failed produce requests per second.
`request.produceTime99Percentile`	Time for produce requests for 99th percentile.
`topic.diskSize`	Topic disk size per broker and per topic. Only present if `COLLECT_TOPIC_SIZE` is enabled.
`topic.offset`	Topic offset per broker and per topic. Only present if `COLLECT_TOPIC_OFFSET` is enabled.

Metric	Description
`consumer.avgFetchSizeInBytes`	Average number of bytes fetched per request for a specific topic.
`consumer.avgRecordConsumedPerTopic`	Average number of records in each request for a specific topic.
`consumer.avgRecordConsumedPerTopicPerSecond`	Average number of records consumed per second for a specific topic in records per second.
`consumer.bytesInPerSecond`	Consumer bytes per second.
`consumer.fetchPerSecond`	The minimum rate at which the consumer sends fetch requests to a broke in requests per second.
`consumer.maxFetchSizeInBytes`	Maximum number of bytes fetched per request for a specific topic.
`consumer.maxLag`	Maximum consumer lag.
`consumer.messageConsumptionPerSecond`	Rate of consumer message consumption in messages per second.
`consumer.offsetKafkaCommitsPerSecond`	Rate of offset commits to Kafka in commits per second.
`consumer.offsetZooKeeperCommitsPerSecond`	Rate of offset commits to ZooKeeper in writes per second.
`consumer.requestsExpiredPerSecond`	Rate of delayed consumer request expiration in evictions per second.

Metric	Description
`producer.ageMetadataUsedInMilliseconds`	Age in seconds of the current producer metadata being used.
`producer.availableBufferInBytes`	Total amount of buffer memory that is not being used in bytes.
`producer.avgBytesSentPerRequestInBytes`	Average number of bytes sent per partition per-request.
`producer.avgCompressionRateRecordBatches`	Average compression rate of record batches.
`producer.avgRecordAccumulatorsInMilliseconds`	Average time in ms record batches spent in the record accumulator.
`producer.avgRecordSizeInBytes`	Average record size in bytes.
`producer.avgRecordsSentPerSecond`	Average number of records sent per second.
`producer.avgRecordsSentPerTopicPerSecond`	Average number of records sent per second for a topic.
`producer.AvgRequestLatencyPerSecond`	Producer average request latency.
`producer.avgThrottleTime`	Average time that a request was throttled by a broker in milliseconds.
`producer.bufferMemoryAvailableInBytes`	Maximum amount of buffer memory the client can use in bytes.
`producer.bufferpoolWaitTime`	Faction of time an appender waits for space allocation.
`producer.bytesOutPerSecond`	Producer bytes per second out.
`producer.compressionRateRecordBatches`	Average compression rate of record batches for a topic.
`producer.iOWaitTime`	Producer I/O wait time in milliseconds.
`producer.maxBytesSentPerRequestInBytes`	Max number of bytes sent per partition per-request.
`producer.maxRecordSizeInBytes`	Maximum record size in bytes.
`producer.maxRequestLatencyInMilliseconds`	Maximum request latency in milliseconds.
`producer.maxThrottleTime`	Maximum time a request was throttled by a broker in milliseconds.
`producer.messageRatePerSecond`	Producer messages per second.
`producer.responsePerSecond`	Number of producer responses per second.
`producer.requestPerSecond`	Number of producer requests per second.
`producer.requestsWaitingResponse`	Current number of in-flight requests awaiting a response.
`producer.threadsWaiting`	Number of user threads blocked waiting for buffer memory to enqueue their records.

Metric	Description
`topic.partitionsWithNonPreferredLeader`	Number of partitions per topic that are not being led by their preferred replica.
`topic.respondMetaData`	Number of topics responding to meta data requests.
`topic.retentionSizeOrTime`	Whether a partition is retained by size or both size and time. A value of 0 = time and a value of 1 = both size and time.
`topic.underReplicatedPartitions`	Number of partitions per topic that are under-replicated.

Metric	Description
`consumer.offset`	The last consumed offset on a partition by the consumer group.
`consumer.lag`	The difference between a broker's high water mark and the consumer's offset (`consumer.hwm` - `consumer.offset`).
`consumer.hwm`	The offset of the last message written to a partition (high water mark).
`consumer.totalLag`	The sum of lags across partitions consumed by a consumer.
`consumerGroup.totalLag`	The sum of lags across all partitions consumed by a `consumerGroup`.
`consumerGroup.maxLag`	The maximum lag across all partitions consumed by a `consumerGroup`.
`consumerGroup.activeConsumers`	The number of active consumers in this `consumerGroup`.

Dica

Compatibility and requirements .css-21sua1{background:none;border:none;width:0;padding:0;}

Kafka versions

Supported operating systems

System requirements

Connectivity requirements

Importante

Prepare for the installation

Topic listing

Broker monitoring (JMX)

Consumer offset

Producer and consumer monitoring (JMX)

Install and activate the integration

Linux installation

Other environments

Windows installation

Amazon ECS installation

Kubernetes installation

Este documento ajudou você na instalação?

Configure the integration

Importante

Importante

Offset monitoring

kafka-config.yml sample files

Zookeeper discovery

Zookeeper discovery with SSL based JMX connection

Bootstrap discovery

Bootstrap discovery TLS

Bootstrap discovery kerberos auth

Zookeeper dicsovery topic bucket

Java consumer and producer

Consumer offset

Configuration options for the integration

Find and use data

Metrics collected by the integration

KafkaBrokerSample event

KafkaConsumerSample event

KafkaProducerSample event

KafkaTopicSample event

KafkaOffsetSample event

Compatibility and requirements