The New Relic Kafka on-host integration reports metrics and configuration data from your Kafka service. We instrument all the key elements of your cluster, including brokers (both ZooKeeper and Bootstrap), producers, consumers, and topics.
Read on to install the Kafka integration, and to see what data it collects. To monitor Kafka with our Java agent, see Instrument Kafka message queues.
Our integration is compatible with Kafka versions 0.8 or higher.
Before installing the integration, make sure that you meet the following requirements:
- If Kafka is not running on Kubernetes or Amazon ECS, you must install the infrastructure agent on a host that's running Kafka. Otherwise:
- Java 8 or higher
- JMX enabled on all brokers
- Java-based consumers and producers only, and with JMX enabled
- Total number of monitored topics must be fewer than 300
For Kafka running on Kubernetes, see the Kubernetes requirements.
Kafka is a complex piece of software that is built as a distributed system. For this reason, you’ll need to ensure that the integration can contact all the required hosts and services so the data is collected correctly.
Given the distributed nature of Kafka, the actual number and list of brokers is usually not fixed by the configuration, and it is instead quite dynamic. For this reason, the Kafka integration offers two mechanisms to perform automatic discovery of the list of brokers in the cluster: Bootstrap and Zookeeper. The mechanism you use depends on the setup of the Kafka cluster being monitored.
With the bootstrap mechanism, the integration uses a bootstrap broker to perform the autodiscovery. This is a broker whose address is well known and that will be asked for any other brokers it is aware of. The integration needs to be able to contact this broker in the address provided in the bootstrap_broker_host parameter for bootstrap discovery to work.
Alternatively, the Kafka integration can also talk to a Zookeeper server in order to obtain the list of brokers. To do this, the integration needs to be provided with the following:
- The list of Zookeeper hosts to contact (zookeeper_hosts).
- The proper authentication secrets to connect with the hosts.
Together with the list of brokers it knows about, Zookeeper will also advertise which connection mechanisms are supported by each broker.
You can configure the Kafka integration to try directly with one of these mechanisms with the preferred_listener parameter. If this parameter is not provided, the integration will try to contact the brokers with all the advertised configurations until one of them succeeds.
The integration will use Zookeeper only for discovering brokers and will not retrieve metrics from it.
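The listener selection described above can be sketched as follows. This is a minimal illustration, not the integration's actual code: the function name pick_listener and the can_connect callback are hypothetical, and the sketch assumes that a preferred listener, when set, is the only one tried.

```python
from typing import Callable, Optional, Sequence


def pick_listener(
    advertised: Sequence[str],
    can_connect: Callable[[str], bool],
    preferred: Optional[str] = None,
) -> Optional[str]:
    """Return the first advertised listener that passes a test connection.

    advertised:  protocols advertised for a broker, in the order received.
    can_connect: hypothetical callback that performs the test connection.
    preferred:   optional protocol that mirrors preferred_listener.
    """
    if preferred is not None:
        # Assumption: with a preferred listener, only that one is attempted.
        candidates = [p for p in advertised if p == preferred]
    else:
        # Otherwise every advertised configuration is tried in order.
        candidates = list(advertised)

    for protocol in candidates:
        if can_connect(protocol):
            return protocol
    return None
```

Passing the connection test in as a callback keeps the selection logic independent of any particular Kafka client library.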
To correctly list the topics processed by the brokers, the integration needs to contact brokers over the Kafka protocol. Depending on how the brokers are configured, this might require setting up SSL and/or SASL to match the broker configuration.
The Kafka integration queries JMX, a standard Java extension for exchanging metrics in Java applications. JMX is not enabled by default in Kafka brokers, and you need to enable it for metrics collection to work properly. JMX requires RMI to be enabled, and the RMI port needs to be set to the same port as JMX.
You can configure JMX to use username/password authentication, as well as SSL. If such features have been enabled in the broker's JMX settings, you need to configure the integration accordingly.
We do not recommend enabling anonymous and/or unencrypted JMX/RMI access on public or untrusted network segments because this poses a significant security risk.
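As a rough illustration of the JMX settings discussed above, the following sketch assembles the standard JVM system properties that enable remote JMX with the RMI port pinned to the JMX port. The helper name jmx_jvm_options, the port 9999, and the password file path are assumptions for the example. How the resulting string reaches the broker JVM depends on your setup; Kafka's start scripts commonly read it from the KAFKA_JMX_OPTS environment variable.

```python
def jmx_jvm_options(port, authenticate=True, ssl=True, password_file=None):
    """Build JVM flags that enable remote JMX on a single port.

    The RMI port is set to the same value as the JMX port, so only one
    port has to be reachable from the host running the integration.
    """
    opts = [
        "-Dcom.sun.management.jmxremote",
        f"-Dcom.sun.management.jmxremote.port={port}",
        f"-Dcom.sun.management.jmxremote.rmi.port={port}",
        # Authentication and SSL default to enabled in this sketch,
        # in line with the warning about anonymous or unencrypted access.
        f"-Dcom.sun.management.jmxremote.authenticate={str(authenticate).lower()}",
        f"-Dcom.sun.management.jmxremote.ssl={str(ssl).lower()}",
    ]
    if authenticate and password_file:
        opts.append(
            f"-Dcom.sun.management.jmxremote.password.file={password_file}"
        )
    return " ".join(opts)


if __name__ == "__main__":
    # Hypothetical port and path, for illustration only.
    print(jmx_jvm_options(9999, password_file="/etc/kafka/jmx.password"))
```

Whatever authentication and SSL settings you choose here need to be mirrored in the integration's JMX connection options.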
Producers and consumers written in Java can also be monitored through the same mechanism (JMX). JMX needs to be enabled and configured on those applications where it is not enabled by default.
Non-Java producers and consumers do not support JMX and are therefore not supported by the Kafka integration.
As a summary, the integration needs to be configured and allowed to connect to:
- Hosts listed in zookeeper_hosts over the Zookeeper protocol, using the Zookeeper authentication mechanism (if autodiscover_strategy is set to zookeeper).
- Hosts defined in bootstrap_broker_host over the Kafka protocol, using the Kafka broker’s authentication/transport mechanisms (if autodiscover_strategy is set to bootstrap).
- All brokers in the cluster over the Kafka protocol and port, using the Kafka brokers' authentication/transport mechanisms.
- All brokers in the cluster over the JMX protocol and port, using the authentication/transport mechanisms specified in the JMX configuration of the brokers.
- All producers/consumers specified in producers and consumers over the JMX protocol and port, if you want producer/consumer monitoring. JMX settings for the consumer must be the same as for the brokers.
For the cloud: By default, Security Groups in AWS (and their equivalents in other cloud providers) do not have the required ports open. JMX requires two ports in order to work: the JMX port and the RMI port. These can be set to the same value when configuring the JVM to enable JMX, and they must be open for the integration to connect to and collect metrics from brokers.
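Before troubleshooting the integration itself, it can help to confirm that the required ports are reachable from the host running the infrastructure agent. The sketch below is a simple TCP check using Python's standard library. The function names are hypothetical, and a successful TCP connection only shows that the port is open, not that JMX or Kafka authentication will succeed.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


def unreachable_ports(host, ports, timeout=2.0):
    """Return the subset of ports on host that could not be reached."""
    return [p for p in ports if not port_is_open(host, p, timeout)]
```

For example, calling unreachable_ports with a broker's hostname and its Kafka and JMX ports returns an empty list when both are reachable. The hostname and port numbers are whatever your cluster uses; none are assumed here.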
To install the Kafka integration, choose your setup:
- Advanced: It's also possible to install the integration from a tarball file. This gives you full control over the installation and configuration process.
- On-host integrations do not automatically update. For best results, regularly update the integration package and the infrastructure agent.
An integration's YAML-format configuration is where you can place required login credentials and configure how data is collected. Which options you change depends on your setup and preference. The entire environment can be monitored remotely or on any node in that environment.
There are several ways to configure the integration, depending on how it was installed:
- If enabled via Kubernetes: see Monitor services running on Kubernetes.
- If enabled via Amazon ECS: see Monitor services running on ECS.
- If installed on-host: edit the config in the integration's YAML config file.
For examples of typical configurations, see the example configurations.
With secrets management, you can configure on-host integrations with New Relic infrastructure's agent to use sensitive data (such as passwords) without having to write them as plain text into the integration's configuration file. For more information, see Secrets management.
The configuration accepts the following commands:
inventory: collects configuration status
metrics: collects performance metrics
consumer_offset: collects consumer group offset data
The configuration accepts the following arguments:
cluster_name: user-defined name to uniquely identify the cluster being monitored. Required.
kafka_version: the version of the Kafka broker you're connecting to, used for setting optimum API versions. Defaults to 1.0.0. Versions older than 1.0.0 may be missing some features.
autodiscover_strategy: the method of discovering brokers. Options are zookeeper or bootstrap. Defaults to zookeeper.
Zookeeper autodiscovery arguments (only relevant when autodiscover_strategy is zookeeper):
zookeeper_hosts: the list of Apache ZooKeeper hosts (in JSON format) that need to be connected.
zookeeper_auth_scheme: the ZooKeeper authentication scheme that is used to connect. Currently, the only supported value is digest. If omitted, no authentication is used.
zookeeper_auth_secret: the ZooKeeper authentication secret that is used to connect. Should be of the form username:password. Only required if zookeeper_auth_scheme is specified.
zookeeper_path: the Zookeeper node under which the Kafka configuration resides. Defaults to
preferred_listener: use a specific listener to connect to a broker. If unset, the first listener that passes a successful test connection is used. Supported values are PLAINTEXT, SASL_PLAINTEXT, SSL, and SASL_SSL. Note: The SASL_* protocols only support Kerberos (GSSAPI) authentication.
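As an illustration of the Zookeeper arguments above, the sketch below builds a JSON string for zookeeper_hosts and checks that an authentication secret has the username:password form. The JSON key names host and port are an assumption made for this example, as are the function names. Check the sample configuration file for the exact shape the integration expects.

```python
import json


def zookeeper_hosts_json(hosts):
    """Serialize (host, port) pairs into a JSON list of objects.

    The "host" and "port" keys are assumed for illustration.
    """
    return json.dumps([{"host": h, "port": p} for h, p in hosts])


def is_valid_auth_secret(secret):
    """Check that a secret looks like username:password."""
    username, sep, password = secret.partition(":")
    # A colon must be present, with non-empty text on both sides.
    return bool(sep) and bool(username) and bool(password)
```

Validating the secret's shape before starting the agent can save a round of debugging failed Zookeeper connections.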
Bootstrap broker discovery arguments (only relevant when autodiscover_strategy is bootstrap):
bootstrap_broker_host: the host for the bootstrap broker.
bootstrap_broker_kafka_port: the Kafka port for the bootstrap broker.
bootstrap_broker_kafka_protocol: the protocol to use to connect to the bootstrap broker. Supported values are PLAINTEXT, SASL_PLAINTEXT, SSL, and SASL_SSL. Note: The SASL_* protocols only support Kerberos (GSSAPI) authentication. Default: PLAINTEXT.
bootstrap_broker_jmx_port: the JMX port to use for collection.
bootstrap_broker_jmx_user: the JMX user to use for collection.
bootstrap_broker_jmx_password: the JMX password to use for collection.
Producer and consumer collection:
producers: producers to collect. For each producer, a name, host, port, username, and password can be provided in JSON form. name is the producer’s name as it appears in Kafka. host, port, username, and password are optional and use the default if unspecified.
consumers: consumers to collect. For each consumer, a name, host, port, username, and password can be specified in JSON form. name is the consumer’s name as it appears in Kafka. host, port, username, and password are optional and use the default if unspecified.
JMX connection options:
default_jmx_host: the default host to collect JMX metrics. If the host field is omitted from a producer or consumer configuration, this value will be used.
default_jmx_port: the default port to collect JMX metrics. If the port field is omitted from a producer or consumer configuration, this value will be used.
default_jmx_user: the default user that is connecting to the JMX host to collect metrics. This field should only be used if all brokers have a non-default username. If the username field is omitted from a producer or consumer configuration, this value will be used.
default_jmx_password: the default password to connect to the JMX host. This field should only be used if all brokers have a non-default password. If the password field is omitted from a producer or consumer configuration, this value will be used.
key_store: the filepath of the keystore containing the JMX client's SSL certificate.
key_store_password: the password for the JMX SSL key store.
trust_store: the filepath of the trust keystore containing the JMX server's SSL certificate.
trust_store_password: the password for the JMX trust store.
timeout: the timeout for individual JMX queries in milliseconds. Default:
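The fallback behaviour of the default_jmx_* options can be sketched like this. It is an illustration only: the function name is hypothetical, and the entry keys (name, host, port, username, password) are assumed from the field names mentioned above rather than taken from the integration's source.

```python
def resolve_jmx_targets(entries, defaults):
    """Fill in missing JMX connection fields from the defaults.

    entries:  list of dicts describing producers or consumers.
    defaults: dict with the default host, port, username, and password.
    """
    resolved = []
    for entry in entries:
        if "name" not in entry:
            raise ValueError("each producer/consumer entry needs a name")
        # Start from the defaults, then let explicit values override them.
        target = dict(defaults)
        target.update({k: v for k, v in entry.items() if v is not None})
        resolved.append(target)
    return resolved


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    defaults = {"host": "localhost", "port": 9999,
                "username": "monitor", "password": "secret"}
    producers = [{"name": "orders-producer"},
                 {"name": "billing-producer", "host": "10.0.0.7", "port": 9998}]
    for target in resolve_jmx_targets(producers, defaults):
        print(target)
```

In this sketch only name is mandatory, so a fleet of producers that share JMX settings can be listed by name alone.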
Broker connection options:
tls_ca_file: the certificate authority file for SSL and SASL_SSL listeners, in PEM format.
tls_cert_file: the client certificate file for SSL and SASL_SSL listeners, in PEM format.
tls_key_file: the client key file for SSL and SASL_SSL listeners, in PEM format.
tls_insecure_skip_verify: skip verifying the server's certificate chain and host name
sasl_mechanism: the type of SASL authentication to use. Supported options are
sasl_gssapi_realm: Kerberos realm. Required for
sasl_gssapi_service_name: Kerberos service name. Required for
sasl_gssapi_username: Kerberos username. Required for
sasl_gssapi_key_tab_path: path to the Kerberos keytab. Required for
sasl_gssapi_kerberos_config_path: path to the Kerberos config file. Default:
collect_broker_topic_data: signals if broker and topic metrics are collected. Options are true or false, defaults to true. Should only be set to false when monitoring only producers and consumers, and topic_mode is set to all.
local_only_collection: collect only the metrics related to the configured bootstrap broker. Only used if autodiscover_strategy is bootstrap.
consumer_group_regex: regex pattern that matches the consumer groups to collect offset statistics for. This is limited to collecting statistics for 300 consumer groups. Note: consumer_groups has been deprecated; use this argument instead.
topic_mode: determines how many topics we collect. Options are all, none, list, or regex.
collect_topic_size: collect the metric Topic size. Options are true or false, defaults to false. topic_size is a resource-intensive metric to collect.
topic_list: array of topic names to monitor. Only in effect if topic_mode is set to list.
topic_regex: regex pattern that matches the topic names to monitor. Only in effect if topic_mode is set to regex.
topic_bucket: used to split topic collection across multiple instances. Should be of the form <bucket number>/<number of buckets>. Default: 1/1.
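To make the topic selection options more concrete, the sketch below filters topic names with a regex and then assigns each topic to one of several buckets, so that multiple instances can split the work. This is one possible scheme, not a description of the integration's internals: the function names are hypothetical, and the use of a CRC32 hash to assign topics to buckets is an assumption chosen because it is stable across processes.

```python
import re
import zlib


def parse_bucket(spec):
    """Parse '<bucket number>/<number of buckets>' into two integers."""
    number, _, total = spec.partition("/")
    number, total = int(number), int(total)  # raises ValueError if malformed
    if total < 1 or not 1 <= number <= total:
        raise ValueError(f"invalid topic bucket: {spec!r}")
    return number, total


def in_bucket(topic, spec):
    """Return True if the topic belongs to the bucket described by spec."""
    number, total = parse_bucket(spec)
    # Assumed scheme: a stable hash of the topic name, modulo bucket count.
    return zlib.crc32(topic.encode("utf-8")) % total == number - 1


def select_topics(topics, pattern, spec="1/1"):
    """Keep topics that match the regex and fall into the given bucket.

    The "1/1" spec means a single bucket, that is, no splitting.
    """
    regex = re.compile(pattern)
    return [t for t in topics if regex.search(t) and in_bucket(t, spec)]
```

With a scheme like this, running one instance per bucket (for example 1/3, 2/3, and 3/3) covers every matching topic exactly once, which can help keep each instance below the monitored-topic limit.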
Labels are optional tags which help to identify collection data. Some examples are included below.
env: label to identify the environment. For example:
role: label to identify which role is accessing the data.
For more details on configuration parameters, see the kafka-config.yml.sample config file on GitHub.
For more about the general structure of on-host integration configuration, see Configuration.
Data from this service is reported to an integration dashboard.
Kafka data is attached to the following event types:
You can query this data for troubleshooting purposes or to create charts and dashboards.
For more on how to find and use your data, see Understand integration data.
The Kafka integration collects the following metric data attributes. Each metric name is prefixed with a category indicator and a period, such as
Number of bytes written to a topic by the broker per second.
Network IO into brokers in the cluster in bytes per second.
Network IO out of brokers in the cluster in bytes per second.
Log flush rate.
Incoming messages per second.
Rate of request expiration on followers in evictions per second.
Rejected bytes per second.
Rate of replicas joining the ISR pool.
Rate of replicas leaving the ISR pool.
Leader election rate.
Unclean leader election rate.
Number of unreplicated partitions.
Average time per fetch request in milliseconds.
Average time for metadata request in milliseconds.
Time for metadata requests for 99th percentile in milliseconds.
Average time for an offset request in milliseconds.
Time for offset requests for 99th percentile in milliseconds.
Average time for a produce request in milliseconds.
Average time for a request to update metadata in milliseconds.
Time for update metadata requests for 99th percentile in milliseconds.
Client fetch request failures per second.
Time for fetch requests for 99th percentile in milliseconds.
Average fraction of time the request handler threads are idle.
Failed produce requests per second.
Time for produce requests for 99th percentile in milliseconds.
Average number of bytes fetched per request for a specific topic.
Average number of records in each request for a specific topic.
Average number of records consumed per second for a specific topic in records per second.
Consumer bytes per second.
The minimum rate at which the consumer sends fetch requests to a broker in requests per second.
Maximum number of bytes fetched per request for a specific topic.
Maximum consumer lag.
Rate of consumer message consumption in messages per second.
Rate of offset commits to Kafka in commits per second.
Rate of offset commits to ZooKeeper in writes per second.
Rate of delayed consumer request expiration in evictions per second.
Age in seconds of the current producer metadata being used.
Total amount of buffer memory that is not being used in bytes.
Average number of bytes sent per partition per request.
Average compression rate of record batches.
Average time in ms record batches spent in the record accumulator.
Average record size in bytes.
Average number of records sent per second.
Average number of records sent per second for a topic.
Producer average request latency.
Average time that a request was throttled by a broker in milliseconds.
Maximum amount of buffer memory the client can use in bytes.
Fraction of time an appender waits for space allocation.
Producer bytes per second out.
Average compression rate of record batches for a topic.
Producer I/O wait time in milliseconds.
Maximum number of bytes sent per partition per request.
Maximum record size in bytes.
Maximum request latency in milliseconds.
Maximum time a request was throttled by a broker in milliseconds.
Producer messages per second.
Number of producer responses per second.
Number of producer requests per second.
Current number of in-flight requests awaiting a response.
Number of user threads blocked waiting for buffer memory to enqueue their records.
Current topic disk size per broker in bytes.
Number of partitions per topic that are not being led by their preferred replica.
Number of topics responding to metadata requests.
Whether a partition is retained by size or both size and time. A value of 0 = time and a value of 1 = both size and time.
Number of partitions per topic that are under-replicated.
The last consumed offset on a partition by the consumer group.
The difference between a broker's high water mark and the consumer's offset.
The offset of the last message written to a partition (high water mark).
The sum of lags across partitions consumed by a consumer.
The sum of lags across all partitions consumed by a consumer group.
The maximum lag across all partitions consumed by a consumer group.
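The relationship between the offset metrics above can be illustrated with a short calculation. The sketch assumes lag is the high water mark minus the last consumed offset for each partition, as described, and then aggregates it. The function names and the sample numbers are hypothetical.

```python
def partition_lag(high_water_mark, consumer_offset):
    """Lag for one partition: high water mark minus the consumed offset."""
    return high_water_mark - consumer_offset


def group_lag_summary(partitions):
    """Aggregate lag over the partitions consumed by one consumer group.

    partitions: mapping of (topic, partition) -> (high water mark, offset).
    """
    lags = {
        tp: partition_lag(hwm, offset)
        for tp, (hwm, offset) in partitions.items()
    }
    return {
        "total_lag": sum(lags.values()),
        "max_lag": max(lags.values(), default=0),
    }


if __name__ == "__main__":
    # Hypothetical offsets for three partitions of one topic.
    sample = {
        ("orders", 0): (100, 90),
        ("orders", 1): (50, 50),
        ("orders", 2): (200, 175),
    }
    print(group_lag_summary(sample))
```

A growing total lag with a stable maximum usually points to a general throughput shortfall, while a single large maximum suggests one slow or stuck partition.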
The Kafka integration captures the non-default broker and topic configuration parameters, and collects the topic partition schemes as reported by ZooKeeper. The data is available on the Inventory UI page under the
This integration is open source software. That means you can browse its source code and send improvements or create your own fork and build it.