Important
Enable the AWS CloudWatch Metric Streams integration to monitor all CloudWatch metrics from your AWS services, including custom namespaces. Individual integrations are no longer our recommended option.
New Relic infrastructure integrations include an integration for reporting Amazon Elasticsearch data to New Relic. This document explains the integration's features, how to activate it, and what data can be reported.
Features
Amazon Elasticsearch Service is a fully managed service that delivers Elasticsearch’s easy-to-use APIs and real-time capabilities along with the availability, scalability, and security required by production workloads. New Relic's Elasticsearch monitoring integration lets you track cluster status, CPU utilization, read/write latency, throughput, and other metrics at specific points in time. You can also query, analyze, and chart your Elasticsearch data.
Activate integration
To enable this integration, follow standard procedures to connect AWS services to New Relic.
Configuration and polling
You can change the polling frequency and filter data using configuration options.
Default polling information for the Amazon Elasticsearch integration:
- New Relic polling interval: 5 minutes
- Amazon CloudWatch data interval: 1 minute
View and use data
To view and use this integration's data, go to one.newrelic.com > All capabilities > Infrastructure > AWS and select one of the Elasticsearch integration links.
To query and explore your data, use the DatastoreSample event type with the appropriate provider value:
- ElasticsearchCluster for clusters
- ElasticsearchNode for nodes
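As a rough illustration of the query step above, the following Python sketch assembles a NRQL query string against DatastoreSample for one of the two provider values. The helper name `build_nrql` and the attribute name `provider.CPUUtilization.Maximum` are assumptions made for this example and are not confirmed by this document; check the attribute names that actually appear on your DatastoreSample events before using it.

```python
# Minimal sketch: build a NRQL query string for the Elasticsearch integration.
# build_nrql is a hypothetical helper, not part of any New Relic library.

ALLOWED_PROVIDERS = {"ElasticsearchCluster", "ElasticsearchNode"}


def build_nrql(provider, attributes, since_minutes=60):
    """Return a NRQL query that averages the given attributes over time."""
    if provider not in ALLOWED_PROVIDERS:
        raise ValueError(f"unknown provider: {provider!r}")
    if not attributes:
        raise ValueError("at least one attribute is required")
    # Backticks keep dotted attribute names intact in the SELECT clause.
    select = ", ".join(f"average(`{name}`)" for name in attributes)
    return (
        f"SELECT {select} FROM DatastoreSample "
        f"WHERE provider = '{provider}' "
        f"SINCE {since_minutes} minutes ago TIMESERIES"
    )


# The attribute name below is an assumed example, not taken from this document.
query = build_nrql("ElasticsearchCluster", ["provider.CPUUtilization.Maximum"])
print(query)
```

The resulting string can be pasted into the query builder. Restricting `provider` to the two documented values catches typos before the query is run.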
Metric data
The Elasticsearch integration collects these metrics for clusters:
Name | Relevant statistics | Description |
---|---|---|
| Minimum, Maximum | Indicates that all index shards are allocated to nodes in the cluster. |
| Minimum, Maximum | Indicates that the primary shards for all indices are allocated to nodes in a cluster, but the replica shards for at least one index are not. Single node clusters always initialize with this cluster status because there is no second node to which a replica can be assigned. You can either increase your node count to obtain a green cluster status, or you can use the Amazon ES API to set the |
| Minimum, Maximum | Indicates that the primary and replica shards of at least one index are not allocated to nodes in a cluster. For more information, see Amazon's documentation on Red Cluster Status. |
| Minimum, Maximum, Average | The number of nodes in the Amazon ES cluster. |
| Minimum, Maximum, Average | The total number of searchable documents across all indices in the cluster. |
| Minimum, Maximum, Average | The total number of deleted documents across all indices in the cluster. |
| Minimum, Maximum, Average | The maximum percentage of CPU resources used for data nodes in the cluster. |
| Minimum | The free space, in megabytes, for all data nodes in the cluster. |
| Minimum, Maximum | The total used space, in megabytes, for a cluster. |
| Maximum | Indicates whether your cluster is accepting or blocking incoming write requests. A value of 0 means that the cluster is accepting requests. A value of 1 means that it is blocking requests. |
| Maximum | The maximum percentage of the Java heap used for all data nodes in the cluster. |
| Minimum, Maximum | The number of failed automated snapshots for the cluster. A value of 1 indicates that no automated snapshot was taken for the domain in the previous 36 hours. |
| Minimum | The remaining CPU credits available for data nodes in the cluster. A CPU credit provides the performance of a full CPU core for one minute. This metric is available only for the t2.micro.elasticsearch, t2.small.elasticsearch, and t2.medium.elasticsearch instance types. |
| Minimum | A health check for Kibana. A value of 1 indicates normal behavior. A value of 0 indicates that Kibana is inaccessible. In most cases, the health of Kibana mirrors the health of the cluster. |
| Minimum, Maximum | A value of 1 indicates that the KMS customer master key used to encrypt data at rest has been disabled. To restore the domain to normal operations, re-enable the key. |
| Minimum, Maximum | A value of 1 indicates that the KMS customer master key used to encrypt data at rest has been deleted or has had its grants to Amazon ES revoked. You can't recover domains that are in this state. If you have a manual snapshot, though, you can use it to migrate the domain's data to a new domain. |
| Sum | The number of HTTP requests made to the Elasticsearch cluster that included an invalid (or missing) host header. |
| Sum | The number of requests made to the Elasticsearch cluster. |
| Sum | The number of requests to a domain and the HTTP response code (2xx, 3xx, 4xx, 5xx) for each request. |
| Average | The maximum percentage of CPU resources used by the dedicated master nodes. We recommend increasing the size of the instance type when this metric reaches 60 percent. |
| Maximum | The maximum percentage of the Java heap used for all dedicated master nodes in the cluster. We recommend moving to a larger instance type when this metric reaches 85 percent. |
| Minimum | The remaining CPU credits available for dedicated master nodes in the cluster. A CPU credit provides the performance of a full CPU core for one minute. This metric is available only for the t2.micro.elasticsearch, t2.small.elasticsearch, and t2.medium.elasticsearch instance types. |
| Minimum | A health check for Failures mean that the master node stopped or is not reachable. They are usually the result of a network connectivity issue or AWS dependency problem. |
| Minimum, Maximum, Average | The latency, in seconds, for read operations on EBS volumes. |
| Minimum, Maximum, Average | The latency, in seconds, for write operations on EBS volumes. |
| Minimum, Maximum, Average | The throughput, in bytes per second, for read operations on EBS volumes. |
| Minimum, Maximum, Average | The throughput, in bytes per second, for write operations on EBS volumes. |
| Minimum, Maximum, Average | The number of pending input and output (I/O) requests for an EBS volume. |
| Minimum, Maximum, Average | The number of input and output (I/O) operations per second for read operations on EBS volumes. |
| Minimum, Maximum, Average | The number of input and output (I/O) operations per second for write operations on EBS volumes. |
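
Several descriptions in the table above state concrete values to watch for: a write-block flag of 1, a failed-snapshot value of 1, a Kibana health value of 0, 60 percent CPU and 85 percent Java heap on dedicated master nodes. The sketch below shows one way to turn those documented values into a simple check. The function name and the dictionary keys are hypothetical labels chosen for this example; they are not the integration's attribute names.

```python
# Minimal sketch: flag cluster conditions using the values quoted in the table.
# All dictionary keys are hypothetical; map them to your own attribute names.

MASTER_CPU_LIMIT = 60.0   # percent, from the dedicated master CPU description
MASTER_HEAP_LIMIT = 85.0  # percent, from the dedicated master Java heap description


def cluster_warnings(sample):
    """Return a list of human-readable warnings for one cluster sample."""
    warnings = []
    if sample.get("writes_blocked", 0) >= 1:
        warnings.append("cluster is blocking incoming write requests")
    if sample.get("snapshot_failure", 0) >= 1:
        warnings.append("no automated snapshot in the previous 36 hours")
    if sample.get("kibana_healthy", 1) < 1:
        warnings.append("Kibana is inaccessible")
    if sample.get("master_cpu_percent", 0.0) >= MASTER_CPU_LIMIT:
        warnings.append("dedicated master CPU at or above 60 percent")
    if sample.get("master_heap_percent", 0.0) >= MASTER_HEAP_LIMIT:
        warnings.append("dedicated master Java heap at or above 85 percent")
    return warnings


example = {"writes_blocked": 1, "master_cpu_percent": 72.5}
for message in cluster_warnings(example):
    print(message)
```

Missing keys default to the healthy value, so a partial sample produces warnings only for the metrics it actually contains.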
The following metrics are collected for Elasticsearch clusters, and optionally for each instance or node in a domain as well:
Name | Relevant statistics | Description |
---|---|---|
| For nodes: Average; for clusters: Average, Maximum | The average time, in milliseconds, that it takes a shard to complete an indexing operation. |
| For nodes: Average; for clusters: Average, Maximum, Sum | The number of indexing operations per minute. |
| For nodes: Average; for clusters: Average, Maximum | The average time, in milliseconds, that it takes a shard to complete a search operation. |
| For nodes: Average; for clusters: Average, Maximum, Sum | The total number of search requests per minute for all shards on a node. |
| Minimum, Maximum, Average | The percentage of the instance's memory that is in use. |
| For nodes: Maximum; for clusters: Sum, Maximum, Average | The number of times that "young generation" garbage collection has run. A large, ever-growing number of runs is a normal part of cluster operations. |
| For nodes: Maximum; for clusters: Sum, Maximum, Average | The amount of time, in milliseconds, that the cluster has spent performing "young generation" garbage collection. |
| For nodes: Maximum; for clusters: Sum, Maximum, Average | The number of times that "old generation" garbage collection has run. In a cluster with sufficient resources, this number should remain small and grow infrequently. |
| For nodes: Maximum; for clusters: Sum, Maximum, Average | The amount of time, in milliseconds, that the cluster has spent performing "old generation" garbage collection. |
| For nodes: Maximum; for clusters: Sum, Maximum, Average | The number of queued tasks in the force merge thread pool. If the queue size is consistently high, consider scaling your cluster. |
| For nodes: Maximum; for clusters: Sum | The number of rejected tasks in the force merge thread pool. If this number continually grows, consider scaling your cluster. |
| For nodes: Maximum; for clusters: Sum, Average | The size of the force merge thread pool. |
| For nodes: Maximum; for clusters: Sum, Maximum, Average | The number of queued tasks in the index thread pool. If the queue size is consistently high, consider scaling your cluster. The maximum index queue size is 200. |
| For nodes: Maximum; for clusters: Sum | The number of rejected tasks in the index thread pool. If this number continually grows, consider scaling your cluster. |
| For nodes: Maximum; for clusters: Sum, Average | The size of the index thread pool. |
| For nodes: Maximum; for clusters: Sum, Maximum, Average | The number of queued tasks in the search thread pool. If the queue size is consistently high, consider scaling your cluster. The maximum search queue size is 1000. |
| For nodes: Maximum; for clusters: Sum | The number of rejected tasks in the search thread pool. If this number continually grows, consider scaling your cluster. |
| For nodes: Maximum; for clusters: Sum, Average | The size of the search thread pool. |
| For nodes: Maximum; for clusters: Sum, Maximum, Average | The number of queued tasks in the bulk thread pool. If the queue size is consistently high, consider scaling your cluster. |
| For nodes: Maximum; for clusters: Sum | The number of rejected tasks in the bulk thread pool. If this number continually grows, consider scaling your cluster. |
| For nodes: Maximum; for clusters: Sum, Average | The size of the bulk thread pool. |