New Relic Prometheus OpenMetrics integration (Kubernetes)

The New Relic Prometheus OpenMetrics integration scrapes Prometheus endpoints and sends the data to New Relic. The collected metrics can then be visualized in the Insights UI.

In a Kubernetes environment, the integration auto-discovers Prometheus endpoints in the same way the Prometheus Kubernetes collector does: it looks for the prometheus.io/scrape annotation or label (see scrape_enabled_label in the configuration options to change this). You can add additional static endpoints in the configuration.

New Relic has contributed the Prometheus integration to the open source community under an Apache 2.0 license.

Requirements

This integration supports Prometheus protocol version 2 and Kubernetes versions 1.9 or higher. The integration was tested using Kubernetes 1.9, 1.11 and 1.13 on kops, GKE and minikube, respectively. The following limits apply:

  • 50 attributes per metric
  • 50k unique timeseries per day (A timeseries is a single, unique combination of a metric name and any tags/attributes.)
  • 100k data points per minute (contact New Relic for a higher limit)

The Prometheus OpenMetrics integration allows scraping of up to 50 endpoints. If you hit this limit in your cluster, you can set the SCRAPE_ENABLED_LABEL to newrelic.com/prometheus-scrape and apply that label to the pods you want to scrape.
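As a sketch of that opt-in approach, with SCRAPE_ENABLED_LABEL set to newrelic.com/prometheus-scrape, a pod could be labeled for scraping like this (the pod name and image below are placeholders, not part of the integration):

```yaml
# Illustrative only: a pod that opts in to scraping when
# SCRAPE_ENABLED_LABEL is "newrelic.com/prometheus-scrape".
apiVersion: v1
kind: Pod
metadata:
  name: my-app                               # hypothetical pod name
  labels:
    newrelic.com/prometheus-scrape: "true"   # opt this pod in
spec:
  containers:
    - name: my-app
      image: my-app:latest                   # placeholder image
```

Pods and services without this label would then be skipped, keeping you under the endpoint limit.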

Prometheus OpenMetrics - Kubernetes
Example workflow: Here is an example of the workflow using the New Relic Prometheus OpenMetrics integration for Kubernetes.

Install the integration

To prevent your data from being duplicated, configure the New Relic Prometheus OpenMetrics integration with only one replica. Running two or more replicas will cause duplicated data.

To install the New Relic Prometheus OpenMetrics integration in a Kubernetes environment:

  1. Download the integration manifest YAML file:

    curl -O https://download.newrelic.com/infrastructure_agent/integrations/kubernetes/nri-prometheus-latest.yaml
  2. Edit the nri-prometheus-latest.yaml manifest file, and add a cluster name to identify your Kubernetes cluster (required) and your New Relic license key (required).

    env:
      - name: LICENSE_KEY
        value: "<YOUR_LICENSE_KEY>"
    [...]
    config.yaml: |
      cluster_name: "<YOUR_CLUSTER_NAME>"
  3. Optional: Specify which metrics you want to include. This can be controlled precisely according to the prefixes of the metrics and the labels. See the sections "ignore metrics" and "include specific metrics" in this document for more details.

  4. Deploy the integration in your Kubernetes cluster:

    kubectl apply -f nri-prometheus-latest.yaml
  5. To confirm that the deployment has been created successfully, look at the CURRENT replicas in the results generated by this command:

    kubectl get deployments nri-prometheus
  6. To confirm that the integration has been configured correctly, wait a few minutes, then go to insights.newrelic.com, and run this NRQL query to see if data has been reported:

    FROM Metric SELECT count(*) WHERE clusterName = 'YOUR_CLUSTER_NAME' since 1 hour ago

Configuration options

License key

We recommend configuring the license key as an environment variable. This provides a more secure setup, as the environment variable can be loaded from a Kubernetes secret. The environment variable is named LICENSE_KEY and is required.

To find your license key: Go to Account dropdown > Account settings. The license key appears in the Account information section on the right side of the Summary page. For more information, see License key.
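As a sketch of the secret-based approach, the container's env section could load the key with a secretKeyRef (the secret name nri-prometheus-license and key license-key are hypothetical; use whatever names fit your secret management):

```yaml
# Illustrative sketch: load the required LICENSE_KEY environment
# variable from a Kubernetes secret instead of a literal value.
env:
  - name: LICENSE_KEY
    valueFrom:
      secretKeyRef:
        name: nri-prometheus-license   # hypothetical secret name
        key: license-key               # hypothetical key in the secret
```

Such a secret could be created with, for example, kubectl create secret generic nri-prometheus-license --from-literal=license-key=<YOUR_LICENSE_KEY>.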

General configuration

The nri-prometheus-latest.yaml manifest file includes the nri-prometheus-cfg config map showing an example configuration. It can be used to configure the following parameters:

Key name Description

cluster_name

Required.

The name of the Kubernetes cluster. This value will be included as the "clusterName" attribute for all the metrics.

verbose

Stringified boolean.

  • true (default): Logs debugging information.
  • false: Only logs error messages.

targets

Configuration of static endpoints to be scraped by the integration. It contains a list of objects. For more information about this structure, see Target configuration.

scrape_enabled_label

String. The integration checks whether the Kubernetes pod or service is annotated or labeled with this value to decide whether it should be scraped.

This is particularly useful when you want to limit the amount of data sent to New Relic (see the ignore metrics and include specific metrics sections for more options on filtering metrics). Since by default we use the same label Prometheus uses for discovering scrapable targets, most of the exporters that you install will have this label already set.

To keep fine-grained control over the targets you want the integration to scrape, you can set this option to some other value, like newrelic/scrape, and add the annotation/label newrelic/scrape: "true" to your Kubernetes objects. Annotations take precedence over labels if both are set.

Default: "prometheus.io/scrape"

scrape_duration

How often the scraper should run. Increasing this value lowers memory usage; decreasing it has the opposite effect.

The impact on memory usage is due to distributing the target fetching over the scrape interval to avoid querying (and buffering) all the data at once.

Default is 30s. Valid values include 1s, 15s, 30s, 1m, 5m, etc.

scrape_timeout

The HTTP client timeout when fetching data from endpoints.

Default is 5s. Valid values include 1s, 15s, 30s, 1m, 5m, etc.

require_scrape_enabled_label_for_nodes

Whether or not Kubernetes nodes need to be labeled to be scraped.

Default is true.

percentiles

Histogram support is based on New Relic's guidelines for higher-level metrics abstractions. To better support visualization of this data, percentiles are calculated based on the histogram metrics and sent to New Relic.

Defaults are: 50, 95, and 99.

emitter_proxy

Proxy to use by the integration when submitting metrics. It should be in the format [scheme]://[domain]:[port].

This proxy won't be used when fetching metrics from the targets.

By default it is empty, meaning that no proxy is used.

emitter_ca_file

Certificate to add to the root CA that the emitter will use when verifying server certificates.

If left empty, TLS uses the host's root CA set.

emitter_insecure_skip_verify

Whether the emitter should skip TLS verification when submitting data.

Defaults to false.
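Pulling several of the options above together, a config map excerpt might look like the following. This is an illustrative sketch, not the shipped defaults: only cluster_name is required, and the proxy URL is a placeholder.

```yaml
# Illustrative excerpt of the nri-prometheus-cfg config map data.
config.yaml: |
  cluster_name: "my-cluster"                     # required
  verbose: "false"                               # only log errors
  scrape_duration: "30s"
  scrape_timeout: "5s"
  scrape_enabled_label: "prometheus.io/scrape"
  require_scrape_enabled_label_for_nodes: true
  percentiles:
    - 50
    - 95
    - 99
  emitter_proxy: "http://proxy.example.com:8080" # hypothetical proxy
```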

Target discovery, specifying port and custom path

The integration automatically discovers which targets to scrape. You can use the prometheus.io/port and prometheus.io/path annotations/labels (annotations take precedence over labels) in your Kubernetes pods and services to specify the port and endpoint path that should be used when constructing the target.

If prometheus.io/port is not present, the integration tries to scrape each port defined for the service; in the case of pods, it tries every containerPort defined.

If prometheus.io/path is not present, the integration defaults to /metrics.

Example usage:

If you have a deployment in your cluster whose pods expose Prometheus metrics on port 8080 at the path my-metrics, set the labels prometheus.io/port: "8080" and prometheus.io/path: "my-metrics" in the PodSpec metadata of the deployment manifest. When the integration retrieves the metrics from your pods, it makes a request to http://<pod-ip>:8080/my-metrics.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
        prometheus.io/scrape: "true" 
        prometheus.io/port: "8080" 
        prometheus.io/path: "my-metrics"

Target configuration

The targets key in the configuration file contains a list of objects with the following structure:

Key name Description

description

A friendly description for the URLs in this target.

urls A list of strings with the URLs to be scraped.

tls_config

Optional.

Authentication configuration used to send requests to all the configured URLs in this target. It supports TLS and mutual TLS. For more information, see Mutual TLS authentication.
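For instance, a static target without authentication might be declared like this (the description and URL are placeholders):

```yaml
# Illustrative static target entry in the integration's config file.
targets:
  - description: "Self-hosted node exporter"        # hypothetical target
    urls: ["http://192.168.1.10:9100/metrics"]      # placeholder URL
```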

Configuration reload

To reload the configuration, you have to restart the integration. We recommend doing this by scaling the deployment down to 0 replicas and then back up to 1:

$ kubectl scale deployment nri-prometheus --replicas=0

$ kubectl scale deployment nri-prometheus --replicas=1

Currently this integration does not automatically reload the configuration when the configuration file changes.

View and query your metrics

The collected metrics can be queried using NRQL in the Insights UI. All metrics are stored in the Metric type.

By default, the following attributes will be added to all the metrics:

  • integrationName: The name of this integration (nri-prometheus).
  • integrationVersion: The version of the integration; for example, 0.2.0.
  • scrapedEndpoint: The URL of the endpoint being scraped.
  • metricName: The name of the metric itself.
  • nrMetricType: The New Relic metric type that the metric is mapped to; for example, gauge.
  • promMetricType: The metric type of the Prometheus metric.

If the scraper is running in Kubernetes, the following attributes will be added to all the metrics:

  • clusterName: The name of the cluster provided in the scraper configuration.
  • namespaceName: Name of the namespace.
  • nodeName: Name of the node where the pod we are scraping is running, if applicable.
  • podName: Name of the pod being scraped, if applicable.
  • deploymentName: Name of the deployment, if scraping a pod.
  • serviceName: Name of the service being scraped, if applicable.
  • The Kubernetes labels of the object we are scraping, prefixed by "label".

When you build queries, be aware that there is no linking between the metrics, entities, and attributes. Use the following NRQL queries to find out which metrics are available and which attributes are present on these metrics:

Get metric names

To get all metric names:

FROM Metric SELECT uniques(metricName)

To get metric names for a specific cluster, namespace, or pod:

FROM Metric SELECT uniques(metricName) WHERE clusterName='<cn>'
FROM Metric SELECT uniques(metricName) WHERE namespaceName='<ns>' 
FROM Metric SELECT uniques(metricName) WHERE podName='<pod>'
Get the attributes for a metric

To get all attributes for the selected metric:

FROM Metric SELECT keyset() WHERE metricName='<mn>'
Get the values for an attribute

The autocomplete will show all values of the attribute, regardless of the pod. To determine the attribute values for a specific pod:

FROM Metric SELECT uniques(<attribute>) WHERE metricName='<mn>' AND podName='<pod>'

Histograms and summaries

Histogram and summary support was added in version 1.2.0 of the New Relic Prometheus OpenMetrics integration. It is based on New Relic's guidelines for higher-level metrics abstractions.

For summary types, quantiles are transformed into percentiles. A metric <basename>{quantile="0.3"} will be sent to New Relic as <basename>.percentiles and will have the dimension {percentile="30"}.

For histograms, a bucket <basename>_bucket{le="42"} will be sent as <basename>_buckets and will have the dimension {histogram.bucket.upperBound="42"}.

To better support visualization of histograms, percentiles are calculated based on the histogram metrics and sent to New Relic. The calculated percentiles can be configured using the percentiles configuration option.

Build the query

Using the metric name and attributes retrieved above, you can now query your data. See the NRQL documentation for more information about facets, timeseries, and time selection.

Get metric values

To get raw metric values:

FROM Metric SELECT <metricname> WHERE <attribute>='<value>'
Get a graph of the metric

To get a graph of the metric (possible aggregators are average, min, max, sum):

FROM Metric SELECT <aggregator>(<metricname>) WHERE <attribute>='<value>' TIMESERIES
Example: Average memory usage for pods in a deployment

To view average memory usage for all pods in a deployment:

FROM Metric SELECT average(container_memory_usage_bytes) WHERE deploymentName='my-app-deployment' AND namespaceName='default'
Example: Connected Redis clients per pod in namespace

This example assumes that you have Redis pods with the Redis exporter installed. To view the number of connected Redis clients per pod in the default namespace:

FROM Metric SELECT latest(redis_connected_clients) WHERE namespaceName='default' FACET podName TIMESERIES
Query counter metrics

Currently the integration calculates deltas for counter metrics. This is why queries on counter metrics show the deltas of the counter instead of its absolute value.

Troubleshooting

Here are some troubleshooting tips when using the Prometheus OpenMetrics integration for Kubernetes.

Basic troubleshooting

If you are having problems with the integration:

  1. Check if the Prometheus OpenMetrics integration is running:

    kubectl describe pod -l "app=nri-prometheus"
  2. Check the Ready field for the pod.
  3. If the pod is not ready, check the Events.
No data appears in Insights

If no data appears in the Insights UI:

  1. Run this command:

    kubectl logs deploy/nri-prometheus | grep "error emitting metrics"
  2. Check whether the log contains this:

    metrics api responded with status code 403

    If yes, check the LICENSE_KEY in your nri-prometheus-latest.yaml manifest file.

Review rate limit errors (NrIntegrationError)

Basic validation of the metrics is performed by the New Relic platform at submission time. More extensive validation is performed asynchronously when the platform processes the metrics. Errors found during this asynchronous validation are put into an NrIntegrationError event in your New Relic account.

If you exceed the limits defined by the platform, New Relic will apply rate limits to your account and create an associated NrIntegrationError event. To examine the errors in Insights, use the following NRQL query:

FROM NrIntegrationError SELECT * WHERE newRelicFeature = 'Metric API'
Identify scraped endpoints

To get the list of scraped endpoints:

kubectl logs deploy/nri-prometheus | grep notification
Get integration logs

To get the logs of the Prometheus OpenMetrics integration:

kubectl logs deploy/nri-prometheus
kubectl logs deploy/nri-prometheus | grep -v "level=debug"
Get integration metrics

By default the scraper will also scrape its own metrics. The following metrics can give some insight into what is happening inside the scraper:

  • nr_stats_targets_total: The total number of scraped endpoints.
  • nr_stats_targets_failed_fetch: The number of endpoints where scraping is failing.
  • nr_stats_metrics_total: The total number of scraped metrics.
  • nr_stats_metrics_total_timeseries: The total number of scraped time series.
View metric data sent to Metric API

To see the exact data that is being sent to the Metric API, set the EMITTERS environment variable to "api,stdout".
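In the manifest, that corresponds to an entry in the container's env section, for example:

```yaml
# Emit every payload to stdout in addition to sending it to the
# Metric API, for debugging.
env:
  - name: EMITTERS
    value: "api,stdout"
```

The payloads then appear in the integration logs, which you can inspect with kubectl logs deploy/nri-prometheus.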

Update the integration

To update the integration, follow standard installation procedures, and reapply the nri-prometheus-latest.yaml manifest file.

The integration logs its version when it starts up. To determine the running version:

kubectl logs deploy/nri-prometheus | grep "Integration version"

Example output:

time="2019-02-26T09:21:21Z" level=info msg="Starting New Relic's Prometheus OpenMetrics Integration version 1.0.0"

Scaling the integration

Recommendation: Always run the scraper with 1 replica. Adding more replicas will result in duplicated data.

If the CPU and memory limits are not sufficient, this can result in restarts and gaps in the data.

To check the status and restart events for the scraper:

kubectl describe pod -l "app=nri-prometheus"

When running this integration at 500k data points per minute, set the CPU limit to 1 core and the memory limit to 1 GB.

Metric transformations

The New Relic Prometheus OpenMetrics integration provides a few controls to transform the Prometheus metrics before sending them to New Relic. You can define these transformations in the integration's configuration file. The transformations are applied to all endpoints.

The nri-prometheus-latest.yaml manifest file includes the nri-prometheus-cfg config map showing an example configuration. The transformations are executed in the following order:

  1. Ignore metrics.
  2. Add attributes.
  3. Rename attributes.
  4. Copy attributes.

Rename attributes

Not all Prometheus endpoints have consistent naming. You can rename the attributes as needed. For example:

Example: Configuration

To rename the table attribute to tableName and the under_score attribute to CamelCase for metrics that start with mysql_:

rename_attributes:
  - metric_prefix: "mysql_" 
    attributes:
      table: "tableName"
      under_score: "CamelCase"

Example: Input

mysql_info_schema_table_rows{schema="sys",table="host_summary"} 123
another_metric{table="first"} 800

Example: Output

mysql_info_schema_table_rows{schema="sys",tableName="host_summary"} 123
another_metric{table="first"} 800

Copy attributes

Some Prometheus endpoints provide an _info or _static metric containing metadata about the service, such as the version. It can be helpful to have this attribute on all metrics for that service. This transformation allows you to copy attributes from a source metric to a set of target metrics.

You can only copy attributes between metrics in the same endpoint.

Example: Configuration

To copy the innodb_version and version attributes from the mysql_version_info metric to all metrics starting with mysql_:

copy_attributes:
  - from_metric: "mysql_version_info"
    to_metrics:
      - "mysql_" 
    attributes: 
      - "innodb_version"
      - "version"

Example: Input

# HELP mysql_version_info MySQL version and distribution.
mysql_version_info{innodb_version="5.7.14",version="5.7.14",version_comment="MySQL Community Server (GPL)"} 1
# HELP mysql_global_variables_slave_transaction_retries Generic gauge metric from SHOW GLOBAL VARIABLES.
mysql_global_variables_slave_transaction_retries 10

Example: Output

mysql_global_variables_slave_transaction_retries{innodb_version="5.7.14",version="5.7.14"} 10

Ignore metrics

You should try to avoid sending data that's not relevant to your monitoring needs. If you set the integration to scrape all the available targets and to send all the data those targets expose, you might reach the platform limits (listed in the requirements section) or increase your bill.

Our recommended approach is to use the integration filtering capabilities to control the amount of data sent. Explore your data and refine your filters to scrape only relevant targets and send useful metrics.

Here we'll explain how to use the ignore_metrics option to filter out unwanted metrics from a target. See the scrape_enabled_label configuration option if you want to filter the targets instead of the metrics.

To ignore unwanted metrics, use the following transformation.

Example: Configuration

To drop all metrics that start with go_ or process_:

ignore_metrics:
  - prefixes:
    - "go_"
    - "process_"

Example: Input

go_goroutines 13
process_virtual_memory_bytes 2.062336e+07
mysql_global_status_commands_total{command="ha_close"} 0
mysql_global_status_commands_total{command="ha_open"} 0

This is taken from the MySQL exporter. Besides the MySQL metrics, it also exposes metrics about the exporter that might not be of interest to you.

Example: Output

mysql_global_status_commands_total{command="ha_close"} 0 
mysql_global_status_commands_total{command="ha_open"} 0 

Include only specific metrics

If you only want to include specific metrics, you can use the except list under the ignore_metrics section. As the name implies, this ignores all the metrics except the ones that start with the given prefixes.

Example: Configuration

To drop all metrics except kube_hpa_:

ignore_metrics:
  - except:
    - kube_hpa_

Transformation configuration

This is the full configuration file containing all of these examples:

transformations:
 - description: "Transformation for MySQL exporter"
   rename_attributes:
     - metric_prefix: "mysql_"
       attributes:
         table: "tableName"
         under_score: "CamelCase"
   copy_attributes:
     - from_metric: "mysql_version_info"
       to_metrics:
         - "mysql_"
       attributes:
         - "innodb_version"
         - "version"
   ignore_metrics:
     - prefixes:
       - "go_"
       - "process_"

Mutual TLS authentication

The New Relic Prometheus integration provides the ability to configure mutual TLS authentication for the endpoints that require this feature. This can be configured in the integration's config file.

Recommendation: Put the CA bundle and the key and cert files in a secret, and mount them in the Prometheus metrics integration's container.

Mutual TLS authentication is limited to a static list of URLs. To configure endpoints that require mutual TLS authentication, follow this example:

targets:
- description: "Secure etcd example"
  urls: ["https://192.168.3.1:2379", "https://192.168.3.2:2379"]
  tls_config:
    ca_file_path: "/etc/etcd/etcd-client-ca.crt"
    cert_file_path: "/etc/etcd/etcd-client.crt"
    key_file_path: "/etc/etcd/etcd-client.key"
transformations:
  ...
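Following the recommendation above, the certificate files referenced by tls_config could be provided via a secret mounted into the integration's container. This is a sketch only; the secret name etcd-client-certs and mount path /etc/etcd are hypothetical and should match the paths in your tls_config:

```yaml
# Illustrative pod spec fragment: mount a secret holding the CA bundle,
# client certificate, and key at the paths used by tls_config.
volumes:
  - name: etcd-client-certs
    secret:
      secretName: etcd-client-certs    # hypothetical secret name
containers:
  - name: nri-prometheus
    volumeMounts:
      - name: etcd-client-certs
        mountPath: /etc/etcd           # matches the tls_config paths above
        readOnly: true
```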

Specify the path and port for an endpoint

The Prometheus OpenMetrics integration takes the prometheus.io/path and prometheus.io/port annotations/labels into account when auto-discovering endpoints.

If a service does not expose its metrics on the default /metrics path, add the label prometheus.io/path=/foo/bar to the pod.

Uninstall

To uninstall the integration, execute the following command:

kubectl delete -f nri-prometheus-latest.yaml

For more help

Recommendations for learning more:

  • Browse New Relic's Explorers Hub for community discussions about New Relic's APIs.
  • Use your preferred search engine to find other New Relic resources.