If you’re a developer running apps in Kubernetes, you can use New Relic to understand how Kubernetes infrastructure affects your OpenTelemetry-instrumented applications.
After completing the steps below, you can use the New Relic UI to correlate application-level metrics from OpenTelemetry with Kubernetes infrastructure metrics. This will help you view the entire landscape of your telemetry data and work across teams to get faster mean time to resolution (MTTR) for issues in your Kubernetes environment.
The steps in this guide enable your application to inject infrastructure-specific metadata into the telemetry data. The result is that the New Relic UI is populated with actionable information. Here are the steps you'll take to get this going:
- In each application container, define an environment variable to send telemetry data to the collector
- Deploy the OpenTelemetry Collector as a DaemonSet in agent mode with the k8sattributes processor to inject relevant metadata (cluster, deployment, and namespace names)
To be successful with the steps below, you should already be familiar with OpenTelemetry and Kubernetes, and have done the following:
- Created the following environment variables:
  - OTEL_EXPORTER_OTLP_ENDPOINT (New Relic endpoint for your region or purpose)
- Installed the New Relic Kubernetes integration in your cluster
- Instrumented your applications with OpenTelemetry, and successfully sent data to New Relic via OpenTelemetry Protocol (OTLP)
If you have general questions about using collectors with New Relic, see our Introduction to OpenTelemetry Collector with New Relic.
To set this up, you need to add a custom snippet to the env stanza of your Kubernetes YAML file. The example below shows the snippet for a sample frontend microservice (Frontend.yaml). The snippet includes two sections that do the following:
- Section 1: Ensure that the telemetry data is sent to the collector. This sets the environment variable OTEL_EXPORTER_OTLP_ENDPOINT with the host IP, which it pulls by calling the downward API.
- Section 2: Attach infrastructure-specific metadata. To do this, we capture metadata.uid using the downward API and add it to the OTEL_RESOURCE_ATTRIBUTES environment variable. The OpenTelemetry Collector's k8sattributes processor uses this environment variable to add infrastructure-specific context to telemetry data.
For each microservice instrumented with OpenTelemetry, add the env entries below to your manifest's env stanza:
    # Frontend.yaml
    apiVersion: apps/v1
    kind: Deployment
    ...
    spec:
      containers:
        - name: yourfrontendservice
          image: yourfrontendservice-beta
          env:
            # Section 1: Ensure that telemetry data is sent to the collector
            - name: HOST_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.hostIP
            # This is picked up by the OpenTelemetry SDKs.
            # The port must match the port the collector's OTLP receiver listens on.
            - name: OTEL_EXPORTER_OTLP_ENDPOINT
              value: "http://$(HOST_IP):55680"
            # Section 2: Attach infrastructure-specific metadata
            # Get the pod name and UID so that k8sattributes can tag resources
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: POD_UID
              valueFrom:
                fieldRef:
                  fieldPath: metadata.uid
            # This is picked up by the resource detector
            - name: OTEL_RESOURCE_ATTRIBUTES
              value: service.instance.id=$(POD_NAME),k8s.pod.uid=$(POD_UID)
We recommend you deploy the collector as an agent on every node within a Kubernetes cluster. The agent receives telemetry data and enriches it with metadata. For example, the collector can add custom attributes or infrastructure information through processors, and it can handle batching, retry, compression, and other advanced features that are handled less efficiently at the client instrumentation level.
For help configuring the collector, see the sample collector configuration file below, along with the sections about setting up these options:
- OTLP exporter
- batch processor
- resourcedetection processor
- k8sattributes processor: general
- k8sattributes processor: RBAC
- k8sattributes processor: filters
    receivers:
      otlp:
        protocols:
          grpc:
    processors:
      batch:
      # Referenced by the metrics pipeline below, so it must be declared here
      cumulativetodelta:
      resource:
        attributes:
          - key: host.id
            from_attribute: host.name
            action: upsert
      resourcedetection:
        detectors: [ gke, gce ]
      k8sattributes:
        auth_type: "serviceAccount"
        passthrough: false
        filter:
          node_from_env_var: KUBE_NODE_NAME
        extract:
          metadata:
            - k8s.pod.name
            - k8s.pod.uid
            - k8s.deployment.name
            - k8s.cluster.name
            - k8s.namespace.name
            - k8s.node.name
            - k8s.pod.start_time
        pod_association:
          - from: resource_attribute
            name: k8s.pod.uid
    exporters:
      otlp:
        endpoint: $OTEL_EXPORTER_OTLP_ENDPOINT
        headers:
          api-key: $NEW_RELIC_API_KEY
      logging:
        logLevel: DEBUG
    service:
      pipelines:
        metrics:
          receivers: [ otlp ]
          processors: [ resourcedetection, k8sattributes, resource, cumulativetodelta, batch ]
          exporters: [ otlp ]
        traces:
          receivers: [ otlp ]
          processors: [ resourcedetection, k8sattributes, resource, batch ]
          exporters: [ otlp ]
        logs:
          receivers: [ otlp ]
          processors: [ resourcedetection, k8sattributes, resource, batch ]
          exporters: [ otlp ]
First, add an OTLP exporter to your OpenTelemetry Collector configuration YAML file, with your New Relic API key passed as the api-key header.
    exporters:
      otlp:
        endpoint: $OTEL_EXPORTER_OTLP_ENDPOINT
        headers:
          api-key: $NEW_RELIC_API_KEY
The batch processor accepts spans, metrics, or logs, and places them into batches to make it easier to compress the data and reduce the number of outgoing requests from the collector.
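As a rough sketch of how batching can be tuned, the fragment below sets the batch processor's three main options explicitly. The values are illustrative assumptions, not recommendations from this guide; an empty batch: entry, as in the sample configuration, uses the collector's defaults.

```yaml
processors:
  batch:
    # Flush a batch after this much time, even if it is not full
    # (illustrative value; the collector default is 200ms)
    timeout: 5s
    # Flush as soon as this many spans, metric data points, or log records
    # are queued (illustrative value; the collector default is 8192)
    send_batch_size: 1000
    # Upper bound on the size of a single batch; 0 means no limit.
    # Must be greater than or equal to send_batch_size.
    send_batch_max_size: 2000
```

Larger batches generally compress better and mean fewer outgoing requests, at the cost of slightly delayed delivery and more memory held in the collector.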
The resourcedetection processor gets host-specific information to add context to the telemetry data being processed through the collector. In this example, we use the Google Kubernetes Engine (GKE) and Google Compute Engine (GCE) detectors to get Google Cloud-specific metadata, including:
- cloud.provider ("gcp")
- cloud.platform ("gcp_compute_engine")
    processors:
      resourcedetection:
        detectors: [ gke, gce ]
When we run the k8sattributes processor as part of the OpenTelemetry Collector running as an agent, it detects the IP addresses of pods sending telemetry data to the OpenTelemetry Collector agent, using them to extract pod metadata. Below is a basic Kubernetes manifest example with only a processors section. To deploy the OpenTelemetry Collector as a DaemonSet, read this comprehensive manifest example.
    processors:
      k8sattributes:
        auth_type: "serviceAccount"
        passthrough: false
        filter:
          node_from_env_var: KUBE_NODE_NAME
        extract:
          metadata:
            - k8s.pod.name
            - k8s.pod.uid
            - k8s.deployment.name
            - k8s.cluster.name
            - k8s.namespace.name
            - k8s.node.name
            - k8s.pod.start_time
        pod_association:
          - from: resource_attribute
            name: k8s.pod.uid
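For orientation, here is a minimal sketch of what a DaemonSet for the collector agent might look like. It is not the comprehensive manifest referred to above. Every name in it is a hypothetical placeholder: otel-collector-agent, the otel-collector ServiceAccount, the otel-collector-config ConfigMap, and the newrelic-otlp Secret. The image and the port number are also assumptions to adapt to your environment.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: otel-collector-agent            # hypothetical name
spec:
  selector:
    matchLabels:
      app: otel-collector-agent
  template:
    metadata:
      labels:
        app: otel-collector-agent
    spec:
      # Assumed ServiceAccount that holds the RBAC permissions k8sattributes needs
      serviceAccountName: otel-collector
      containers:
        - name: otel-collector
          # Assumption: the contrib distribution, which bundles k8sattributes.
          # Pin a specific tag in practice.
          image: otel/opentelemetry-collector-contrib
          args: ["--config=/conf/otel-collector-config.yaml"]
          ports:
            # Assumed port. It must match both the OTLP receiver's listen port
            # and the port in each application's OTEL_EXPORTER_OTLP_ENDPOINT.
            # hostPort makes the agent reachable at the node's IP.
            - containerPort: 4317
              hostPort: 4317
              protocol: TCP
          env:
            # Node name used by the k8sattributes discovery filter
            - name: KUBE_NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            # Values consumed by the otlp exporter in the collector config,
            # read from a hypothetical Secret
            - name: OTEL_EXPORTER_OTLP_ENDPOINT
              valueFrom:
                secretKeyRef:
                  name: newrelic-otlp
                  key: endpoint
            - name: NEW_RELIC_API_KEY
              valueFrom:
                secretKeyRef:
                  name: newrelic-otlp
                  key: api-key
          volumeMounts:
            - name: config
              mountPath: /conf
      volumes:
        - name: config
          configMap:
            name: otel-collector-config  # hypothetical ConfigMap holding the collector YAML
            items:
              - key: otel-collector-config.yaml
                path: otel-collector-config.yaml
```

Running one agent pod per node is what lets each application reach its local collector through the host IP obtained from the downward API.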
You need to add configurations for role-based access control (RBAC). The k8sattributes processor needs get, watch, and list permissions for the pods and namespaces resources included in the configured filters. See this example of how to configure RBAC with a ClusterRole that gives a ServiceAccount the necessary permissions for all pods and namespaces in the cluster.
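The RBAC setup described above could be sketched roughly as follows. The resource name otel-collector and the default namespace are hypothetical placeholders. The verbs shown (get, watch, list) are the ones the k8sattributes processor's documentation asks for on pods and namespaces. Some processor versions may also need access to additional resources (for example, replicasets to resolve deployment names), so check the documentation for the version you run.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: otel-collector          # hypothetical name
  namespace: default            # assumption: adjust to the collector's namespace
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: otel-collector
rules:
  # Core API group ("") holds pods and namespaces
  - apiGroups: [""]
    resources: ["pods", "namespaces"]
    verbs: ["get", "watch", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: otel-collector
subjects:
  - kind: ServiceAccount
    name: otel-collector
    namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: otel-collector
```

The collector pods then need to run under this ServiceAccount (via serviceAccountName in the pod spec) for the processor's auth_type of "serviceAccount" to pick up these permissions.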
When running the collector as an agent, you should apply a discovery filter so that the processor only discovers pods from the same host that it is running on. If you don’t use a filter, resource usage can be unnecessarily high, especially on very large clusters. Once the filter is applied, each processor will only query the Kubernetes API for pods running on its own node.
To set the filter, use the downward API to inject the node name as an environment variable in the pod env section of the OpenTelemetry Collector agent configuration YAML file (see GitHub for an example). This will inject a new environment variable to the OpenTelemetry Collector agent's container. The value will be the name of the node the pod was scheduled to run on.
    spec:
      containers:
        - env:
            - name: KUBE_NODE_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: spec.nodeName
Then, you can filter by the node with the k8sattributes processor's node_from_env_var filter option.
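As a sketch, the filter portion of the processor configuration, taken from the sample collector configuration earlier on this page, looks like this. KUBE_NODE_NAME is the environment variable injected through the downward API in the step above.

```yaml
processors:
  k8sattributes:
    filter:
      # Only discover pods scheduled on the node named by this environment variable
      node_from_env_var: KUBE_NODE_NAME
```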
If you have successfully linked your OpenTelemetry data with your Kubernetes data, you should be able to see Kubernetes attributes like k8s.deployment.name in your spans within the distributed tracing UI.
Now that you've connected your OpenTelemetry-instrumented apps with Kubernetes, check out our best practices guide for tips to improve your use of OpenTelemetry and New Relic.
You can also check out this blog post, Correlate OpenTelemetry traces, metrics, and logs with Kubernetes performance data for more information on the steps provided above.