You are getting error messages for the New Relic Kubernetes integration in your New Relic Infrastructure logs.
Below are some solutions to the most common Kubernetes integration errors. These errors will show up in the standard (non-verbose) Infrastructure agent logs. If you need more detailed logs (for example, when working with New Relic support), see Kubernetes logging.
- Invalid New Relic license
If the license key you are using is invalid, you will see an error like this in the logs:
2018-04-09T14:20:17.750893186Z time="2018-04-09T14:20:17Z" level=error msg="metric sender can't process 0 times" error="InventoryIngest: events were not accepted: 401 401 Unauthorized Invalid license key."
To resolve this problem, make sure you specify a valid license key. The key should be surrounded by quotes and have no leading or trailing spaces. Example:
- name: "NRIA_LICENSE_KEY"
  value: "1234567890abcdefghijklmnopqrstuvwxyz1234"
- Error sending events
If the agent is not able to connect to New Relic servers you will see an error like the following in the logs:
2018-04-09T18:16:35.497195185Z time="2018-04-09T18:16:35Z" level=error msg="metric sender can't process 1 times" error="Error sending events: Post https://staging-infra-api.newrelic.com/metrics/events/bulk: net/http: request canceled (Client.Timeout exceeded while awaiting headers)"
Depending on the exact nature of the error, the message in the logs may differ.
To address this problem, see the New Relic networks documentation.
- Failed to discover kube-state-metrics
The Kubernetes integration requires kube-state-metrics. If it is not found, you will see an error like the following in the newrelic-infra container logs:
2018-04-11T08:02:41.765236022Z time="2018-04-11T08:02:41Z" level=error msg="executing data source" data prefix="integration/com.newrelic.kubernetes" error="exit status 1" plugin name=nri-kubernetes stderr="time=\"2018-04-11T08:02:41Z\" level=fatal msg=\"failed to discover kube-state-metrics endpoint, got error: no service found by label k8s-app=kube-state-metrics\"\n"
Common reasons for this error include:
- kube-state-metrics has not been deployed into the cluster.
- kube-state-metrics is deployed using a custom deployment.
- There are multiple versions of kube-state-metrics running and the Kubernetes integration is not finding the correct one.
The Kubernetes integration automatically discovers kube-state-metrics in your cluster using this logic:
- It looks for a kube-state-metrics service running in the kube-system namespace.
- If that is not found, it looks for a service tagged with the label k8s-app: kube-state-metrics.
The integration also requires the kube-state-metrics pod to have the label k8s-app: kube-state-metrics. If neither of those is found, there will be a log entry like the following:
2018-04-11T09:25:00.825532798Z time="2018-04-11T09:25:00Z" level=error msg="executing data source" data prefix="integration/com.newrelic.kubernetes" error="exit status 1" plugin name=nri-kubernetes stderr="time=\"2018-04-11T09:25:00Z\" level=fatal msg=\"failed to discover nodeIP with kube-state-metrics, got error: no pod found by label k8s-app=kube-state-metrics\"\n
To solve this issue, add the k8s-app=kube-state-metrics label to the kube-state-metrics pod.
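For example, if kube-state-metrics runs as a Deployment, the label goes in the pod template. A minimal sketch (the Deployment name and namespace are assumptions; adjust them to your cluster):

```yaml
# Fragment of a kube-state-metrics Deployment manifest (names assumed).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kube-state-metrics
  namespace: kube-system
spec:
  selector:
    matchLabels:
      k8s-app: kube-state-metrics
  template:
    metadata:
      labels:
        # The label the integration's discovery looks for
        k8s-app: kube-state-metrics
```

Labels set in spec.template.metadata.labels are applied to the pods the Deployment creates, which is where the integration looks for them.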
- Missing metrics for Namespaces, Deployments, and ReplicaSets
If metrics for Kubernetes nodes, pods, and containers are showing but metrics for namespaces, deployments, and ReplicaSets are missing, the Kubernetes integration is not able to connect to kube-state-metrics.
- Indicators of missing data
Indicators of missing namespace, deployment, and ReplicaSet data:
- In the # of K8s objects chart, that data is missing.
- Queries on K8sNamespaceSample, K8sDeploymentSample, and K8sReplicasetSample don't show any data.
There are a few possible reasons for this:
- The kube-state-metrics service has been customized to listen on port 80. If that is the case, you may see an error like the following in the newrelic-infra container logs:
time="2018-04-04T09:35:47Z" level=error msg="executing data source" data prefix="integration/com.newrelic.kubernetes" error="exit status 1" plugin name=nri-kubernetes stderr="time=\"2018-04-04T09:35:47Z\" level=fatal msg=\"Non-recoverable error group: error querying KSM. Get http://kube-state-metrics.kube-system.svc.cluster.local:0/metrics: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)\"\n"
- This is a known problem that happens in some clusters where it takes too much time for kube-state-metrics to collect all the cluster's information before sending it to the integration.
As a workaround, increase the kube-state-metrics client timeout.
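A sketch of that workaround, assuming your manifest exposes the client timeout through a TIMEOUT environment variable in milliseconds, as some newrelic-infrastructure manifests do; check your own manifest or chart for the exact setting name before applying:

```yaml
# In the newrelic-infra DaemonSet container spec (env var name is an
# assumption; verify it against your manifest).
env:
  - name: "TIMEOUT"
    value: "10000"  # raise this value if KSM queries time out
```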
- The kube-state-metrics instance is running behind kube-rbac-proxy. New Relic does not currently support this configuration. You may see an error like the following in the newrelic-infra container logs:
time="2018-03-28T23:09:12Z" level=error msg="executing data source" data prefix="integration/com.newrelic.kubernetes" error="exit status 1" plugin name=nri-kubernetes stderr="time=\"2018-03-28T23:09:12Z\" level=fatal msg=\"Non-recoverable error group: error querying KSM. Get http://192.168.132.37:8443/metrics: net/http: HTTP/1.x transport connection broken: malformed HTTP response \\\"\\\\x15\\\\x03\\\\x01\\\\x00\\\\x02\\\\x02\\\"\"\n"
- The KSM payload is quite large, and the Kubernetes integration process handling the data is being OOM-killed. Since the integration is not the main process of the container, the pod is not restarted. This situation can be spotted in the logs of the newrelic-infra pod running on the same node as KSM:
time="2020-12-10T17:40:44Z" level=error msg="Integration command failed" error="signal: killed" instance=nri-kubernetes integration=com.newrelic.kubernetes
As a workaround, increase the DaemonSet memory limits so the process is not killed.
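A sketch of that workaround in the newrelic-infra DaemonSet container spec. The values shown are illustrative assumptions, not recommendations; size them to your cluster:

```yaml
# Fragment of the newrelic-infra DaemonSet container spec.
resources:
  requests:
    cpu: 100m
    memory: 150Mi
  limits:
    memory: 300Mi   # raise this value if the integration is OOM-killed
```

Kubernetes OOM-kills any process in the container that pushes memory usage past limits.memory, so the limit must accommodate the integration's peak while processing the KSM payload.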
- Cannot list pods at the cluster scope
The New Relic pods and the newrelic service account are not deployed in the same namespace. This is usually because the current context specifies a namespace. If this is the case, you will see an error like the following:
time="2018-05-31T10:55:39Z" level=panic msg="pods is forbidden: User \"system:serviceaccount:kube-system:newrelic\" cannot list pods at the cluster scope"
To check whether this is the case, run:
kubectl describe serviceaccount newrelic | grep Namespace
kubectl get pods -l name=newrelic-infra --all-namespaces
kubectl config get-contexts
To resolve this problem, change the namespace for the service account in the New Relic DaemonSet YAML file to be the same as the namespace for the current context:
- kind: ServiceAccount
  name: newrelic
  namespace: default
---