You are getting error messages for the New Relic Kubernetes integration in your New Relic infrastructure logs.
Below are some solutions to the most common Kubernetes integration errors. These errors will show up in the standard (non-verbose) Infrastructure agent logs. If you need more detailed logs (for example, when working with New Relic support), see Kubernetes logging.
If the license key you are using is invalid, you will see an error like this in the logs:
2018-04-09T14:20:17.750893186Z time="2018-04-09T14:20:17Z" level=error msg="metric sender can't process 0 times" error="InventoryIngest: events were not accepted: 401 401 Unauthorized Invalid license key."
To resolve this problem, make sure you specify a valid license key. The key should be enclosed in quotes, with no leading or trailing spaces. Example:
- name: "NRIA_LICENSE_KEY"
  value: "1234567890abcdefghijklmnopqrstuvwxyz1234"
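Stray whitespace around the key is easy to miss by eye. As a minimal sketch, the following shell function reports whether a value has leading or trailing whitespace before you paste it into the manifest. The name check_license_key is hypothetical and not part of any New Relic tooling, and the function does not verify that the key is valid for your account.

```shell
# Hypothetical helper (not part of any New Relic tooling): reports whether a
# license key value is empty or has leading/trailing whitespace.
check_license_key() {
  case $1 in
    "")
      echo "empty" ;;
    [[:space:]]*|*[[:space:]])
      echo "has surrounding whitespace" ;;
    *)
      echo "ok" ;;
  esac
}
```

Calling check_license_key with the value you intend to use prints ok, empty, or has surrounding whitespace.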
If the agent is not able to connect to New Relic servers you will see an error like the following in the logs:
2018-04-09T18:16:35.497195185Z time="2018-04-09T18:16:35Z" level=error msg="metric sender can't process 1 times" error="Error sending events: Post https://staging-infra-api.newrelic.com/metrics/events/bulk: net/http: request canceled (Client.Timeout exceeded while awaiting headers)"
Depending on the exact nature of the error, the message in the logs may differ.
To address this problem, see the New Relic networks documentation.
The Kubernetes integration requires kube-state-metrics. If that is not found, you will see an error like the following in the newrelic-infra container logs:
2018-04-11T08:02:41.765236022Z time="2018-04-11T08:02:41Z" level=error msg="executing data source" data prefix="integration/com.newrelic.kubernetes" error="exit status 1" plugin name=nri-kubernetes stderr="time=\"2018-04-11T08:02:41Z\" level=fatal msg=\"failed to discover kube-state-metrics endpoint, got error: no service found by label k8s-app=kube-state-metrics\"\n"
Common reasons for this error include:
- kube-state-metrics has not been deployed into the cluster.
- kube-state-metrics is deployed using a custom deployment.
- There are multiple versions of kube-state-metrics running and the Kubernetes integration is not finding the correct one.
The Kubernetes integration automatically discovers kube-state-metrics in your cluster using this logic:
- It looks for a kube-state-metrics service running on the
- If that is not found, it looks for a service tagged with label
The integration also requires the kube-state-metrics pod to have the label k8s-app: kube-state-metrics or app: kube-state-metrics. If neither is found, there will be a log entry like the following:
2018-04-11T09:25:00.825532798Z time="2018-04-11T09:25:00Z" level=error msg="executing data source" data prefix="integration/com.newrelic.kubernetes" error="exit status 1" plugin name=nri-kubernetes stderr="time=\"2018-04-11T09:25:00Z\" level=fatal msg=\"failed to discover nodeIP with kube-state-metrics, got error: no pod found by label k8s-app=kube-state-metrics\"\n"
To solve this issue, add the k8s-app=kube-state-metrics label to the
If metrics for Kubernetes nodes, pods, and containers are showing but metrics for namespaces, deployments, and ReplicaSets are missing, the Kubernetes integration is not able to connect to
Indicators of missing namespace, deployment, and
- In the # of K8s objects chart, that data is missing.
- Queries for K8sReplicasetSample don't show any data.
There are a few possible reasons for this:
The kube-state-metrics service has been customized to listen on port 80. If that is the case, you may see an error like the following in the verbose logs:
time="2018-04-04T09:35:47Z" level=error msg="executing data source" data prefix="integration/com.newrelic.kubernetes" error="exit status 1" plugin name=nri-kubernetes stderr="time=\"2018-04-04T09:35:47Z\" level=fatal msg=\"Non-recoverable error group: error querying KSM. Get http://kube-state-metrics.kube-system.svc.cluster.local:0/metrics: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)\"\n"
This is a known problem that happens in some clusters where it takes too much time for kube-state-metrics to collect all the cluster's information before sending it to the integration.
As a workaround, increase the kube-state-metrics client timeout.
The kube-state-metrics instance is running behind kube-rbac-proxy. New Relic does not currently support this configuration. You may see an error like the following in the verbose logs:
time="2018-03-28T23:09:12Z" level=error msg="executing data source" data prefix="integration/com.newrelic.kubernetes" error="exit status 1" plugin name=nri-kubernetes stderr="time=\"2018-03-28T23:09:12Z\" level=fatal msg=\"Non-recoverable error group: error querying KSM. Get http://192.168.132.37:8443/metrics: net/http: HTTP/1.x transport connection broken: malformed HTTP response \\\"\\\\x15\\\\x03\\\\x01\\\\x00\\\\x02\\\\x02\\\"\"\n"
The KSM payload is quite large, and the Kubernetes integration process handling the data is being OOM-killed. Since the integration is not the main process of the container, the pod is not restarted. You can spot this situation in the logs of the newrelic-infra pod running on the same node as KSM:
time="2020-12-10T17:40:44Z" level=error msg="Integration command failed" error="signal: killed" instance=nri-kubernetes integration=com.newrelic.kubernetes
As a workaround, increase the DaemonSet memory limits so the process is not killed.
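The log lines shown so far can be told apart by a few distinctive substrings. The following shell function is a rough triage sketch, not an official tool: classify_k8s_log_line is a hypothetical name, and the match strings are taken from the sample logs in this document, so messages from other agent versions may not match.

```shell
# Hypothetical triage helper: maps one infrastructure agent log line to a
# cause described in this guide. Match strings are copied from the sample
# log lines in this document; real messages may differ in detail.
classify_k8s_log_line() {
  case $1 in
    *"Invalid license key"*)
      echo "invalid license key" ;;
    *"no service found by label"*)
      echo "kube-state-metrics service not discovered" ;;
    *"no pod found by label"*)
      echo "kube-state-metrics pod label missing" ;;
    *"malformed HTTP response"*)
      echo "kube-state-metrics possibly behind kube-rbac-proxy" ;;
    *"error querying KSM"*"Client.Timeout exceeded"*)
      echo "kube-state-metrics query timed out" ;;
    *"Error sending events"*)
      echo "cannot reach New Relic" ;;
    *"signal: killed"*)
      echo "integration process killed (possible OOM)" ;;
    *)
      echo "unknown" ;;
  esac
}
```

For example, you could read the output of kubectl logs for a newrelic-infra pod line by line and call the function on each line to get a first guess at which section of this guide applies.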
The New Relic pods and the newrelic service account are not deployed in the same namespace. This is usually because the current context specifies a namespace. If this is the case, you will see an error like the following:
time=\"2018-05-31T10:55:39Z\" level=panic msg=\"pods is forbidden: User \\\"system:serviceaccount:kube-system:newrelic\\\" cannot list pods at the cluster scope\"
To check whether this is the case, run:
kubectl describe serviceaccount newrelic | grep Namespace
kubectl get pods -l name=newrelic-infra --all-namespaces
kubectl config get-contexts
To resolve this problem, change the namespace for the service account in the New Relic DaemonSet YAML file to be the same as the namespace for the current context:
- kind: ServiceAccount
  name: newrelic
  namespace: default
---
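The forbidden message itself names the service account identity the pods are running as, in the form system:serviceaccount:<namespace>:<name>. As a small sketch, the hypothetical helper below (service_account_from_log is an illustrative name, not part of kubectl or New Relic tooling) pulls those two fields out of a log line so they can be compared with the namespace in the YAML.

```shell
# Hypothetical helper: extracts the namespace and name from a
# "system:serviceaccount:<namespace>:<name>" user string in a log line.
# Prints nothing if the line contains no such string.
service_account_from_log() {
  printf '%s\n' "$1" |
    sed -n 's/.*system:serviceaccount:\([^:]*\):\([A-Za-z0-9._-]*\).*/namespace=\1 name=\2/p'
}
```

Applied to the sample panic line above, the function reports the kube-system namespace and the newrelic account name, which you can then check against the output of the kubectl commands.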