Not seeing control plane data

Problem

You have completed the installation procedure for New Relic's Kubernetes integration and you are seeing Kubernetes data in your New Relic account, but there is no data from any of the control plane components.

Solution

Check that the master nodes have the correct labels

Execute the following commands to manually find the master nodes:

kubectl get nodes -l node-role.kubernetes.io/master=""
kubectl get nodes -l kubernetes.io/role="master"

If the master nodes follow the labeling convention defined in the discovery of master nodes and control plane components documentation section, you should get some output like:

NAME                         STATUS  ROLES   AGE   VERSION
ip-10-42-24-4.ec2.internal   Ready   master  42d   v1.14.8

If no nodes are found, there are two scenarios:

Your master nodes don’t have the required labels that identify them as masters. In this case, you need to add both labels to your master nodes, as shown in the example after this list.

You’re in a managed cluster and your provider is handling the master nodes for you. In this case there is nothing you can do, since your provider restricts access to those nodes.
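If you need to add the labels, you can first list the labels already present on your nodes and then apply the two labels the integration looks for. A minimal sketch, where NODE_NAME is a placeholder for one of your master node names (adding role labels generally requires cluster-admin permissions):

kubectl get nodes --show-labels
kubectl label node NODE_NAME node-role.kubernetes.io/master="" kubernetes.io/role="master"

After labeling, rerun the kubectl get nodes commands above to confirm the node is returned.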

Check that the integration is running on the master nodes

Replace the NODE_NAME placeholder in the following command with one of the node names returned in the previous step to get the integration pod running on that master node:

kubectl get pods --field-selector spec.nodeName=NODE_NAME -l name=newrelic-infra --all-namespaces

The following command does the same, but selects the node for you:

kubectl get pods --field-selector spec.nodeName=$(kubectl get nodes -l node-role.kubernetes.io/master="" -o jsonpath="{.items[0].metadata.name}") -l name=newrelic-infra --all-namespaces

If everything is correct, you should get some output like:

NAME                   READY   STATUS    RESTARTS   AGE
newrelic-infra-whvzt   1/1     Running   0          6d20h

If the integration is not running on your master nodes, check that the daemonset has all the desired instances running and ready.

kubectl get daemonsets -l app=newrelic-infra --all-namespaces

A healthy daemonset should return something similar to:

NAMESPACE  NAME            DESIRED  CURRENT  READY  UP-TO-DATE  AVAILABLE  NODE SELECTOR  AGE
default    newrelic-infra  4        4        4      4           4          <none>         27d
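If the DESIRED, CURRENT, and READY counts don’t match, describe the daemonset to see why instances are missing or not ready. A minimal sketch, assuming the daemonset lives in the default namespace as in the output above:

kubectl describe daemonset newrelic-infra -n default
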
Check that the control plane components have the required labels

Refer to the discovery of master nodes and control plane components documentation section and look for the labels the integration uses to discover the components. Then run the following commands to see whether any pods carry those labels and which nodes they are running on:

kubectl get pods -l k8s-app=kube-apiserver --all-namespaces

If a component with the given label exists, you should see something like:

NAMESPACE    NAME                                        READY  STATUS   RESTARTS  AGE
kube-system  kube-apiserver-ip-10-42-24-42.ec2.internal  1/1    Running  3         49d

The same should be done with the rest of the components:

kubectl get pods -l k8s-app=etcd-manager-main --all-namespaces
kubectl get pods -l k8s-app=kube-scheduler --all-namespaces
kubectl get pods -l k8s-app=kube-controller-manager --all-namespaces
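
To confirm which node each component pod is scheduled on (the integration can only discover a component from the pod running on the same node), you can add the standard -o wide flag, which prints a NODE column. For example, for the scheduler:

kubectl get pods -l k8s-app=kube-scheduler --all-namespaces -o wide
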
Retrieve the verbose logs of one of the integrations running on a master node and check for the control plane component jobs

To retrieve the logs, follow the instructions on get logs from pod running on a master node. The integration logs the following message for every component: “Running job: COMPONENT_NAME”. For example:

Running job: scheduler
Running job: etcd
Running job: controller-manager
Running job: api-server
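
A quick way to look for these messages, assuming the pod name and the default namespace from the earlier examples (replace both with your own values):

kubectl logs newrelic-infra-whvzt -n default | grep "Running job"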

If you didn’t specify the ETCD_TLS_SECRET_NAME configuration option, you’ll find the following message in the logs:

Skipping job creation for component etcd: etcd requires TLS configuration, none given
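
If your cluster runs etcd with mutual TLS and you want the integration to query it, you need to store the client certificate in a secret and reference it through ETCD_TLS_SECRET_NAME. A minimal sketch, assuming the certificate, key, and CA certificate are local files and that newrelic-infra-etcd-tls-secret is a name you choose; check the integration's documentation for the exact key names the secret must contain:

kubectl create secret generic newrelic-infra-etcd-tls-secret --from-file=cert=./cert.pem --from-file=key=./key.pem --from-file=cacert=./cacert.pem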

If any error occurs while querying the metrics of a component, it is logged after the Running job message.

Manually query the metrics of the components

Refer to the discovery of master nodes and control plane components documentation section to get the endpoint of the control plane component you want to query. With that endpoint, you can use the integration pod running on the same node as the component to query it. The following examples show how to query the Kubernetes scheduler:

kubectl exec -ti POD_NAME -- wget -O - localhost:10251/metrics

The following command does the same, but also chooses the pod for you:

kubectl exec -ti $(kubectl get pods --all-namespaces --field-selector spec.nodeName=$(kubectl get nodes -l node-role.kubernetes.io/master="" -o jsonpath="{.items[0].metadata.name}") -l name=newrelic-infra -o jsonpath="{.items[0].metadata.name}") -- wget -O - localhost:10251/metrics

If everything is correct, you should get some metrics in Prometheus format, something like:

Connecting to localhost:10251 (127.0.0.1:10251)
# HELP apiserver_audit_event_total Counter of audit events generated and sent to the audit backend.
# TYPE apiserver_audit_event_total counter
apiserver_audit_event_total 0
# HELP apiserver_audit_requests_rejected_total Counter of apiserver requests rejected due to an error in audit logging backend.
# TYPE apiserver_audit_requests_rejected_total counter
apiserver_audit_requests_rejected_total 0
# HELP apiserver_client_certificate_expiration_seconds Distribution of the remaining lifetime on the certificate used to authenticate a request.
# TYPE apiserver_client_certificate_expiration_seconds histogram
apiserver_client_certificate_expiration_seconds_bucket{le="0"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="1800"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="3600"} 0
