Kubernetes integration: install and configure

The easiest way to install the Kubernetes integration is to use our automated installer to generate a manifest. It bundles not just the integration DaemonSets, but also other New Relic Kubernetes configurations, like Kubernetes events, Prometheus OpenMetrics, and New Relic log monitoring.

You can use the automated installer for servers, VMs, and unprivileged environments. The installer can also help you with managed services or platforms after you review a few preliminary notes. We also have separate instructions if you need a custom manifest or prefer to do a manual unprivileged installation.

If your New Relic account is in the EU region, access the installer from one.eu.newrelic.com.

Installs for managed services and platforms

Before starting our automated installer, check out these notes for your managed services or platforms:

The Kubernetes integration monitors worker nodes. In Amazon EKS, master nodes are managed by Amazon and abstracted from the Kubernetes platform.

Before starting our automated installer to deploy the Kubernetes integration in Amazon EKS, make sure you are using the version of kubectl provided by AWS.

The Kubernetes integration monitors worker nodes. In GKE, master nodes are managed by Google and abstracted from the Kubernetes platform.

Before starting our automated installer to deploy the Kubernetes integration on GKE, ensure you have sufficient permissions:

  1. Go to https://console.cloud.google.com/iam-admin/iam and find your username. Click edit.
  2. Ensure you have permissions to create Roles and ClusterRoles. If you are not sure, add the Kubernetes Engine Cluster Admin role. If you cannot edit your user role, ask the owner of the GCP project to grant you the necessary permissions.

  3. Ensure you have a RoleBinding that grants you the same permissions to create Roles and ClusterRoles:

    kubectl create clusterrolebinding YOUR_USERNAME-cluster-admin-binding --clusterrole=cluster-admin --user=YOUR_GCP_EMAIL

    Creating a RoleBinding is necessary because of a known RBAC issue in Kubernetes and Kubernetes Engine versions 1.6 or higher. For more information, see Google Cloud's documentation on defining permissions in a role.

To deploy the Kubernetes integration with OpenShift:

  1. Add the newrelic service account to your privileged Security Context Constraints:

    oc adm policy add-scc-to-user privileged \
    system:serviceaccount:<namespace>:newrelic
            
  2. Complete the steps in our automated installer.
  3. If you're using signed certificates, make sure they are properly configured by using the following environment variables to point to your .pem file:

    - name: NRIA_CA_BUNDLE_DIR 
      value: YOUR_CA_BUNDLE_DIR 
    - name: NRIA_CA_BUNDLE_FILE  
      value: YOUR_CA_BUNDLE_NAME     
  4. Save your changes.

The Kubernetes integration monitors worker nodes. In Azure Kubernetes Service, master nodes are managed by Azure and abstracted from the Kubernetes platform.

To deploy in Azure Kubernetes Service (AKS), complete the steps in our automated installer.

To deploy in PKS, we recommend that you use the automated installer, or you can follow the manual instructions provided in Install the Kubernetes integration using Helm.

Custom manifest

If the Kubernetes automated installer doesn't provide the settings you need, you can download our manifest template and install the integration manually.

To activate the Kubernetes integration, you must deploy the newrelic-infra agent onto a Kubernetes cluster as a DaemonSet:

  1. Install kube-state-metrics and get it running on the cluster. For example:

    curl -L -o kube-state-metrics-1.9.5.zip https://github.com/kubernetes/kube-state-metrics/archive/v1.9.5.zip && unzip kube-state-metrics-1.9.5.zip && kubectl apply -f kube-state-metrics-1.9.5/examples/standard
  2. Download the manifest file:

    curl -O https://download.newrelic.com/infrastructure_agent/integrations/kubernetes/newrelic-infrastructure-k8s-latest.yaml
    
  3. In the manifest, add your New Relic license key and a cluster name to identify your Kubernetes cluster. Both values are required.

    Recommendation: Do not change the NRIA_PASSTHROUGH_ENVIRONMENT or NRIA_DISPLAY_NAME values in your manifest.

    YOUR_CLUSTER_NAME is your cluster’s ID in New Relic’s entity explorer. It doesn’t need to match the name of the cluster running in your environment.

    env:
      - name: NRIA_LICENSE_KEY
        value: YOUR_LICENSE_KEY
      - name: CLUSTER_NAME
        value: YOUR_CLUSTER_NAME
  4. Review the Configure the integration section of this document in case you need to adapt the manifest to your environment.

  5. Confirm that kube-state-metrics is installed.

    kubectl get pods --all-namespaces | grep kube-state-metrics
  6. Create the DaemonSet:

    kubectl create -f newrelic-infrastructure-k8s-latest.yaml
  7. Confirm that the DaemonSet has been created successfully by looking for newrelic-infra in the results generated by this command:

    kubectl get daemonsets

To confirm that the integration is working: wait a few minutes, then look for data in the New Relic Kubernetes cluster explorer.

If you don't see data, review the configuration procedures again, then follow the troubleshooting procedures.

In the future, the number of labels collected on Kubernetes objects will be limited per object type (containers, pods, nodes, etc.). If objects have more labels than the limit allows, you will be able to configure important labels that should always be sent to New Relic. When the limit is in place, this documentation will be updated.

Make sure New Relic pods can be scheduled

Some of the New Relic pods are set up as DaemonSets in the manifest file so that they can run on every host. These include newrelic-infrastructure and newrelic-logging. In rare circumstances, other pods may be scheduled first and starve the New Relic pods of resources. Because each of these pods must run on a specific host, it stays in Pending status until that host has enough resources, even if other hosts are available. This can last for long periods of time and result in reporting gaps.

To prevent this scenario you can configure the Kubernetes scheduler to give New Relic pods a higher priority. Using the default scheduler:

  1. Ensure kube-scheduler flag disablePreemption is not set to true (by default it is false).
  2. Create a PriorityClass for the New Relic DaemonSet pods.
    1. Set the appropriate priority value, which should generally be higher than your other pods.
    2. preemptionPolicy is set to PreemptLowerPriority by default. This allows New Relic pods assigned this priority class to evict lower-priority pods that are taking up resources.
  3. Edit the manifest file to add priorityClassName to any DaemonSet specs. In the example below, the priorityClassName line sets the priority class for newrelic-infrastructure:

    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      namespace: default
      labels:
        app: newrelic-infrastructure
        chart: newrelic-infrastructure-1.0.0
        release: nri-bundle
        mode: privileged
      name: nri-bundle-newrelic-infrastructure
    spec:
      priorityClassName: your-priority-class
      ...
  4. If you have already deployed the New Relic pods, re-deploy them and confirm they have been created:

    kubectl delete -f newrelic-infrastructure-k8s-latest.yaml
    kubectl create -f newrelic-infrastructure-k8s-latest.yaml
    kubectl get daemonsets
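
As an illustration of step 2, a PriorityClass for the New Relic DaemonSet pods might look like the following sketch (the name, value, and description here are examples, not required values):

    apiVersion: scheduling.k8s.io/v1
    kind: PriorityClass
    metadata:
      name: newrelic-priority              # example name; must match priorityClassName in the DaemonSet spec
    value: 1000000                         # pick a value higher than your other pods
    preemptionPolicy: PreemptLowerPriority # the default, shown here for clarity
    globalDefault: false
    description: "Priority class for New Relic monitoring pods."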
    

Unprivileged installs of the Kubernetes integration

For platforms that have stringent security requirements, we provide an unprivileged version of the Kubernetes integration. Changes from the standard Kubernetes integration are:

  • Runs the infrastructure agent and the Kubernetes integration as a standard user instead of root
  • No access to the underlying host filesystem
  • No access to /var/run/docker.sock
  • The container's root filesystem is mounted read-only
  • allowPrivilegeEscalation is set to false
  • hostNetwork is set to false
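
As a rough sketch, these restrictions correspond to security settings like the following in the unprivileged manifest (the container name and user ID here are illustrative):

    spec:
      hostNetwork: false
      containers:
        - name: newrelic-infra
          securityContext:
            runAsUser: 1000                 # a standard, non-root user
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true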

The tradeoff is that this solution only collects metrics from Kubernetes; it does not collect any metrics from the underlying hosts directly. Kubernetes itself provides some data (metrics and metadata) about its nodes (hosts).

Optional: To collect the underlying host metrics, the non-containerized infrastructure agent can be deployed on the underlying host. The infrastructure agent already supports running as non-root. The combination of the Kubernetes integration in its unprivileged version and the agent running on the host will report all the metrics that our standard solution for monitoring Kubernetes receives.

Steps to complete an unprivileged install
  1. Install kube-state-metrics and get it running on the cluster. For example:

    curl -L -o kube-state-metrics-1.9.5.zip https://github.com/kubernetes/kube-state-metrics/archive/v1.9.5.zip && unzip kube-state-metrics-1.9.5.zip && kubectl apply -f kube-state-metrics-1.9.5/examples/standard
  2. Download the integration manifest:

    curl -O https://download.newrelic.com/infrastructure_agent/integrations/kubernetes/newrelic-infrastructure-k8s-unprivileged-latest.yaml
        
  3. In the manifest, add your New Relic license key and a cluster name to identify your Kubernetes cluster. Both values are required.

    YOUR_CLUSTER_NAME is your cluster’s ID in New Relic’s entity explorer. It doesn’t need to match the name of the cluster running in your environment.

    env:
      - name: NRIA_LICENSE_KEY
        value: YOUR_LICENSE_KEY
      - name: CLUSTER_NAME
        value: YOUR_CLUSTER_NAME
  4. Confirm that kube-state-metrics is installed.

    kubectl get pods --all-namespaces | grep kube-state-metrics
  5. Create the DaemonSet:

    kubectl create -f newrelic-infrastructure-k8s-unprivileged-latest.yaml
  6. Confirm that the DaemonSet has been created successfully by looking for newrelic-infra in the results generated by this command:

    kubectl get daemonsets
  7. To confirm that the integration has been configured correctly, wait a few minutes, then run this NRQL query to see if data has been reported:

    SELECT * FROM K8sPodSample since 5 minutes ago
        

Configure the integration

The Kubernetes integration comes with a default configuration that should work in most environments. To change the configuration, modify the manifest file:

Select which processes should send their data to New Relic

By default, data about the processes running on your pods is not sent to New Relic. You can enable process data collection by setting enable_process_metrics to true.

To choose what metric data you want to send to New Relic, configure the include_matching_metrics environment variable in your manifest.
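
For example, assuming the agent's standard mapping of configuration options to NRIA_-prefixed environment variables, both settings could be sketched in the manifest as follows (the process-name pattern is purely illustrative):

    env:
      - name: NRIA_ENABLE_PROCESS_METRICS
        value: "true"
      - name: NRIA_INCLUDE_MATCHING_METRICS
        value: |
          process.name:
            - regex "^java"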

Specify the Kubernetes API host and port

This is necessary when you are using SSL and not using the default FQDN. The Kubernetes API FQDN needs to match the FQDN of the SSL certificate.

You do not need to specify both variables. For example, if you only specify the HOST, the default PORT will be used.

- name: "KUBERNETES_SERVICE_HOST" 
  value: "KUBERNETES_API_HOST"
- name: "KUBERNETES_SERVICE_PORT" 
  value: "KUBERNETES_API_TCP_PORT"
  
Kubernetes versions 1.6 to 1.7.5: Edit manifest file

For Kubernetes versions 1.6 to 1.7.5, uncomment these two lines in the manifest file:

- name: "CADVISOR_PORT" # Enable direct connection to cAdvisor by specifying the port.  Needed for Kubernetes versions prior to 1.7.6.
  value: "4194"
  
Use environment variables

If you use a proxy, configure its URL through environment variables passed to the Kubernetes integration.
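
For example, assuming the agent's standard NRIA_PROXY variable, a proxy could be configured like this (the host and credentials are placeholders):

    env:
      - name: NRIA_PROXY
        value: "https://user:password@hostname:port"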

Disable kube-state-metrics parsing

You can disable kube-state-metrics parsing for the DaemonSet by using the following configuration:

 - name: "DISABLE_KUBE_STATE_METRICS"
   value: "true"

Disabling kube-state-metrics also disables data collection for the following:

  • ReplicaSets
  • DaemonSets
  • StatefulSets
  • Namespaces
  • Deployments
  • Services
  • Endpoints
  • Pods (that are pending)

Additionally, disabling this affects the Kubernetes Cluster Explorer in the following ways:

  • No pending pods are shown.
  • No filters based on services.

Specify the kube-state-metrics URL

If several instances of kube-state-metrics are present in the cluster, uncomment and configure the following lines to specify which one to use:

 - name: "KUBE_STATE_METRICS_URL"
   value: "http://KUBE_STATE_METRICS_IP_OR_FQDN:PORT"

Even when KUBE_STATE_METRICS_URL is defined, the KSM service must still carry one of the following labels for the auto-discovery process:

  • k8s-app=kube-state-metrics

    OR

  • app=kube-state-metrics

This configuration option overrides KUBE_STATE_METRICS_POD_LABEL. If you have both defined, KUBE_STATE_METRICS_POD_LABEL has no effect.

Discover kube-state-metrics pods using a label

If several instances of kube-state-metrics are present in the cluster, another option to easily target one of these instances with the Kubernetes integration is to use label-based discovery.

 - name: "KUBE_STATE_METRICS_POD_LABEL"
   value: "LABEL_NAME"

When a KUBE_STATE_METRICS_POD_LABEL is defined, the label should have a value equal to true. For example, if the label name is my-ksm, ensure that my-ksm=true.

This configuration option is incompatible with KUBE_STATE_METRICS_URL. If you have both defined, KUBE_STATE_METRICS_URL is used.
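
For instance, with KUBE_STATE_METRICS_POD_LABEL set to my-ksm, the kube-state-metrics pod you want to target would carry that label in its metadata:

    metadata:
      labels:
        my-ksm: "true"   # matches KUBE_STATE_METRICS_POD_LABEL=my-ksm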

Query kube-state-metrics behind RBAC

If your instance of kube-state-metrics is behind kube-rbac-proxy, the integration can be configured in a compatible way using the combination of the label-based discovery and two other environment variables:

 - name: "KUBE_STATE_METRICS_SCHEME"
   value: "https"
 - name: "KUBE_STATE_METRICS_PORT"
   value: "KSM_RBAC_PROXY_PORT"

To confirm which port should be used as the value of KUBE_STATE_METRICS_PORT, run a describe command on the kube-state-metrics pod and look for the port exposed by the container named kube-rbac-proxy-main.

These two configuration options only work when using the KUBE_STATE_METRICS_POD_LABEL configuration described above.

kube-state-metrics timeout: Increase the client timeout

To increase the client timeout of kube-state-metrics, add a new environment variable, TIMEOUT, to the manifest file:

env:
  - name: TIMEOUT
    value: 5000 # The default client timeout when calling kube-state-metrics, in milliseconds

Then, add this new environment variable to the NRIA_PASSTHROUGH_ENVIRONMENT list.
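
For example, if your manifest's passthrough list already contains CLUSTER_NAME (the exact contents depend on your manifest), it could be extended like this:

    env:
      - name: NRIA_PASSTHROUGH_ENVIRONMENT
        value: "CLUSTER_NAME,TIMEOUT"   # keep your existing entries and append TIMEOUT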

Non-default namespace deployments: Edit config file

If you want to deploy in a different namespace from default, change all values of namespace in the manifest.

Set the TTL for the Kubernetes API responses cache

By default, the integration caches information retrieved from the Kubernetes API for 5 minutes.

Use the API_SERVER_CACHE_TTL environment variable to set a custom cache duration for responses from the API server. Valid time unit values are: ns, us, ms, s, m, and h. To disable caching, set to 0s.

env: 
  - name: API_SERVER_CACHE_TTL 
    value: "1m"

Specify base URLs for control plane component endpoints

Use the following environment variables if any of the Kubernetes control plane components export metrics on base URLs that are different from the defaults. This is necessary for environments such as OpenShift when a control plane component metrics endpoint is using SSL or an alternate port.

Values of these environment variables must be base URLs of the form [scheme]://[host]:[port]. URLs should not include a path component. For example:

- name: "SCHEDULER_ENDPOINT_URL"
  value: "https://localhost:10259"
- name: "ETCD_ENDPOINT_URL"
  value: "https://localhost:9979"
- name: "CONTROLLER_MANAGER_ENDPOINT_URL"
  value: "https://localhost:10257"
- name: "API_SERVER_ENDPOINT_URL"
  value: "https://localhost:6443"

The /metrics path segment is added automatically. In addition, if the https scheme is used, authentication to the control plane component pod(s) is accomplished via service accounts.

If an FQDN (fully qualified domain name) is used in a multi-master cluster, inconsistent results may be returned. Therefore, we recommend using localhost only.

Even though a custom base URL is defined for a given control plane component, the control plane component pod(s) must contain one of the labels supported by the auto-discovery process.

Even though a custom ETCD_ENDPOINT_URL can be defined, etcd always requires https and mTLS authentication to be configured.

Here are some additional configurations to consider:

Update to the latest version

To improve our unified experience, starting from Wednesday, August 12, 2020, Kubernetes integrations that use v1.7 or older will be deprecated. If you are using v1.7 or older, you will need to update your integration in order to continue viewing Kubernetes performance data. For more information, see the Kubernetes v1.7 or older deprecation notice.

If you are already running the Kubernetes integration and want to update the newrelic-infra agent to the latest agent version:

  1. Run this NRQL query to check which version you are currently running (this will return the image name by cluster):

    SELECT latest(containerImage)  FROM K8sContainerSample 
    WHERE containerImage LIKE '%newrelic/infrastructure%' FACET clusterName SINCE 1 day ago

    If you've set a name other than newrelic/infrastructure for the integration's container image, the above query won't yield results. To make it work, edit the image name in the query.

  2. Download the integration manifest file:

    curl -O https://download.newrelic.com/infrastructure_agent/integrations/kubernetes/newrelic-infrastructure-k8s-latest.yaml
    
  3. Copy the changes you made to your previous manifest (at a minimum, CLUSTER_NAME and NRIA_LICENSE_KEY) into the manifest you downloaded.

  4. Install the latest DaemonSet with the following command (Kubernetes will automatically do a rollout upgrade for the integration's pods):

    kubectl apply -f newrelic-infrastructure-k8s-latest.yaml
    

Uninstall the Kubernetes integration

To uninstall the Kubernetes integration:

  1. Verify that newrelic-infrastructure-k8s-latest.yaml corresponds to the filename of the manifest as you have saved it.

    Example: If you are using the unprivileged version of the integration, the default filename will be newrelic-infrastructure-k8s-unprivileged-latest.yaml.

  2. After you verify the filename, use the following command:

    kubectl delete -f newrelic-infrastructure-k8s-latest.yaml

You only need to execute this command once, regardless of the number of nodes in your cluster.

For more help

If you need more help, check out these support and learning resources: