Kubernetes integration: Understand and use data

This document explains how to find and use data reported from the New Relic Kubernetes integration.

Find and use data

To view the Kubernetes integration's dashboard:

  1. Go to infrastructure.newrelic.com > Kubernetes.
  2. Select the Kubernetes dashboard link to open the Kubernetes dashboard.
  3. To create your own dashboards, go to insights.newrelic.com and create NRQL queries.

Kubernetes data is attached to the following event types.

Event name Type of Kubernetes data Available since
K8sNodeSample Node data v1.0.0
K8sNamespaceSample Namespace data v1.0.0
K8sDeploymentSample Deployment data v1.0.0
K8sReplicasetSample ReplicaSet data v1.0.0
K8sDaemonsetSample DaemonSet data v1.13.0
K8sStatefulsetSample StatefulSet data v1.13.0
K8sPodSample Pod data v1.0.0
K8sClusterSample Cluster data v1.0.0
K8sContainerSample Container data v1.0.0
K8sVolumeSample Volume data v1.0.0
K8sApiServerSample API server data v1.11.0
K8sControllerManagerSample Controller manager data v1.11.0
K8sSchedulerSample Scheduler data v1.11.0
K8sEtcdSample ETCD data v1.11.0
K8sEndpointSample Endpoint data v1.13.0
K8sServiceSample Service data v1.13.0
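
As a starting point for the NRQL queries mentioned in step 3 above, a query against one of these event types can be assembled programmatically. The following Python sketch is illustrative only: the helper name `build_nrql` and its parameters are hypothetical, not part of any New Relic library. The attribute names come from the tables later in this document.

```python
def build_nrql(event_type, attribute, facet=None, since="30 minutes ago"):
    """Build a simple NRQL query string that averages one attribute.

    Hypothetical helper for illustration; not part of any New Relic library.
    """
    query = f"SELECT average({attribute}) FROM {event_type}"
    if facet:
        query += f" FACET {facet}"
    query += f" SINCE {since}"
    return query


# Average CPU cores used per node over the last 30 minutes.
node_cpu_query = build_nrql("K8sNodeSample", "cpuUsedCores", facet="nodeName")
print(node_cpu_query)
```

The resulting string can be pasted into the query builder. Swapping the event type and attribute gives the equivalent query for pods, containers, or volumes.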

Manage alerts

You can be notified about alert violations for your Kubernetes data:

Create an alert condition

To create an alert condition for the Kubernetes integration:

  1. Go to infrastructure.newrelic.com > Settings > Alerts > Kubernetes, then select Create alert condition.
  2. To limit the alert to Kubernetes entities that have the chosen attributes, select Filter.
  3. Select the threshold settings. For more on the Trigger an alert when... options, see Alert types.
  4. Select an existing alert policy, or create a new one.
  5. Select Create.

When an alert condition's threshold is triggered, New Relic sends a notification to the policy's notification channels.

Kubernetes integration alert condition
infrastructure.newrelic.com > Settings > Alerts > Kubernetes > Create alert condition: Infrastructure includes alert conditions specific to Kubernetes.

Use alert types and thresholds

To use any of the available Kubernetes-specific alert criteria, select the Kubernetes alert type:

Kubernetes alert types Comments
Available pods are less than desired pods

This alert type monitors ReplicaSets. The alert triggers if the number of available replicas (pods) for a deployment is less than the number of replicas you chose when creating the deployment. This can happen if there are not enough resources in your cluster to schedule all pods for a deployment. New Relic applies the alert conditions individually to each deployment matching the specified filter.

Container CPU usage

This alert type compares the CPU consumption of a container with the limit that you defined when it was created. The alert triggers if usage exceeds the threshold. Container CPU usage is defined as:

(CPU cores used / CPU cores limit) * 100
Container memory usage

This alert type compares the memory consumption of a container with the limit that was defined when it was created. The alert triggers if usage exceeds the threshold. Container memory usage is defined as:

(memory used / memory limit) * 100
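
The two usage formulas above share the same shape, so they can be checked by hand with a small helper. This is a minimal Python sketch with made-up sample values. The function name `usage_percent` is hypothetical, and returning `None` when no limit is set is an assumption of this sketch, not documented New Relic behavior.

```python
def usage_percent(used, limit):
    """Return usage as a percentage of the limit: (used / limit) * 100.

    Returns None when no positive limit is set, because the ratio is
    undefined in that case (an assumption made for this sketch).
    """
    if limit is None or limit <= 0:
        return None
    return (used / limit) * 100


# Hypothetical sample values, keyed by K8sContainerSample attribute names.
sample = {
    "cpuUsedCores": 0.5,
    "cpuLimitCores": 2,
    "memoryUsedBytes": 536870912,    # 512 MiB
    "memoryLimitBytes": 1073741824,  # 1 GiB
}

cpu_pct = usage_percent(sample["cpuUsedCores"], sample["cpuLimitCores"])
mem_pct = usage_percent(sample["memoryUsedBytes"], sample["memoryLimitBytes"])
print(cpu_pct, mem_pct)  # 25.0 50.0
```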

In addition, you can create an alert condition for any metric collected by any New Relic integration you use, including the Kubernetes integration:

  1. Select the alert type Integrations.
  2. From the Select a data source dropdown, select a Kubernetes (K8s) data source.

Select alert notifications

When an alert condition's threshold is triggered, New Relic sends a message to the notification channel(s) chosen in the alert policy. Depending on the type of notification, you may have the following options:

The entity identifier that triggered the alert appears near the top of the notification message. The format of the identifier depends on the alert type:

  • Available pods are less than desired pods alerts:

    K8s:CLUSTER_NAME:PARENT_NAMESPACE:replicaset:REPLICASET_NAME
  • CPU or memory usage alerts:

    K8s:CLUSTER_NAME:PARENT_NAMESPACE:POD_NAME:container:CONTAINER_NAME

Here are some examples.

Pod alert notification example

For Available pods are less than desired pods alerts, the ID of the ReplicaSet triggering the issue might look like this:

k8s:beam-production:default:replicaset:nginx-deployment-1623441481

This identifier contains the following information:

  • Cluster name: beam-production
  • Parent namespace: default
  • ReplicaSet name: nginx-deployment-1623441481

Container resource notification example

For container CPU or memory usage alerts, the entity might look like this:

k8s:beam-production:kube-system:kube-state-metrics-797bb87c75-zncwn:container:kube-state-metrics

This identifier contains the following information:

  • Cluster name: beam-production
  • Parent namespace: kube-system
  • Pod name: kube-state-metrics-797bb87c75-zncwn
  • Container name: kube-state-metrics
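
If you process notifications in a script, the two identifier formats above can be split into their parts. The following Python sketch is one possible approach, not an official parser. The function name `parse_entity_id` is hypothetical, and the sketch assumes that none of the names contain a colon. It compares the prefix case-insensitively because the format lines show `K8s` while the examples show `k8s`.

```python
def parse_entity_id(entity_id):
    """Split a Kubernetes alert entity identifier into named parts.

    Assumes that cluster, namespace, pod, and container names
    contain no ":" characters.
    """
    parts = entity_id.split(":")
    if len(parts) < 2 or parts[0].lower() != "k8s":
        raise ValueError(f"not a Kubernetes entity identifier: {entity_id!r}")

    # K8s:CLUSTER_NAME:PARENT_NAMESPACE:replicaset:REPLICASET_NAME
    if len(parts) == 5 and parts[3] == "replicaset":
        return {
            "cluster": parts[1],
            "namespace": parts[2],
            "replicaset": parts[4],
        }
    # K8s:CLUSTER_NAME:PARENT_NAMESPACE:POD_NAME:container:CONTAINER_NAME
    if len(parts) == 6 and parts[4] == "container":
        return {
            "cluster": parts[1],
            "namespace": parts[2],
            "pod": parts[3],
            "container": parts[5],
        }
    raise ValueError(f"unrecognized identifier format: {entity_id!r}")


pod_alert = parse_entity_id(
    "k8s:beam-production:default:replicaset:nginx-deployment-1623441481"
)
container_alert = parse_entity_id(
    "k8s:beam-production:kube-system:"
    "kube-state-metrics-797bb87c75-zncwn:container:kube-state-metrics"
)
print(pod_alert["replicaset"])   # nginx-deployment-1623441481
print(container_alert["pod"])    # kube-state-metrics-797bb87c75-zncwn
```
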
Create alert conditions using NRQL

Follow standard procedures to create alert conditions for NRQL queries.

Kubernetes attributes and metrics

The Kubernetes integration collects the following metrics and other attributes. For more on using integration data, see Find and use data.

Node data

Query the K8sNodeSample event in New Relic Insights for node data:

Node attribute Description
clusterName Name that you assigned to the cluster when you installed the Kubernetes integration.
cpuUsedCoreMilliseconds Node CPU usage measured in core milliseconds.
cpuUsedCores Node CPU usage measured in cores.
fsAvailableBytes Bytes available in the node filesystem.
fsCapacityBytes Total capacity of the node filesystem in bytes.
fsInodes Total number of inodes in the node filesystem.
fsInodesFree Free inodes in the node filesystem.
fsInodesUsed Used inodes in the node filesystem.
fsUsedBytes Used bytes in the node filesystem.
memoryAvailableBytes Bytes of memory available in the node.
memoryMajorPageFaultsPerSecond Number of major page faults per second in the node.
memoryRssBytes Bytes of RSS memory.
memoryUsedBytes Bytes of memory used.
memoryWorkingSetBytes Bytes of memory in the working set.
net.errorCountPerSecond Number of errors per second while receiving/transmitting over the network.
nodeName Host name that the pod is running on.
runtimeAvailableBytes Bytes available to the container runtime filesystem.
runtimeCapacityBytes Total capacity assigned to the container runtime filesystem in bytes.
runtimeInodes Total number of inodes in the container runtime filesystem.
runtimeInodesFree Free inodes in the container runtime filesystem.
runtimeInodesUsed Used inodes in the container runtime filesystem.
runtimeUsedBytes Used bytes in the container runtime filesystem.
label.LABEL_NAME Labels associated with your node, so you can filter and query for specific nodes.

Namespace data

Query the K8sNamespaceSample event in New Relic Insights for namespace data:

Namespace attribute Description
clusterName Name that you assigned to the cluster when you installed the Kubernetes integration.
createdAt Timestamp of when the namespace was created.
namespace Name of the namespace to be used as an identifier.
label.LABEL_NAME Labels associated with your namespace, so you can filter and query for specific namespaces.
status Current status of the namespace. The value can be Active or Terminated.

Deployment data

Query the K8sDeploymentSample event in New Relic Insights for deployment data:

Deployment attribute Description
clusterName Name that you assigned to the cluster when you installed the Kubernetes integration.
createdAt Timestamp of when the deployment was created.
deploymentName Name of the deployment to be used as an identifier.
namespace Name of the namespace that the deployment belongs to.
label.LABEL_NAME Labels associated with your deployment, so you can filter and query for specific deployments.
podsAvailable Number of replicas that are currently available.
podsDesired Number of replicas that you defined in the deployment.
podsTotal Total number of replicas that are currently running.
podsUnavailable Number of replicas that are currently unavailable.
podsUpdated Number of replicas that have been updated to achieve the desired state of the deployment.
updatedAt Timestamp of when the deployment was updated.
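
The podsDesired and podsAvailable attributes can be compared to spot deployments that are short of pods. This is a minimal Python sketch that assumes a deployment sample has already been retrieved as a dictionary. The function name `missing_pods` and the sample values are hypothetical, and this is not how New Relic evaluates alert conditions.

```python
def missing_pods(sample):
    """Return how many desired pods are not currently available.

    The result is clamped at zero in case more pods are available
    than desired, for example during a scale-down.
    """
    return max(sample["podsDesired"] - sample["podsAvailable"], 0)


# Hypothetical sample, keyed by K8sDeploymentSample attribute names.
deployment_sample = {
    "deploymentName": "nginx-deployment",
    "podsDesired": 3,
    "podsAvailable": 2,
}

shortfall = missing_pods(deployment_sample)
print(shortfall)  # 1
```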

ReplicaSet data

Query the K8sReplicasetSample event in New Relic Insights for ReplicaSet data:

ReplicaSet attribute Description
clusterName Name that you assigned to the cluster when you installed the Kubernetes integration.
createdAt Timestamp of when the ReplicaSet was created.
deploymentName Name of the deployment to be used as an identifier.
namespace Name of the namespace that the ReplicaSet belongs to.
observedGeneration Integer representing generation observed by the ReplicaSet.
podsDesired Number of replicas that you defined in the deployment.
podsFullyLabeled Number of pods that have labels that match the ReplicaSet pod template labels.
podsReady Number of replicas that are ready for this ReplicaSet.
podsTotal Total number of replicas that are currently running.
replicasetName Name of the ReplicaSet to be used as an identifier.

DaemonSet data

Query the K8sDaemonsetSample event in New Relic Insights for DaemonSet data:

DaemonSet attribute Description
clusterName Name that you assigned to the cluster when you installed the Kubernetes integration.
createdAt Timestamp of when the DaemonSet was created.
namespaceName Name of the namespace that the DaemonSet belongs to.
label.LABEL_NAME Labels associated with your DaemonSet, so you can filter and query for specific DaemonSets.
daemonsetName Name associated with the DaemonSet.
podsDesired The number of nodes that should be running the daemon pod.
podsScheduled The number of nodes that are running at least one daemon pod and are supposed to.
podsAvailable The number of nodes that should be running the daemon pod and have one or more daemon pods running and available.
podsReady The number of nodes that should be running the daemon pod and have one or more daemon pods running and ready.
podsUnavailable The number of nodes that should be running the daemon pod and have no daemon pods running and available.
podsMisscheduled The number of nodes that are running a daemon pod but are not supposed to.
podsUpdatedScheduled The total number of nodes that are running an updated daemon pod.
metadataGeneration Sequence number representing a specific generation of the desired state.

StatefulSet data

Query the K8sStatefulsetSample event in New Relic Insights for StatefulSet data:

StatefulSet attribute Description
clusterName Name that you assigned to the cluster when you installed the Kubernetes integration.
createdAt Timestamp of when the StatefulSet was created.
namespaceName Name of the namespace that the StatefulSet belongs to.
label.LABEL_NAME Labels associated with your StatefulSet, so you can filter and query for specific StatefulSets.
statefulsetName Name associated with the StatefulSet.
podsDesired Number of desired pods for a StatefulSet.
podsReady The number of ready replicas per StatefulSet.
podsCurrent The number of current replicas per StatefulSet.
podsTotal The number of replicas per StatefulSet.
podsUpdated The number of updated replicas per StatefulSet.
observedGeneration The generation observed by the StatefulSet controller.
metadataGeneration Sequence number representing a specific generation of the desired state for the StatefulSet.
currentRevision Indicates the version of the StatefulSet used to generate the pods whose position in the sequence is between 0 and podsCurrent.
updateRevision Indicates the version of the StatefulSet used to generate the pods whose position in the sequence is between podsDesired-podsUpdated and podsDesired.

Pod data

Query the K8sPodSample event in New Relic Insights for pod data:

Pod attribute Description
clusterName Name that you assigned to the cluster when you installed the Kubernetes integration.
createdAt Timestamp of when the pod was created.
createdBy Name of the Kubernetes object that created the pod. For example, newrelic-infra.
createdKind Kind of Kubernetes object that created the pod. For example, DaemonSet.

deploymentName Name of the deployment to be used as an identifier.
isReady Boolean representing whether or not the pod is ready to serve requests.
isScheduled Boolean representing whether or not the pod has been scheduled to run on a node.
label.LABEL_NAME Labels associated with your pod, so you can filter and query for specific pods.
message Details related to the last pod status change.
namespace Name of the namespace that the pod belongs to.
net.errorCountPerSecond Number of errors per second while receiving/transmitting over the network.
net.rxBytesPerSecond Number of bytes per second received over the network.
net.txBytesPerSecond Number of bytes per second transmitted over the network.
nodeIP Host IP address that the pod is running on.
nodeName Host name that the pod is running on.
podName Name of the pod to be used as an identifier.
reason Reason why the pod is in the current status.
startTime Timestamp of when the pod started running.
status Current status of the pod. The value can be Pending, Running, Succeeded, Failed, or Unknown.

Cluster data

Query the K8sClusterSample event in New Relic Insights to see cluster data:

Cluster attribute Description
clusterName Name that you assigned to the cluster when you installed the Kubernetes integration.

Container data

Query the K8sContainerSample event in New Relic Insights for container data:

Container attribute Description
clusterName Name that you assigned to the cluster when you installed the Kubernetes integration.
containerID Unique ID associated with the container. If you are running Docker, this is the Docker container ID.
containerImage Name of the image that the container is running.
containerImageID Unique ID associated with the image that the container is running.
containerName Name associated with the container.
cpuLimitCores Integer representing the CPU cores limit defined for the container in the pod specification.
cpuRequestedCores Requested CPU cores defined for the container in the pod specification.
cpuUsedCores CPU cores actually used by the container.
deploymentName Name of the deployment to be used as an identifier.
isReady Boolean. Whether or not the container's readiness check succeeded.
label.LABEL_NAME Labels associated with your container, so you can filter and query for specific containers.
memoryLimitBytes Integer representing the memory limit, in bytes, defined for the container in the pod specification.
memoryRequestedBytes Integer. Requested bytes of memory defined for the container in the pod specification.
memoryUsedBytes Integer. Bytes of memory actually used by the container.
memoryWorkingSetBytes Integer. Bytes of memory in the working set.
namespace Name of the namespace that the container belongs to.
nodeIP Host IP address the container is running on.
nodeName Host name that the container is running on.
podName Name of the pod that the container is in, to be used as an identifier.
reason Provides a reason why the container is in the current status.
restartCount Number of times the container has been restarted.
status Current status of the container. The value can be Running, Terminated, or Unknown.

Volume data

Query the K8sVolumeSample event in New Relic Insights for volume data:

Volume attribute Description
volumeName Name that you assigned to the volume at creation.
clusterName Cluster where the volume is configured.
namespace Namespace where the volume is configured.
podName The pod that the volume is attached to. The Kubernetes monitoring integration lists Volumes that are attached to a pod.
persistent If this is a persistent volume, this value is set to "true".
pvcNamespace Namespace where the Persistent Volume Claim is configured.
pvcName Name that you assigned to the Persistent Volume Claim at creation.
fsCapacityBytes Capacity of the volume, in bytes.
fsUsedBytes Usage of the volume, in bytes.
fsAvailableBytes Available capacity of the volume, in bytes.
fsUsedPercent Usage of the volume, as a percentage.
fsInodes Total inodes in the volume.
fsInodesUsed Inodes used in the volume.
fsInodesFree Inodes available in the volume.

Volume data is available for volume plugins that implement the MetricsProvider interface:

  • AWSElasticBlockStore
  • AzureDisk
  • AzureFile
  • Cinder
  • Flexvolume
  • Flocker
  • GCEPersistentDisk
  • GlusterFS
  • iSCSI
  • StorageOS
  • VsphereVolume

API server data

Query the K8sApiServerSample event in New Relic Insights to see API Server data. For more information, see Configure control plane monitoring:

API server attribute Description
processResidentMemoryBytes Resident memory size, in bytes.

processCpuSecondsDelta Difference of the user and system CPU time spent, in seconds.
goThreads Number of OS threads created.
goGoroutines Number of goroutines that currently exist.
apiserverRequestDelta_verb_VERB_code_CODE Difference of the number of apiserver requests, broken out for each verb and HTTP response code.
apiserverRequestRate_verb_VERB_code_CODE Rate of apiserver requests, broken out for each verb and HTTP response code.
restClientRequestsDelta_code_CODE_method_METHOD Difference of the number of HTTP requests, partitioned by method and code.
restClientRequestsRate_code_CODE_method_METHOD Rate of the number of HTTP requests, partitioned by method and code.
etcdObjectCounts_resource_RESOURCE-KIND Number of stored objects at the time of last check, split by kind.

Controller manager data

Query the K8sControllerManagerSample event in New Relic Insights to see Controller manager data. For more information, see Configure control plane monitoring:

Controller manager attribute Description
processResidentMemoryBytes Resident memory size, in bytes.

processCpuSecondsDelta Difference of the user and system CPU time spent in seconds.
goThreads Number of OS threads created.
goGoroutines Number of goroutines that currently exist.
workqueueAddsDelta_name_WORK-QUEUE-NAME Difference of the total number of adds handled by workqueue.
workqueueDepth_name_WORK-QUEUE-NAME Current depth of workqueue.
workqueueRetriesDelta_name_WORK-QUEUE-NAME Difference of the total number of retries handled by workqueue.
leaderElectionMasterStatus Gauge indicating whether the reporting system is master of the relevant lease: 0 indicates backup, 1 indicates master.

Scheduler data

Query the K8sSchedulerSample event in New Relic Insights to see Scheduler data. For more information, see Configure control plane monitoring:

Scheduler attribute Description
processResidentMemoryBytes Resident memory size, in bytes.

processCpuSecondsDelta Difference of the user and system CPU time spent in seconds.
goThreads Number of OS threads created.
goGoroutines Number of goroutines that currently exist.
leaderElectionMasterStatus Gauge indicating whether the reporting system is master of the relevant lease: 0 indicates backup, 1 indicates master.
httpRequestDurationMicroseconds_handler_HANDLER_quantile_QUANTILE The HTTP request latencies in microseconds, per quantile.
httpRequestDurationMicroseconds_handler_HANDLER_sum The sum of the HTTP request latencies, in microseconds.
httpRequestDurationMicroseconds_handler_HANDLER_count The number of observed HTTP request events.
restClientRequestsDelta_code_CODE_host_HOST_method_METHOD Difference of the number of HTTP requests, partitioned by status code, method, and host.
restClientRequestsRate_code_CODE_host_HOST_method_METHOD Rate of the number of HTTP requests, partitioned by status code, method, and host.
schedulerScheduleAttemptsDelta_result_RESULT Difference of the number of attempts to schedule pods, by the result. unschedulable means a pod could not be scheduled, while error means an internal scheduler problem.
schedulerScheduleAttemptsRate_result_RESULT Rate of the number of attempts to schedule pods, by the result. unschedulable means a pod could not be scheduled, while error means an internal scheduler problem.
schedulerSchedulingDurationSeconds_operation_OPERATION_quantile_QUANTILE Scheduling latency in seconds split by sub-parts of the scheduling operation.
schedulerSchedulingDurationSeconds_operation_OPERATION_sum The sum of scheduling latency in seconds split by sub-parts of the scheduling operation.
schedulerSchedulingDurationSeconds_operation_OPERATION_count The number of observed scheduling events split by sub-parts of the scheduling operation.
schedulerPreemptionAttemptsDelta Difference of the total preemption attempts in the cluster so far.
schedulerPodPreemptionVictims Number of selected preemption victims.

ETCD data

Query the K8sEtcdSample event in New Relic Insights to see ETCD data. For more information, see Configure control plane monitoring:

ETCD attribute Description
processResidentMemoryBytes Resident memory size, in bytes.

processCpuSecondsDelta Difference of the user and system CPU time spent in seconds.
goThreads Number of OS threads created.
goGoroutines Number of goroutines that currently exist.
etcdServerHasLeader Whether or not a leader exists: 1 means a leader exists, 0 means it does not.
etcdServerLeaderChangesSeenDelta Difference of the number of leader changes seen.
etcdMvccDbTotalSizeInBytes Total size of the underlying database physically allocated, in bytes.
etcdServerProposalsCommittedDelta Difference of the total number of consensus proposals committed.
etcdServerProposalsCommittedRate Rate of the total number of consensus proposals committed.
etcdServerProposalsAppliedDelta Difference of the total number of consensus proposals applied.
etcdServerProposalsAppliedRate Rate of the total number of consensus proposals applied.
etcdServerProposalsPending The current number of pending proposals to commit.
etcdServerProposalsFailedDelta Difference of the total number of failed proposals seen.
etcdServerProposalsFailedRate Rate of the total number of failed proposals seen.
processOpenFds Number of open file descriptors.
processMaxFds Maximum number of open file descriptors.
etcdNetworkClientGrpcReceivedBytesRate Rate of the total number of bytes received from gRPC clients.
etcdNetworkClientGrpcSentBytesRate Rate of the total number of bytes sent to gRPC clients.

Endpoint data

Query the K8sEndpointSample event in New Relic Insights for endpoint data:

Endpoint attribute Description
clusterName Name that you assigned to the cluster when you installed the Kubernetes integration.
createdAt Timestamp of when the endpoint was created.
namespaceName Name of the namespace that the endpoint belongs to.
endpointName Name associated with the endpoint.
label.LABEL_NAME Labels associated with your endpoint, so you can filter and query for specific endpoints.
addressAvailable Number of addresses available in endpoint.
addressNotReady Number of addresses not ready in endpoint.

Service data

Query the K8sServiceSample event in New Relic Insights for service data:

Service attribute Description
clusterName Name that you assigned to the cluster when you installed the Kubernetes integration.
createdAt Timestamp of when the service was created.
namespaceName Name of the namespace that the service belongs to.
label.LABEL_NAME Labels associated with your service, so you can filter and query for specific services.
serviceName Name associated with the service.
loadBalancerIP The IP of the external load balancer, if specType is LoadBalancer.
externalName The external name value, if specType is ExternalName.
clusterIP The internal cluster IP, if specType is ClusterIP.
specType Type of the service.
selector.LABEL_NAME The label selector that this service targets.

Kubernetes metadata in APM-monitored applications

When you link your applications with Kubernetes, the following attributes are added to application traces and distributed traces:

  • nodeName
  • containerName
  • podName
  • clusterName
  • deploymentName
  • namespaceName
