CPU usage mismatch or usage over 100%

Problem

CPU percent usage data between New Relic APM and New Relic Infrastructure seems contradictory, or your CPU usage exceeds 100%.

Cause

CPU usage data for New Relic APM is different than CPU usage data for New Relic Infrastructure.

  • In APM, CPU percent usage measures aggregate usage of all instances of an application or service on a given server.
  • In Infrastructure, CPU percent usage is measured differently depending on if you are viewing CPU usage in relation to hosts, or to processes.
Example of CPU usage mismatch

This example illustrates a discrepancy between CPU percent usage by New Relic APM and New Relic Infrastructure:

Infra-vs-APM-CPU-usage.png
rpm.newrelic.com/apm > (select an app) and infrastructure.newrelic.com > Hosts: The CPU metrics in New Relic APM and New Relic Infrastructure have different meanings.
New Relic APM

To view your app's CPU usage: Go to rpm.newrelic.com/apm > (select an app) > Monitoring > Overview. The number of instances is listed under each host name (for example, 55 instances).

APM CPU percent usage measures aggregate CPU usage of all instances of your app or service on a given server. This percentage is affected by the number of instances running across cores on the server. Multiple instances of a service running on one server or in a multi-core environment can produce CPU usage percentages well above 100%.

APM calculates the percentage by aggregating CPU time and dividing by clock time:

CPU usage = (instance CPU time + instance CPU time + [...]) / (clock time)

Example: Upgrading to a quad core processor

If you upgrade from a dual processor to a quad processor under the same architecture, you should see roughly the same CPU numbers for the same loads and applications. If New Relic normalized the calculation, the upgrade would appear to produce an abrupt decrease in your CPU usage, even if the number of cycles you are using would be the same. Adding more instances does not make your code more efficient.

New Relic Infrastructure

To view your individual host CPU usage:

  1. Go to infrastructure.newrelic.com.
  2. Optional: To sort the New Relic Infrastructure index by highest or lowest CPU usage, select the CPU column heading.
  3. To drill down into detailed CPU usage information, select the individual host.

The exact calculation for CPU usage is measured differently depending on if you are viewing CPU usage in relation to hosts, or to processes:

Page Calculation details
Hosts

On the Hosts page, CPU Percent is a derived metric that is part of the SystemSample event. The CPU percentage is not collected by New Relic, but derived from several other metrics. Specifically, the cpuPercent attribute is an aggregation of cpuUserPercent, cpuSystemPercent, cpuIoWaitPercent and cpuStealPercent.

The Hosts page shows metrics from a perspective of your whole system. For example, if a server has 32 cores installed and is showing cpuPercent of 50%, 16 of the 32 cores are being used.

Processes

On the Processes page, CPU percent is scoped to individual processes, rather than hosts. Because of this, the CPU percent metric does not take into account the resources of the entire system. Instead, it shows how much of a single CPU core each process is taking.

For example, with a 32-core server, a process showing cpuPercent of 50% is only taking half of a single core, or about 1.5625% of the CPU cycles available on the whole server. This can result in cpuPercent exceeding 100% on the Processes page. When this occurs, the process is demanding more than a single core’s worth of CPU cycles.

For more help

Recommendations for learning more: