Reduce the Infrastructure agent's CPU footprint

Problem

The New Relic Infrastructure agent is consuming too much CPU.

"Too much" is relative. For example, it could mean "3% CPU on a single-core cloud instance with few processes" or "20% CPU on a 4-core machine with 2,000 running processes and 50 containers."

Solution

The New Relic Infrastructure agent is designed to report a broad range of system data with minimal CPU and memory consumption.

However, on some systems CPU consumption can be relatively high when some plugins have a large amount of data to report.

To reduce the footprint, you can disable plugins that report data you do not need, or decrease their sampling frequency. The rest of this section explains which configuration properties you can add to the newrelic-infra.yml file to tune the Infrastructure agent.

Reduce sampling frequency or disable samplers

You can reduce the sampling frequency of the following samplers by increasing the corresponding sample rate property, which is the interval between samples, in seconds:

  • System Sampler: metrics_system_sample_rate (default value: 5)
  • Storage Sampler: metrics_storage_sample_rate (default value: 20)
  • Network Sampler: metrics_network_sample_rate (default value: 10)
  • Process Sampler: metrics_process_sample_rate (default value: 20)

Setting a sample rate larger than 60 seconds is not recommended, because you may see gaps in the New Relic user interface charts (as if the system were down and not reporting data).

To disable a sampler completely, set the corresponding property to -1.
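
As a sketch of how these settings might look together in newrelic-infra.yml, the fragment below samples less often and disables one sampler. The values are illustrative assumptions, not recommendations:

```yaml
# Illustrative values only; tune them to your environment.
metrics_system_sample_rate: 15    # default is 5
metrics_storage_sample_rate: 30   # default is 20
metrics_network_sample_rate: 30   # default is 10
metrics_process_sample_rate: -1   # -1 disables the Process Sampler
```

Every enabled sampler here stays at or below 60 seconds, in line with the note about gaps in the charts.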

Disable SELinux semodule -l (Linux only)

The SELinux plugin periodically invokes the semodule -l system command to get information about the existing SELinux modules. On most CentOS/Red Hat distributions, this command causes CPU consumption peaks.

To make sure this functionality is disabled, add the following configuration option to your /etc/newrelic-infra.yml file:

selinux_enable_semodule: false

Reduce the Sysctl inventory frequency or disable it (Linux only)

The Sysctl plugin walks the whole /sys directory structure and reads values from all the files there. Running it less often, or disabling it, may reduce the CPU system time used by the Infrastructure agent. To do so, set the sysctl_interval_sec configuration value to the number of seconds between consecutive executions of the plugin, or to a negative number to disable it.

For example, to execute the plugin once every 10 minutes:

sysctl_interval_sec: 600

To disable the Sysctl plugin:

sysctl_interval_sec: -1

The current default value of the sysctl_interval_sec property is 60.

Reduce inventories frequency or disable them

The following inventory plugins are not especially CPU-intensive, but you can still reduce their frequency or disable them by setting the corresponding configuration options.

The values of the following properties are expressed in seconds. A negative number disables the plugin.

Linux hosts inventory

  • Cloud Security Groups: cloud_security_group_refresh_sec
    • Only enabled in Linux hosts that run in Amazon Cloud.
    • Default value: 60
  • Daemon Tools: daemontools_interval_sec
    • Only enabled in Linux hosts with Daemon tools installed.
    • Default value: 15
  • DPKG: dpkg_interval_sec
    • Only enabled in Linux hosts with the DPKG package manager (e.g. Debian, Ubuntu).
    • Default value: 30
  • Facter: facter_interval_sec
    • Only enabled in Linux hosts with Puppet's Facter installed.
    • Default value: 30
  • Kernel Modules: kernel_modules_refresh_sec
    • Default value: 10
  • Network interfaces: network_interface_interval_sec
    • Default value: 60
  • RPM: rpm_interval_sec
    • Only enabled in Linux hosts with the RPM package manager (e.g. SUSE, CentOS, Red Hat, Amazon Linux).
    • Default value: 30
  • SELinux: selinux_interval_sec
    • Only enabled in Linux hosts with SELinux.
    • Default value: 30
  • Supervisord: supervisor_interval_sec
    • Only enabled in Linux hosts with Supervisord.
    • Default value: 15
  • Sysctl: sysctl_interval_sec
    • Default value: 60
  • Systemd: systemd_interval_sec
    • Only enabled in Linux hosts with the systemd init system.
    • Default value: 30
  • SysV: sysvinit_interval_sec
    • Only enabled in Linux hosts with SysV init system.
    • Default value: 30
  • Upstart: upstart_interval_sec
    • Only enabled in Linux hosts with Upstart init system.
    • Default value: 30
  • Users: users_refresh_sec
    • Default value: 15
  • SSHD configuration: sshd_config_refresh_sec
    • Default value: 15
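
As a hedged illustration of the list above, a Linux host could report some inventories less often and switch off a plugin it does not need. The values are arbitrary examples; which plugins to relax depends on your hosts:

```yaml
# Illustrative values only, added to newrelic-infra.yml.
dpkg_interval_sec: 300            # default is 30; report DPKG packages every 5 minutes
kernel_modules_refresh_sec: 60    # default is 10
facter_interval_sec: -1           # a negative number disables the Facter plugin
```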

Windows hosts inventory

  • Network interfaces: network_interface_interval_sec
    • Default value: 60
  • Windows services: windows_services_refresh_sec
    • Default value: 30
  • Windows updates: windows_updates_refresh_sec
    • Default value: 60
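
The same approach applies to Windows hosts. A minimal sketch, with values chosen only for illustration:

```yaml
# Illustrative values only, added to newrelic-infra.yml on a Windows host.
windows_services_refresh_sec: 120   # default is 30
windows_updates_refresh_sec: 600    # default is 60
```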

Disable all the inventories

To disable all the inventories at once, set the following configuration option:

disable_all_plugins: true

To selectively enable some inventories, explicitly set the properties from the previous sections. For example, the following configuration disables all the inventories except the Network Interfaces plugin, which reports every 120 seconds:

disable_all_plugins: true
network_interface_interval_sec: 120

For more help

Recommendations for learning more: