Infrastructure alerting examples

New Relic Infrastructure continues to add alerting options to the UI to accommodate your business needs. The Alert type field on New Relic Infrastructure's Settings > Alerts page shows which options you can select to create Infrastructure alert conditions. You can also create an alert condition from any Infrastructure chart by selecting the ellipsis [ellipses icon] icon and then the bell [bell icon] icon.

Examples: Infrastructure pages

Here are some examples of how to create alert conditions from the Infrastructure UI page you are currently viewing. To create an alert condition from any chart, select the ellipsis [ellipses icon] icon and then the bell [bell icon] icon. Infrastructure automatically selects the appropriate Alert type.

Example Problem and solution
High CPU usage

Problem:

Your Ops team monitors a filtered set of host clusters in your eastern region and notices that CPU usage is consistently high.

New Relic Infrastructure alerts tile: System metrics

Solution:

Use the CPU chart on Infrastructure's Hosts page to create an alert condition for system metrics.

Virtual memory capacity

Problem:

Your night shift needs to be alerted when virtual memory for a set of background workers reaches an average of 10 GB for at least two minutes.

New Relic Infrastructure alerts tile: Process metrics

Solution:

Use the Top memory consumers chart on Infrastructure's Processes page to create an alert condition for process metrics.
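One way to read the condition in this example is: the virtual memory averaged across the selected workers stays at or above the threshold for every sample in a two-minute window. The following Python sketch illustrates that reading only. The function name `breaches_for_duration`, the one-sample-per-minute cadence, the sample layout, and the use of binary gigabytes are assumptions made for illustration; this is not how New Relic evaluates conditions internally.

```python
from statistics import mean

GB = 1024 ** 3  # assumption: binary gigabytes

def breaches_for_duration(samples, threshold, duration_minutes):
    """Return True if the per-minute average stays at or above `threshold`
    for the most recent `duration_minutes` consecutive minutes.

    samples: list of per-minute lists, oldest first; each inner list holds
    one virtual memory reading (in bytes) per background worker.
    """
    if len(samples) < duration_minutes:
        return False  # not enough data to cover the window
    recent = samples[-duration_minutes:]
    return all(mean(minute) >= threshold for minute in recent)

# Hypothetical readings for three workers over three minutes.
readings = [
    [8 * GB, 9 * GB, 10 * GB],    # average 9 GB
    [10 * GB, 11 * GB, 12 * GB],  # average 11 GB
    [9 * GB, 10 * GB, 11 * GB],   # average 10 GB
]
print(breaches_for_duration(readings, 10 * GB, 2))
```

With these hypothetical readings, the last two minutes both average at least 10 GB, so the two-minute check succeeds, while a three-minute check would not.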

Limited bandwidth

Problem:

You want to monitor performance based on the average number of errors received or transmitted.

New Relic Infrastructure alerts tile: Network metrics

Solution:

Use the Top bandwidth chart on Infrastructure's Network page to create an alert condition for network metrics.

I/O read and write operations

Problem:

You are testing a new set of hosts in your staging environment, and you want to be notified when their read or write capacity rises above your test threshold level.

New Relic Infrastructure alerts tile: Storage metrics

Solution:

Use the Top I/O operations chart on Infrastructure's Storage page to create an alert condition for storage metrics.

Host not reporting

Problem:

You want to be notified when New Relic has stopped receiving data from an Infrastructure agent.

New Relic Infrastructure alerts tile: Host not reporting

Solution:

From the Hosts, Processes, Network, or Storage pages, create a Host not reporting alert condition.
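Conceptually, a host counts as "not reporting" when its most recent data is older than some allowed period of silence. The sketch below shows that idea in Python. The function name `hosts_not_reporting`, the `last_seen` mapping, and the five-minute window are hypothetical and chosen for illustration; they do not describe New Relic's internal logic.

```python
from datetime import datetime, timedelta

def hosts_not_reporting(last_seen, now, max_silence_minutes):
    """Return the sorted names of hosts whose most recent data is older
    than the allowed silence window."""
    cutoff = now - timedelta(minutes=max_silence_minutes)
    return sorted(host for host, seen in last_seen.items() if seen < cutoff)

# Hypothetical last-seen timestamps for two hosts.
now = datetime(2024, 1, 1, 12, 0)
last_seen = {
    "web-1": datetime(2024, 1, 1, 11, 59),  # reported one minute ago
    "web-2": datetime(2024, 1, 1, 11, 40),  # silent for 20 minutes
}
print(hosts_not_reporting(last_seen, now, 5))
```

Here only `web-2` falls outside the five-minute window, so it is the only host flagged.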

Processes not running as expected

Problem:

  • You want to be notified if any of the processes on your hosts stop reporting to New Relic.

    OR

  • A process you expected to start on a host (such as a new program) is not actually running.

New Relic Infrastructure alerts tile: Process running

Solution:

From the Processes page (or from the Hosts, Network, or Storage pages), create a Process running alert condition.
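Both problems in this example reduce to the same check: for each host you care about, is at least one matching process currently reporting? A minimal Python sketch of that check follows. The sample layout (dictionaries with `host` and `name` keys) and the function name `hosts_missing_process` are assumptions for illustration, not the format of New Relic's process data.

```python
def hosts_missing_process(process_samples, hosts, process_name):
    """Return the sorted hosts in `hosts` that have no reporting process
    named `process_name`."""
    running_on = {s["host"] for s in process_samples if s["name"] == process_name}
    return sorted(set(hosts) - running_on)

# Hypothetical process samples from two hosts.
samples = [
    {"host": "app-1", "name": "worker"},
    {"host": "app-1", "name": "nginx"},
    {"host": "app-2", "name": "nginx"},
]
print(hosts_missing_process(samples, ["app-1", "app-2"], "worker"))
```

In this hypothetical data, `worker` is reporting on `app-1` only, so `app-2` is the host that would trigger the condition.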

Examples: Threshold options

Use the thresholds dropdown for the selected Alert type to further define how you want to be alerted. Here are some examples of the options available.

Integrations providers

With New Relic Infrastructure Integrations, you can create an alert condition from your Integrations page. The options available in the Define thresholds dropdown vary depending on the type of provider selected (CloudFront, DynamoDB, EBS, etc.); for example, bytes, errors, requests, CPU, connections, memory, records, or latency.

Here is an example of the provider thresholds you can select with a CloudFront integration:

New Relic Infrastructure Integrations: Thresholds example

CPU, disk, load average, memory, swap

The System metrics thresholds dropdown allows you to select various criteria for CPU, disk, load average, memory, and swap metrics.

New Relic Infrastructure: System metric thresholds

Byte size

The Network metrics thresholds let you choose the unit that fits the size of your network: you can set the threshold in bytes, KB, MB, GB, or TB.

New Relic Infrastructure alerts: Network byte options
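When comparing thresholds set in different units, it can help to convert them to a common unit first. The following Python sketch converts a value in any of the units listed above to bytes. The helper name `to_bytes` is hypothetical, and the default base of 1024 is an assumption: this page does not say whether the UI treats a KB as 1,000 or 1,024 bytes, so pass `base=1000` if decimal units apply.

```python
UNITS = ("bytes", "KB", "MB", "GB", "TB")

def to_bytes(value, unit, base=1024):
    """Convert `value` expressed in `unit` to bytes.

    base=1024 is an assumption; use base=1000 for decimal units.
    """
    exponent = UNITS.index(unit)  # raises ValueError for an unknown unit
    return value * base ** exponent

print(to_bytes(5, "MB"))             # 5242880 with the assumed binary base
print(to_bytes(5, "MB", base=1000))  # 5000000 with a decimal base
```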
