Alerting in New Relic

New Relic provides alert notifications by using alert policies for applications monitored by New Relic APM, servers monitored by New Relic Servers, and key transactions. New Relic also provides downtime alerts for applications and servers. Alert notifications can be set up for all New Relic account types, including Lite.

The End user Apdex < value appears on the Browser Settings dashboard for apps monitored by New Relic Browser. However, the app must also be monitored by New Relic APM in order to set up an alert policy and notifications for it.

Note: You can also receive alert notifications for New Relic Mobile and New Relic Plugins. However, alert policies currently are not supported for these products.

Alert notifications

You can modify alert notifications by defining alert policies or by using the alert API settings in New Relic's REST API. You can have alerts sent to:

  • PagerDuty
  • Campfire
  • HipChat
  • Webhooks
  • Email accounts
  • The New Relic iOS and Android apps

For more information about how to set up these notification types, see the Examples.

Event types

New Relic defines three grades of problem severity:

  • Caution: An event has occurred that should be monitored before it becomes critical.
  • Critical (Send alerts): A critical event requiring attention has occurred.
  • Downtime: The server is not reporting to New Relic, or the application could not be pinged successfully by New Relic for more than the period of time specified in the alert policy.

Problems and other events (such as deployments) appear in the app's or server's Recent Events list. The status indicator for the associated application or server also changes color on its dashboard:

  • Green: The server or application is fine.
  • Yellow: A Caution calls attention to some non-critical issues.
  • Red: A Critical alert indicates something is wrong, or that the New Relic agent is unable to communicate with the collector.
  • Gray: No data is being reported for the server at this time.

Conditions

With alert policies, you can set conditions for Critical and Caution alerts on several metrics.

Metrics are evaluated in real time using a recent time window and a moving average. When a monitored metric crosses a threshold condition for a set period of time, New Relic creates a problem event and sends the corresponding alert notifications, as applicable. For more information, see Minimum throughput for alerts.
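The evaluation described above can be illustrated with a minimal Python sketch. It is not New Relic's implementation: the class name, the parameter names, and the choice to count consecutive breached evaluations as the "set period of time" are all assumptions made for illustration.

```python
from collections import deque


class ThresholdCondition:
    """Hypothetical model of a threshold condition (illustration only)."""

    def __init__(self, threshold, window_size, required_breaches, above=True):
        self.threshold = threshold                  # value the average is compared to
        self.above = above                          # True: breach when average > threshold
        self.required_breaches = required_breaches  # evaluations the breach must last
        self.samples = deque(maxlen=window_size)    # recent time window
        self.consecutive = 0                        # current run of breached evaluations

    def observe(self, value):
        """Record a sample; return True once the breach has lasted long enough."""
        self.samples.append(value)
        average = sum(self.samples) / len(self.samples)  # moving average
        if self.above:
            breached = average > self.threshold
        else:
            breached = average < self.threshold
        self.consecutive = self.consecutive + 1 if breached else 0
        return self.consecutive >= self.required_breaches


# Usage: CPU percentage samples against a hypothetical 85% threshold.
condition = ThresholdCondition(threshold=85, window_size=3, required_breaches=2)
results = [condition.observe(v) for v in [80, 90, 95, 95, 50]]
```

Averaging over a window keeps a single spike from opening a problem event, and requiring the breach to persist models the "how long" part of the condition.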

You can also select a notification level. This allows you to control how often New Relic sends alert notifications through a specific channel when the alert policy passes its threshold. For more information, see Alert notification levels.

Factors to consider when setting a threshold condition include:

  • Does the alert report above or below the threshold?
  • How long can the threshold be exceeded before reporting?

Recommendation: Before adding a new app, server, or key transaction to an existing alert policy, evaluate whether the policy's alert thresholds are appropriate, or create a new policy. A policy triggers alerts based on conditions after you add an app, server, or key transaction to it.

For example, suppose you are monitoring a new server that is already 90% full, and you add it to a server alert policy that triggers at 85% full. New Relic will not trigger an alert, because the new server's metric is already above the threshold. The metric must cross the threshold to trigger an alert, and it did so before monitoring began. In this situation, you may want to create a new alert policy for the new server.
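The difference between being above a threshold and crossing it can be sketched in a few lines of Python. This is a simplified illustration, not New Relic's logic: the function name is hypothetical, and it ignores the time window and duration that a real condition also applies.

```python
def crossing_events(readings, threshold):
    """Return the indices where readings go from at-or-below to above threshold."""
    events = []
    previous = None
    for index, value in enumerate(readings):
        # A crossing needs an earlier reading that was not above the threshold.
        if previous is not None and previous <= threshold < value:
            events.append(index)
        previous = value
    return events


# Server already 90% full when monitoring begins: no crossing is ever observed.
already_full = crossing_events([90, 92, 95], threshold=85)

# Server that fills up while monitored: the crossing is seen at index 1.
fills_up = crossing_events([80, 90, 95], threshold=85)
```

In this model the first server produces no event because the first observed reading is already above 85, which mirrors the situation described in the example.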

Alert timing

New notifications are created for a policy only when there are no currently open incidents for that policy. Downtime notifications will still be sent if a critical event is open for that policy.

Notification sequence by problem severity

Downtime

Downtime alerts are sent after a server has stopped sending data to New Relic or an application has not responded to a pinger for a period of time as defined in the alert policy.

A recovery notice is sent when all downtimes in a policy clear (pinger can reach the apps, or the servers start reporting to New Relic).

A final closing notification is sent when all critical problems have been closed for 5 minutes.

Critical events (Send alerts)

Critical events are triggered when a threshold condition has been exceeded for a period of time as defined in the alert policy. When an alert is active, the associated status indicator bar changes to red on your dashboard.

When all Critical events are closed, the alert is also considered closed. New Relic sends a final closing notification when the alert has been closed for 5 minutes.

Caution events

Caution events are triggered when a threshold condition has been exceeded for a period of time as defined in the alert policy. The associated status indicator changes to yellow on your dashboard.

Caution events by themselves will not cause alert notifications.
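The timing rules in this section can be summarized as a small decision function. The Python sketch below is one reading of those rules, not New Relic's implementation. All names are hypothetical, and the handling of a Downtime event that arrives while another Downtime incident is already open is an assumption, since the text does not spell that case out.

```python
from enum import Enum

CLOSE_DELAY_SECONDS = 5 * 60  # final closing notification after 5 minutes closed


class Severity(Enum):
    CAUTION = "caution"
    CRITICAL = "critical"
    DOWNTIME = "downtime"


def should_notify(severity, open_incidents):
    """Decide whether a new event produces a notification.

    open_incidents is the set of Severity values (CRITICAL or DOWNTIME)
    currently open for the policy.
    """
    if severity is Severity.CAUTION:
        return False  # Caution events alone never notify
    if severity is Severity.DOWNTIME:
        # Still sent while a Critical event is open.
        # Assumption: suppressed if a Downtime incident is already open.
        return Severity.DOWNTIME not in open_incidents
    return not open_incidents  # Critical: only when nothing is open


def closing_notification_due(last_closed_at, now, any_open):
    """True once every problem has stayed closed for the full delay (seconds)."""
    return not any_open and (now - last_closed_at) >= CLOSE_DELAY_SECONDS
```

For example, `should_notify(Severity.DOWNTIME, {Severity.CRITICAL})` returns `True`, while `should_notify(Severity.CRITICAL, {Severity.CRITICAL})` returns `False`.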

Notification sequence by problem type

Server problems (CPU, disk, memory)

New Relic sends one alert for an alert policy. If several problems occur simultaneously on multiple servers, New Relic sends one alert when the first problem occurs. New Relic also sends one notice of recovery when all critical server problems end.

Application problems (Apdex and error rate thresholds, application downtime)

New Relic sends individual alerts for the associated application alert policy. When all critical problems for an application end, New Relic sends a final, closing notification.

Key transaction problems (Apdex and error rate thresholds)

New Relic sends individual alerts for the associated key transaction alert policy. When all critical problems end, New Relic sends a final, closing notification.

Testing alerts

To test your key transaction alert conditions:

  1. Adjust your alert settings so that all status indicators turn green.
  2. Wait 15 minutes, and then view the Recent Events list to verify there are no open incidents.
  3. Lower one of the settings to a value that should cause an incident, and wait the appropriate interval.
  4. Look for a new incident on the Recent Events list and for any alert notifications you have set up (such as email).

For more help

If you need additional help, get support at support.newrelic.com.