Best practices for alert policies

Alert policies provide containers for conditions against one or more New Relic monitored resources. They also define who gets notified (through alert channels) when those conditions are violated.

You can structure your alert policies in a number of different ways to fit your organization's needs. This document describes some best practices and examples to help you get the most out of your configuration with New Relic Alerts (alerts.newrelic.com).

You may also want to familiarize yourself with common terminology used with New Relic Alerts.

Define policies for entities or people

When designing your alert policies, consider:

  • The parts of your architecture that need personnel to be responsible for them
  • The individuals who are responsible for one or more parts of your infrastructure

For example, an organization has multiple apps monitored by New Relic APM, New Relic Browser, New Relic Infrastructure, and New Relic Synthetics.

  • Software developers may need alert notifications for both front-end and back-end performance, such as webpage response time and page load JavaScript errors.
  • Operations personnel may need alert notifications for poor back-end performance, such as server memory and load averages.
  • The product owner may need alert notifications for positive front-end performance, such as improved end user Apdex scores or sales being monitored by New Relic Insights.

The more conditions you define, the more effective the incident rollup will be. Key personnel will receive actionable alert notifications for the metrics that matter to them, and overall, the organization will be able to identify and respond to trends or patterns more efficiently.
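As a rough illustration of the example above, the following minimal sketch models policies as containers of conditions with a list of recipients. It is not the New Relic Alerts API: the `Condition` and `Policy` classes, the `policies_for` helper, and the entity and recipient names are all hypothetical and exist only to show the grouping.

```python
from dataclasses import dataclass, field


@dataclass
class Condition:
    """One alert condition: a metric watched on a monitored entity."""
    entity: str
    metric: str


@dataclass
class Policy:
    """A container for conditions plus the people who get notified."""
    name: str
    recipients: list[str]
    conditions: list[Condition] = field(default_factory=list)


def policies_for(recipient: str, policies: list[Policy]) -> list[str]:
    """Return the names of the policies that notify the given recipient."""
    return [p.name for p in policies if recipient in p.recipients]


# Hypothetical policies, one per group of responsible personnel.
policies = [
    Policy("Developers", ["dev-team"], [
        Condition("storefront", "webpage response time"),
        Condition("storefront", "page load JavaScript errors"),
    ]),
    Policy("Operations", ["ops-team"], [
        Condition("prod-servers", "server memory"),
        Condition("prod-servers", "load average"),
    ]),
    Policy("Product", ["product-owner"], [
        Condition("storefront", "end user Apdex score"),
    ]),
]

print(policies_for("ops-team", policies))  # ['Operations']
```

Grouping conditions this way keeps each team's notifications limited to the metrics it is responsible for.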

Decide how many alert notifications you need

The more alert conditions you define, the more incidents can be triggered and monitored. For example, your organization may need an alerting solution to accommodate extensive IT systems. Use New Relic Alerts to create alert policies with multiple conditions for multiple entities (targets) that trigger notifications through one or more notification channels.

On the other hand, your organization may not need an extensive alerting structure. The fewer alert conditions you define, the fewer incidents will be triggered. For example, you could create a simple alert policy with an email notification channel to cover basic alerting scenarios.

Set thresholds for conditions

Set the thresholds for your alert policy's conditions to meaningful levels for your environment. Here are some suggested guidelines.

Alerting thresholds and recommendations:

  • Set threshold levels: Avoid setting thresholds too low. For example, if you set a CPU alerting threshold of 75% for 5 minutes on your production servers, and it routinely goes over that level, this will increase the likelihood of un-actionable alerts or false positives.
  • Experimenting with settings: You do not need to edit files or restart software, so feel free to make quick changes to your threshold levels and adjust as necessary.
  • Adjust settings: Adjust your conditions over time.
      • As you use New Relic products to help you optimize your entity's performance, tighten your New Relic Alerts policy conditions to keep pace with your improved performance.
      • If you are rolling out something that you know will negatively impact your performance for a period of time, loosen your threshold conditions to allow for this.
  • Disable settings: You can disable any alert condition in a policy. This is useful, for example, if you want to continue using other alert conditions for the policy while you experiment with other metrics or thresholds.
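To see why a threshold that is routinely exceeded produces noisy alerts, the sketch below counts how often a series of readings stays above a threshold for a given number of consecutive samples. It is only an approximation of the idea, not how New Relic Alerts evaluates conditions: the `count_violations` function, the one-sample-per-minute assumption, and the CPU readings are all made up for illustration.

```python
def count_violations(samples, threshold, duration):
    """Count how many times `samples` stays above `threshold` for at least
    `duration` consecutive samples (one sample per minute is assumed)."""
    violations = 0
    run = 0  # length of the current streak above the threshold
    for value in samples:
        if value > threshold:
            run += 1
            if run == duration:  # count each streak once, when it first qualifies
                violations += 1
        else:
            run = 0
    return violations


# Hypothetical per-minute CPU percentages for a busy production server.
cpu = [70, 78, 80, 79, 81, 77, 72, 76, 79, 82, 80, 78, 74, 88, 91, 93, 92, 90, 70]

print(count_violations(cpu, threshold=75, duration=5))  # 3
print(count_violations(cpu, threshold=85, duration=5))  # 1
```

With these sample readings, the 75% threshold fires three times for what is normal load on this server, while the 85% threshold fires only for the one sustained spike.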

Select notification channels

With New Relic Alerts, you can create notification channels first and then assign alert policies to them. You can also create alert policies first and then assign notification channels to them. This flexibility allows you to tailor who gets notified, using the method that is most useful to them.

For example, you could:

  • Identify your operations team's HipChat room as a general level of alerting, and use the on-call PagerDuty contact as an after-hours or escalated level of alerting.
  • Create webhooks with customized messages for a variety of situations or personnel.

Avoid sending notifications to people who do not need them. Individuals who are interrupted too often may start to ignore alerts altogether. By tailoring notifications to the most useful channel and policy, you can help the right personnel receive and respond to incidents they care about in a systematic way.
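The first example above (a team chat room for general alerting, with the on-call contact for after-hours or escalated incidents) can be sketched as a small routing function. This only illustrates the decision and is not a New Relic Alerts feature or API: the `channels_for` function, the channel labels, and the 09:00 to 17:00 working hours are assumptions.

```python
from datetime import time

BUSINESS_START = time(9, 0)   # assumed start of working hours
BUSINESS_END = time(17, 0)    # assumed end of working hours


def channels_for(at: time, escalated: bool = False) -> list[str]:
    """Pick notification channels for an incident opened at time `at`."""
    channels = ["hipchat:ops-room"]  # general level: always notify the team room
    after_hours = not (BUSINESS_START <= at < BUSINESS_END)
    if after_hours or escalated:
        # Only interrupt the on-call contact when it is really needed.
        channels.append("pagerduty:on-call")
    return channels


print(channels_for(time(14, 30)))                  # ['hipchat:ops-room']
print(channels_for(time(2, 15)))                   # ['hipchat:ops-room', 'pagerduty:on-call']
print(channels_for(time(14, 30), escalated=True))  # ['hipchat:ops-room', 'pagerduty:on-call']
```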

For more help

Additional documentation resources include:

  • New Relic Alerts (getting started with New Relic Alerts at alerts.newrelic.com and getting the most out of it with the New Relic products you use)
  • Alert policy workflow (overview of how to create an alert policy, including the targets it identifies, the conditions and thresholds that trigger the alert, the notification channels to use, and the incident preference to roll up policy violations)
  • Identify entities without alert policies (checking whether the health status for an app includes a light green bar with an icon)
