Create NRQL alert conditions

You can use NRQL queries to create alert conditions. Once you've defined your signal, you can define warning and critical threshold levels, which determine when an alert violation is created.

Read on to learn more about how to do this.

A screenshot of an example NRQL condition and generated results.

Go to one.newrelic.com, click Alerts & AI, in the left sidebar click Policies, select a policy, then Add a condition. Click NRQL, and then Next, define thresholds.

Tip

For more information on key concepts relating to NRQL alert conditions and streaming alerts, see Streaming alerts: key terms and concepts.

Create a NRQL alert condition

To create a NRQL alert condition for a policy:

  • On one.newrelic.com, in the header click Alerts & AI, then in the left sidebar click Policies.
  • Select an existing policy or click New alert policy to create a new policy.
  • Click Add a condition.
  • Under Select a product click NRQL, and then click Next, define thresholds.

NRQL alert syntax

Here's the basic syntax for creating all NRQL alert conditions. The FACET clause is required for Outlier threshold types, optional for Static, and not allowed for Baseline.

SELECT function(attribute)
FROM Event
WHERE attribute [comparison] [AND|OR ...]

Clause

Notes

SELECT function(attribute)

Required

Supported functions that return numbers include:

  • apdex

  • average

  • count

  • latest

  • max

  • min

  • percentage

  • percentile

  • sum

  • uniqueCount

    Tip

    If you use the percentile aggregator in a faceted alert condition with many facets, this may cause the following error to appear:

    An error occurred while fetching chart data.

    If you see this error, use average instead.

FROM data type

Required

Only one data type can be targeted.

Supported data types:

  • Event
  • Metric (RAW data points will be returned)

WHERE attribute [comparison] [AND|OR ...]

Use the WHERE clause to specify a series of one or more conditions. All the operators are supported.

FACET attribute

Required for outlier conditions, but not baseline or static

Include an optional FACET clause in your NRQL syntax depending on the threshold type: static, baseline, or outlier.

Use the FACET clause to separate your results by attribute and alert on each attribute independently. No LIMIT clause is allowed, but all queries will receive the maximum number of facets possible. Faceted queries can return a maximum of 5000 values for static conditions and a maximum of 500 values for outlier conditions.

Important

If the query returns more than the maximum number of values, the alert condition can't be created. If you create the condition and the query later returns more than this number, the alert will fail. Modify your query so that it returns fewer values.
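The FACET rules above can be written as a small pre-flight check. This is a minimal sketch, not part of any New Relic library: the function name `check_facet_rules` and the `FACET_LIMITS` dictionary are hypothetical, and only the rules themselves (FACET required for outlier, not allowed for baseline, 5000 and 500 value limits) come from the text.

```python
# Facet limits quoted in the text above; the dictionary layout is an assumption.
FACET_LIMITS = {"static": 5000, "outlier": 500}


def check_facet_rules(threshold_type, has_facet, facet_count=0):
    """Return a list of problems; an empty list means the query shape is acceptable."""
    kind = threshold_type.lower()
    if kind not in ("static", "baseline", "outlier"):
        return ["unknown threshold type: " + threshold_type]

    problems = []
    if kind == "outlier" and not has_facet:
        problems.append("outlier conditions require a FACET clause")
    if kind == "baseline" and has_facet:
        problems.append("baseline conditions do not allow a FACET clause")

    # Only static and outlier conditions have a documented facet limit.
    limit = FACET_LIMITS.get(kind)
    if has_facet and limit is not None and facet_count > limit:
        problems.append(
            "%d facet values exceeds the %s maximum of %d" % (facet_count, kind, limit)
        )
    return problems


if __name__ == "__main__":
    print(check_facet_rules("outlier", has_facet=False))
    print(check_facet_rules("static", has_facet=True, facet_count=5001))
```

A check like this could run before a condition is submitted, so that a query that would be rejected is caught early.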

Reformatting incompatible NRQL

Some elements of NRQL used in charts don’t make sense in the streaming context of alerts. Here’s a list of the most common incompatible elements and suggestions for reformatting a NRQL alert query to achieve the same effect.

Element

Notes

SINCE and UNTIL

Example:

SELECT percentile(largestContentfulPaint, 75) FROM PageViewTiming WHERE (appId = 837807) SINCE yesterday

NRQL Alerting produces a never-ending stream of windowed query results, so the SINCE and UNTIL keywords to scope the query to a point in time are not compatible. As a convenience, we automatically strip SINCE and UNTIL from a query when creating a NRQL Alert Condition from the context of a chart.

TIMESERIES

In NRQL queries, the TIMESERIES clause is used to return data as a time series broken out by a specified period of time.

For NRQL alerts, the equivalent property of a signal is the aggregation window.

histogram()

The histogram() aggregation function is used to generate histograms.

histogram() is not compatible with NRQL alerting because histogram aggregations cannot be formatted as a time series. To create an alert from a portion of a histogram (for example, the 95th percentile), use the percentile() aggregation function.

Multiple Aggregation Functions

Each alert condition can only target a single aggregated stream of data. To alert on multiple streams simultaneously, you’ll need to decompose them into individual conditions within the same policy.

Original Query:

SELECT count(foo), average(bar), max(baz) FROM Transaction

Decomposed:

SELECT count(foo) FROM Transaction
SELECT average(bar) FROM Transaction
SELECT max(baz) FROM Transaction

COMPARE WITH

The COMPARE WITH clause is used to compare the values for two different time ranges. This type of query is incompatible with NRQL alerting. We recommend using a Baseline Alert Condition to dynamically detect deviations for a particular signal.

SLIDE BY

The SLIDE BY clause supports a feature known as sliding windows. With sliding windows, SLIDE BY data is gathered into "windows" of time that overlap with each other. These windows can help to smooth out line graphs with a lot of variation in cases where the rolling aggregate (such as a rolling mean) is more important than aggregates from narrow windows of time.

Sliding windows are not currently supported in NRQL alerts.

LIMIT

In NRQL queries, the LIMIT clause is used to control the amount of data a query returns, either the maximum number of facet values returned by FACET queries or the maximum number of items returned by SELECT * queries.

LIMIT is not compatible with NRQL alerting: evaluation is always performed on the full result set.
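The reformatting rules in the table above can be approximated with a few string helpers. This is a rough sketch built on regular expressions, not a real NRQL parser: the helper names are hypothetical, and it will misfire on edge cases such as keywords inside string literals or a clause that follows SINCE.

```python
import re

# Elements the table above lists as incompatible with NRQL alerting.
INCOMPATIBLE = {
    "TIMESERIES": r"\bTIMESERIES\b",
    "COMPARE WITH": r"\bCOMPARE\s+WITH\b",
    "SLIDE BY": r"\bSLIDE\s+BY\b",
    "LIMIT": r"\bLIMIT\b",
    "histogram()": r"\bhistogram\s*\(",
}


def strip_time_range(query):
    """Drop everything from the first SINCE or UNTIL keyword to the end of the query."""
    pattern = r"\s+(SINCE|UNTIL)\b.*$"
    return re.sub(pattern, "", query, flags=re.IGNORECASE | re.DOTALL).strip()


def find_incompatible(query):
    """Return the names of incompatible elements that appear in the query."""
    return [
        name
        for name, pattern in INCOMPATIBLE.items()
        if re.search(pattern, query, flags=re.IGNORECASE)
    ]


def split_top_level(select_list):
    """Split on commas that are not nested inside parentheses."""
    parts, current, depth = [], [], 0
    for ch in select_list:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return parts


def decompose(query):
    """Turn one query with several aggregation functions into one query per function."""
    match = re.match(
        r"\s*SELECT\s+(.*?)\s+(FROM\s+.*)$", query, flags=re.IGNORECASE | re.DOTALL
    )
    if not match:
        raise ValueError("expected a query of the form SELECT ... FROM ...")
    select_list, rest = match.groups()
    return ["SELECT %s %s" % (part, rest) for part in split_top_level(select_list)]


if __name__ == "__main__":
    for q in decompose("SELECT count(foo), average(bar), max(baz) FROM Transaction"):
        print(q)
```

Each decomposed query would then become its own condition within the same policy, as described above.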

NRQL alert threshold examples

Here are some common use cases for NRQL alert conditions. These queries will work for static and baseline threshold types. The outlier threshold type will require additional FACET clauses.

Alert conditions and query order of operations

By default, the aggregation window is 1 minute, but you can change the window to suit your needs. Whatever the aggregation window, New Relic will collect data for that window using the function in the NRQL alert condition’s query. The query is parsed and executed by our systems in the following order:

  1. FROM clause – which event type needs to be grabbed?
  2. WHERE clause – what can be filtered out?
  3. SELECT clause – what information needs to be returned from the now-filtered data set?

Example: null value returned

Let's say this is your alert condition query:

SELECT count(*) FROM SyntheticCheck WHERE monitorName = 'My Cool Monitor' AND result = 'FAILURE'

If there are no failures for the aggregation window:

  1. The system will execute the FROM clause by grabbing all SyntheticCheck events on your account.
  2. Then it will execute the WHERE clause to filter through those events by looking only for the ones that match the monitor name and result specified.
  3. If there are still events left to scan through after completing the FROM and WHERE operations, the SELECT clause will be executed. If there are no remaining events, the SELECT clause will not be executed.

This means that aggregators like count() and uniqueCount() will never return a zero value. When there is a count of 0, the SELECT clause is ignored and no data is returned, resulting in a value of NULL.

Example: zero value returned

If you have a data source delivering legitimate numeric zeroes, the query will return zero values and not null values.

Let's say this is your alert condition query, and that MyCoolAttribute is an attribute that can sometimes return a zero value.

SELECT average(MyCoolAttribute) FROM MyCoolEvent

If, in the aggregation window being evaluated, there's at least one instance of MyCoolEvent and the average value of all MyCoolAttribute attributes from that window is equal to zero, then a 0 value will be returned. If there are no MyCoolEvent events during that window, then a NULL will be returned due to the order of operations.
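The two examples above can be reproduced with a toy model of the FROM, WHERE, SELECT order. This is an illustrative sketch only: events are plain dictionaries, the `eventType` key and the function name `evaluate_window` are assumptions made for the example, and Python's `None` stands in for NULL.

```python
from statistics import mean


def evaluate_window(events, event_type, where, select):
    """Apply FROM, then WHERE, then SELECT to the events of one aggregation window.

    Returns None (standing in for NULL) when no events survive the filter,
    because SELECT is never run on an empty set.
    """
    from_stage = [e for e in events if e["eventType"] == event_type]  # 1. FROM
    where_stage = [e for e in from_stage if where(e)]                 # 2. WHERE
    if not where_stage:
        return None                                                   # SELECT skipped
    return select(where_stage)                                        # 3. SELECT


# One aggregation window of made-up events.
window = [
    {"eventType": "SyntheticCheck", "monitorName": "My Cool Monitor", "result": "SUCCESS"},
    {"eventType": "MyCoolEvent", "MyCoolAttribute": 0},
    {"eventType": "MyCoolEvent", "MyCoolAttribute": 0},
]

# count(*) of failures: no event matches, so the result is None rather than 0.
failures = evaluate_window(
    window,
    "SyntheticCheck",
    lambda e: e["monitorName"] == "My Cool Monitor" and e["result"] == "FAILURE",
    len,
)

# average(MyCoolAttribute): events exist and their values are zero, so the result is 0.
average = evaluate_window(
    window,
    "MyCoolEvent",
    lambda e: True,
    lambda rows: mean(r["MyCoolAttribute"] for r in rows),
)

if __name__ == "__main__":
    print(failures, average)
```

The distinction matters when setting thresholds: a condition that waits for a count of 0 never sees one, while a condition on a genuinely zero-valued attribute does.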

Tip

For more information about this topic, you can check out our blog post on troubleshooting for zero versus null values.

Tip

You can determine how null values will be handled by adjusting loss of signal and gap filling settings in the Alert Conditions UI.

Nested aggregation NRQL alerts

Nested aggregation queries are a powerful way to query your data. However, they have a few restrictions that are important to note.

NRQL condition creation tips

Here are some tips for creating and using a NRQL condition:

Topic

Tips

Condition threshold types

NRQL condition threshold types include static, baseline, and outlier.

Create a description

For NRQL conditions, you can create a custom description to add to each violation. Descriptions can be enhanced with variable substitution based on metadata in the specific violation.

For details, see Description

Query results

Queries must return a number. The condition evaluates the returned number against the thresholds you've set.

Time period

As with all alert conditions, NRQL conditions evaluate one single minute at a time. The implicit SINCE ... UNTIL clause specifying which minute to evaluate is controlled by your Evaluation offset setting. Since very recent data may be incomplete, you may want to query data from 3 minutes ago or longer, especially for:

  • Applications that run on multiple hosts.

  • SyntheticCheck data: Timeouts can take 3 minutes, so 5 minutes or more is recommended.

    Also, if a query will generate intermittent data, consider using the sum of query results option.

Lost signal threshold (loss of signal detection)

You can use loss of signal detection to alert on when your data (a telemetry signal) should be considered lost. A signal loss can indicate that a service or entity is no longer online or that a periodic job failed to run. You can also use this to make sure that violations for sporadic data, such as error counts, are closed when no signal is coming in.

Tip

To learn more about signal loss and how to request access to it, see this announcement.

Advanced signal settings

These settings give you options for better handling continuous, streaming data signals that may sometimes be missing. These settings include the aggregation window, the evaluation offset, and an option for filling data gaps. For more on using these, see Advanced signal settings.

Condition settings

Use the Condition settings to:

Limits on conditions

See the maximum values.

Health status

NRQL alert conditions don't affect an entity's health status display.

Examples

For more information, see:

Alert threshold types

When you create a NRQL alert, you can choose from different types of thresholds:

NRQL alert threshold types

Description

Static

This is the simplest type of NRQL threshold. It allows you to create a condition based on a NRQL query that returns a numeric value.

Optional: Include a FACET clause.

Baseline (Dynamic)

Uses a self-adjusting condition based on the past behavior of the monitored values. Uses the same NRQL query form as the static type, except you can't use a FACET clause.

Outlier

Looks for group behavior and values that are outliers from those groups. Uses the same NRQL query form as the static type, but requires a FACET clause.

Sum of query results (limited or intermittent data)

Important

Available only for static (basic) threshold types.

If a query returns intermittent or limited data, it may be difficult to set a meaningful threshold. Missing or limited data will sometimes generate false positives or false negatives. You can use loss of signal, aggregation duration, and gap filling settings to minimize these false notifications.

To avoid this problem when using the static threshold type, you can set the selector to sum of query results. This lets you set the alert on an aggregated sum instead of a value from a single harvest cycle. Up to two hours of one-minute data checks can be aggregated. The duration you select determines the width of the rolling sum and the preview chart will update accordingly.
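As a rough illustration of the rolling sum described above, the sketch below sums the most recent one-minute values over a chosen duration. The function name `rolling_sums` is hypothetical, and treating a minute with no data as 0 is an assumption of this example rather than documented behavior.

```python
from collections import deque

MAX_MINUTES = 120  # "up to two hours of one-minute data checks"


def rolling_sums(minute_values, duration_minutes):
    """Yield the sum of the most recent `duration_minutes` one-minute values.

    A value of None (no data for that minute) is counted as 0 in this sketch.
    """
    if not 1 <= duration_minutes <= MAX_MINUTES:
        raise ValueError("duration must be between 1 and %d minutes" % MAX_MINUTES)
    window = deque(maxlen=duration_minutes)  # old minutes fall off automatically
    for value in minute_values:
        window.append(0 if value is None else value)
        yield sum(window)


if __name__ == "__main__":
    # Intermittent error counts, one entry per minute.
    print(list(rolling_sums([0, None, 3, None, 2, 0], duration_minutes=3)))
```

With a three-minute duration, the single spike of 3 stays visible for three consecutive evaluations instead of one, which makes a threshold on sporadic data easier to set.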

Set the loss of signal threshold

Loss of signal occurs when no data matches the NRQL condition over a specific period of time. You can set your loss of signal threshold duration and also what happens when the threshold is crossed.

signal-loss-ui.png

Go to one.newrelic.com, click Alerts & AI, in the left sidebar click Policies, select a policy, then Add a condition. Loss of signal is only available for NRQL conditions.

You may also manage these settings using the GraphQL API (recommended), or the REST API. Go here for specific GraphQL API examples.

Loss of signal settings:

Loss of signal settings include a time duration and two possible actions.

  • Signal loss expiration time
    • UI label: Signal is lost after:
    • GraphQL Node: expiration.expirationDuration
    • Expiration duration is a timer that starts and resets when we receive a data point in the streaming alerts pipeline. If we don't receive another data point before your 'expiration time' expires, we consider that signal to be lost. This can be because no data is being sent to New Relic or the WHERE clause of your NRQL query is filtering that data out before it is streamed to the alerts pipeline.
    • The loss of signal expiration time is independent of the threshold duration and triggers as soon as the timer expires.
    • The maximum expiration duration is 48 hours. This is helpful when monitoring for the execution of infrequent jobs. The minimum is 30 seconds, but we recommend using at least 3-5 minutes.
  • Loss of signal actions: Once a signal is considered lost, you can close open violations, open new violations, or both.
    • Close all current open violations: This closes all open violations that are related to a specific signal. It won't necessarily close all violations for a condition. If you're alerting on an ephemeral service, or on a sporadic signal, you'll want to choose this action to ensure that violations are closed properly. The GraphQL node name for this is "closeViolationsOnExpiration".
    • Open new violations: This will open a new violation when the signal is considered lost. These violations will indicate that they are due to a loss of signal. Based on your incident preferences, this should trigger a notification. The GraphQL node name for this is "openViolationOnExpiration".
    • When you enable both actions, we'll close all open violations first, and then open a new violation for loss of signal.
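The expiration timer described above can be modeled in a few lines. This is a simplified sketch under stated assumptions: times are plain seconds, the function names are hypothetical, and the real streaming pipeline is not reproduced here. Only the 30-second and 48-hour bounds, the restart-on-every-data-point behavior, and the close-then-open ordering come from the text.

```python
MIN_EXPIRATION_S = 30            # documented minimum
MAX_EXPIRATION_S = 48 * 60 * 60  # documented maximum (48 hours)


def signal_loss_times(data_point_times, expiration_s, end_time):
    """Return the times (in seconds) at which the expiration timer runs out.

    data_point_times: sorted arrival times of data points, in seconds.
    end_time: the moment observation stops.
    """
    if not MIN_EXPIRATION_S <= expiration_s <= MAX_EXPIRATION_S:
        raise ValueError("expiration duration must be between 30 seconds and 48 hours")
    arrivals = list(data_point_times)
    losses = []
    for current, following in zip(arrivals, arrivals[1:] + [end_time]):
        deadline = current + expiration_s  # the timer restarts on every data point
        if following > deadline:
            losses.append(deadline)
    return losses


def loss_actions(close_violations_on_expiration, open_violation_on_expiration):
    """List the actions taken when a signal is lost, in the order described above."""
    actions = []
    if close_violations_on_expiration:
        actions.append("close open violations for this signal")
    if open_violation_on_expiration:
        actions.append("open a loss of signal violation")
    return actions


if __name__ == "__main__":
    # Data arrives at 0 s, 60 s, 120 s, then nothing until 600 s; timer is 5 minutes.
    print(signal_loss_times([0, 60, 120, 600], expiration_s=300, end_time=700))
    print(loss_actions(True, True))
```

In the usage example, the signal is considered lost 300 seconds after the data point at 120 seconds, regardless of any threshold duration configured on the condition.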

To create a NRQL alert configured with loss of signal detection in the UI:

  1. For a policy, when you create a condition, under Select a product, click NRQL, then click Next, define thresholds.
  2. Write a NRQL query that returns the values you want to alert on.
  3. For Threshold type, select Static or Baseline.
  4. Click + Add lost signal threshold, then set the signal expiration duration time in minutes or seconds in the Signal is lost after field.
  5. Choose what you want to happen when the signal is lost. You can check one or both of Close all current open violations and Open new "lost signal" violation. These control how loss of signal violations will be handled for the condition.
  6. Make sure you name your condition before you save it.

Tip

Loss of signal detection doesn't work on NRQL queries that use nested aggregation or sub-queries.

Advanced signal settings

screenshot_advanced_signal_settings.png

When creating a NRQL alert condition, the advanced signal settings give you better control over streaming alert data and help you avoid false alarms.

When creating a NRQL condition, there are several advanced signal settings:

  • Aggregation window
  • Evaluation offset
  • Fill data gaps

To read an explanation of what these settings are and how they relate to each other, see Streaming alerts concepts. Below are instructions and tips on how to configure them.

Aggregation window

You can set the aggregation window duration to choose how long data is accumulated in a streaming time window before it's aggregated. You can set it to anything between one second and 15 minutes. The default is one minute.
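To make the idea of an aggregation window concrete, the sketch below groups data point timestamps into fixed windows and counts them. It is a deliberately simplified model: the function name `aggregate_counts` is hypothetical, and the real streaming pipeline also has to deal with late and out-of-order data, which this example ignores.

```python
from collections import defaultdict

MIN_WINDOW_S = 1         # one second
MAX_WINDOW_S = 15 * 60   # 15 minutes


def aggregate_counts(timestamps_s, window_s=60):
    """Count data points per aggregation window, keyed by window start time in seconds."""
    if not MIN_WINDOW_S <= window_s <= MAX_WINDOW_S:
        raise ValueError("aggregation window must be between 1 second and 15 minutes")
    buckets = defaultdict(int)
    for ts in timestamps_s:
        window_start = int(ts // window_s) * window_s
        buckets[window_start] += 1
    return dict(sorted(buckets.items()))


if __name__ == "__main__":
    print(aggregate_counts([0, 10, 59, 60, 125], window_s=60))
```

A longer window smooths sparse data at the cost of slower detection, because each window must close before its aggregate can be evaluated.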

Tip

Baseline alert condition thresholds don't support editing the aggregation window. They use the 1 minute default.

Evaluation offset

You can adjust the evaluation offset to coordinate our streaming alerting algorithm with your data's latency. If it takes a while for your data to arrive, then you may need to increase the evaluation offset.

The total supported latency is the aggregation window duration multiplied by the evaluation offset. In the example screenshot above, the supported latency is 3 minutes (a 1-minute aggregation window multiplied by an evaluation offset of three windows).
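The latency arithmetic above is simple enough to check in code. The helper names below are hypothetical; the second function just inverts the same formula to suggest an offset for a known data latency.

```python
import math


def supported_latency_seconds(aggregation_window_s, evaluation_offset):
    """Supported latency = aggregation window duration x evaluation offset (in windows)."""
    return aggregation_window_s * evaluation_offset


def offset_for_latency(aggregation_window_s, expected_latency_s):
    """Smallest whole number of windows that covers the expected data latency."""
    return math.ceil(expected_latency_s / aggregation_window_s)


if __name__ == "__main__":
    # The screenshot example: 1-minute windows and an offset of 3 give 180 seconds.
    print(supported_latency_seconds(60, 3))
    # Data that usually arrives 150 seconds late needs an offset of 3 one-minute windows.
    print(offset_for_latency(60, 150))
```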

If the data type comes from an APM language agent and is aggregated from many app instances (for example, Transactions, TransactionErrors, etc.), we recommend using an evaluation offset of 3 with 1 minute aggregation windows.

Important

When creating NRQL conditions for data collected from Infrastructure Cloud Integrations such as AWS CloudWatch or Azure, we recommend that you start with an evaluation offset of 15 minutes, then adjust up or down depending on how long it takes to collect your data.

Fill data gaps

Gap filling lets you customize the values to use when your signals don't have any data. You can fill gaps in your data streams with one of these settings:

  • None: (Default) Choose this if you don't want to take any action on empty aggregation windows. On evaluation, an empty aggregation window will reset the threshold duration timer. For example, if a condition says that all aggregation windows must have data points above the threshold for 5 minutes, and 1 of the 5 aggregation windows is empty, then the condition won't be in violation.
  • Custom static value: Choose this if you'd like to insert a custom static value into the empty aggregation windows before they're evaluated. This option has an additional, required parameter of fillValue (as named in the API) that specifies what static value should be used. This defaults to 0.
  • Last known value: This option inserts the last seen value before evaluation occurs. We maintain the state of the last seen value for 2 hours.
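The three gap-filling options can be compared side by side with a small simulation. This is a sketch under assumptions: the option labels `"none"`, `"static"`, and `"last_value"` and both function names are chosen for the example, `None` marks an empty aggregation window, and the 2-hour retention of the last seen value is not modeled.

```python
def fill_gaps(window_values, fill_option="none", fill_value=0):
    """Return a copy of window_values with empty windows (None) filled per the option."""
    option = fill_option.lower()
    if option not in ("none", "static", "last_value"):
        raise ValueError("unknown fill option: " + fill_option)
    filled, last_seen = [], None
    for value in window_values:
        if value is not None:
            last_seen = value
            filled.append(value)
        elif option == "static":
            filled.append(fill_value)
        elif option == "last_value":
            filled.append(last_seen)  # stays None if nothing has been seen yet
        else:
            filled.append(None)       # "none": leave the window empty
    return filled


def in_violation(values, threshold, windows_required):
    """True if `windows_required` consecutive windows are above the threshold.

    An empty window (None) resets the streak, as described for the None option.
    """
    streak = 0
    for value in values:
        if value is not None and value > threshold:
            streak += 1
            if streak >= windows_required:
                return True
        else:
            streak = 0
    return False


if __name__ == "__main__":
    signal = [9, 9, None, 9, 9]  # five one-minute windows, one of them empty
    print(in_violation(fill_gaps(signal, "none"), threshold=5, windows_required=5))
    print(in_violation(fill_gaps(signal, "last_value"), threshold=5, windows_required=5))
```

With no gap filling, the empty window resets the timer and the condition is not in violation; with the last known value filled in, the same signal violates the threshold for all five windows.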

Options for editing data gap settings:

  • In the NRQL conditions UI, go to Condition settings > Advanced signal settings > fill data gaps with and choose an option.
  • If using our NerdGraph API (preferred), this node is located at: actor : account : alerts : nrqlCondition : signal : fillOption | fillValue
  • If you are using our REST API instead, you can find this setting in the REST API explorer under the "signal" section of the Alert NRQL conditions API.

For more help

If you need more help, check out these support and learning resources:

Copyright © 2021 New Relic Inc.