As part of Applied Intelligence, Incident Intelligence helps you correlate your incidents and reduce noise in your environment. It gives you an overview of all your incidents, their sources, and related events.
To use Incident Intelligence and Applied Intelligence, as well as the rest of our observability platform, join the New Relic family! Sign up to create your free account in only a few seconds. Then ingest up to 100GB of data for free each month. Forever.
After you set up Incident Intelligence, our system will begin finding issues from your data sources.
In the issue feed, you can find an overview of all your issues, along with helpful information about them. You can also click any individual issue for more detail, including its analysis summary, event log, and details about correlated issues.
This screenshot shows an example issue feed, which describes your issues' statuses, correlations, and more.
What's the difference between an issue, incident, and event? In short, these terms are like building blocks. Events are raw data from your sources. Incidents are made up of one or more events. Issues are composed of one or more incidents.
In more detail:
- Events indicate a state change or trigger defined by your monitoring systems. Events contain information about the affected entity and are almost always triggered automatically by the system.
- Incidents are groups of events that describe the "symptoms" of your system over time. These symptoms are detected by your monitoring tools, which evaluate your data streams and events.
- Issues are groups of incidents that describe the underlying problem of your symptoms. When a new incident is created, Incident Intelligence opens an issue and evaluates other open issues for correlations.
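The building-block relationship described above can be sketched as a small data model. This is only an illustration of the hierarchy, not the product's actual schema: the class names, fields, and helper methods are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Event:
    """Raw state change reported by a monitoring source (hypothetical fields)."""
    entity: str
    description: str


@dataclass
class Incident:
    """A symptom over time: a group of one or more events."""
    events: List[Event]

    def __post_init__(self) -> None:
        if not self.events:
            raise ValueError("an incident needs at least one event")


@dataclass
class Issue:
    """The underlying problem: a group of one or more incidents."""
    incidents: List[Incident]

    def __post_init__(self) -> None:
        if not self.incidents:
            raise ValueError("an issue needs at least one incident")

    def entities(self) -> Set[str]:
        # Collect every entity mentioned by any event in any incident.
        return {e.entity for i in self.incidents for e in i.events}

    def event_count(self) -> int:
        # Total number of raw events behind this issue.
        return sum(len(i.events) for i in self.incidents)
```

In this sketch, an issue built from two incidents with three events in total would report three events and the set of distinct entities those events touch.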
To enable Incident Intelligence, follow these four steps. Afterwards, issues should start to appear in your issue feed.
- 1. Configure your environment (one-time).
- 2. Configure sources.
- 3. Configure destinations.
- 4. Configure pathways.
To set up an environment in Incident Intelligence, you need an administrator to select a New Relic account for it. This account should be the one your team is using.
- Who sets the environment? Only administrators, and only for accounts where they have admin privileges.
- Can administrators set more than one environment? They can set one environment per master account and its sub-accounts. More than one can be set if an administrator has privileges for more than one master account.
- Need to change the environment's associated account? Reach out to your account executive or our support team for help.
Incident Intelligence is a cross-account product. This means you can send in data from any New Relic account or external source to correlate events.
After setting up your environment, determine your incident sources. These are your data inputs.
You can get data from any of the following sources:
Now that you've set up your sources, you can configure your destinations. These are the data outputs where you view your incidents.
You can set destinations using any of the following methods:
For examples of destination templates, webhook formats, and JSON schema, see the Incident Intelligence destination examples.
To control when and where you want to receive notifications from your incidents, you can configure pathways.
To add a pathway:
Go to one.newrelic.com and click Alerts & AI. In the left nav, under Incident Intelligence, click Pathways, then click Add a pathway.
In the query builder box, select an attribute, such as
- This can be from the list of all attributes available in PagerDuty incidents and New Relic alerts violations, or you can add your own attributes.
Select a logical operator. For example,
Enter a specific value to complete the logical expression.
- To include all issues created by your sources, select Send everything. (Use this if you only use one PagerDuty service to manage all incidents.)
- To build more complex logic, use the
Select one or more of your destinations.
To edit or remove existing pathways, mouse over the pathway's name on the Pathways page.
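Conceptually, a pathway pairs a logical expression (attribute, operator, value) with one or more destinations, and Send everything matches every issue. The following minimal sketch models that idea in plain Python. It is not New Relic's implementation or API: the `Pathway` class, the operator names `equals` and `contains`, and the attribute and destination names in the usage note are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical operator table; the real query builder may offer different operators.
OPERATORS: Dict[str, Callable[[str, str], bool]] = {
    "equals": lambda actual, expected: actual == expected,
    "contains": lambda actual, expected: expected in actual,
}


@dataclass
class Pathway:
    destinations: List[str]
    attribute: str = ""
    operator: str = "equals"
    value: str = ""
    send_everything: bool = False

    def matches(self, issue: Dict[str, str]) -> bool:
        # "Send everything" ignores the logical expression entirely.
        if self.send_everything:
            return True
        # An issue that lacks the attribute cannot satisfy the expression.
        if self.attribute not in issue:
            return False
        return OPERATORS[self.operator](issue[self.attribute], self.value)


def route(issue: Dict[str, str], pathways: List[Pathway]) -> List[str]:
    """Return the destinations of every matching pathway, without duplicates."""
    selected: List[str] = []
    for pathway in pathways:
        if pathway.matches(issue):
            for destination in pathway.destinations:
                if destination not in selected:
                    selected.append(destination)
    return selected
```

For example, with one pathway that matches a made-up `priority` attribute equal to `critical` and a second pathway set to send everything, a critical issue would be routed to both pathways' destinations, while any other issue would reach only the second.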
The visual timeline, as presented below, shows you a breakdown of:
- The trends taking place
- Which incidents are active
- Which incidents are resolved
- Which incidents are correlated with each other
- Various milestones at different issue levels
In addition, you’ll see a grey line at the top of the timeline. While the rest of the visual timeline shows the changes to each incident, the grey line represents changes to the issue.
Mouse over the grey line to see details of the event.
Finally, mouse over the incident to see information on the location, timing, and level of importance of a specific incident.
This figure shows a particular incident populated on January 11th with a level of Critical.
To view the issues in a text format, click Switch to issue log view in the right-hand corner.
The related activity section aggregates a set of incidents into a single issue, according to a rule-based system.
This section shows the Last Update, the Source location, the State, the number of Related Events, and where it Originated. You can also copy the Payload or click Analyze for more information.
The issue page leads with bottom-line insights so that you can understand the problem and minimize the time needed to resolve it.
The following outlines each of the four sections on the issue page:
- The Analysis Summary: the analysis summary has two machine learning modules, the golden signals and the related components.
- The Suggested Responder: the suggested responder will tell you who to potentially reach out to on your team to solve a specific problem.
- The Impacted Entities: an entity is anything that has data you can monitor. This section focuses on incidents from New Relic sources: it extracts the entities involved and provides a summary. Each entity is unique.
- The Label Sets: label sets focus on incidents that come from third-party sources, such as PagerDuty, AWS CloudWatch, and REST APIs, as well as on NRQL queries. They come in the form of key:value pairs.
These four sections can appear together or separately for an issue, depending on the data in the issue.
In addition to these four sections, you can also take a look at anomaly overview and entity overview directly from the issue feed.
If you hover over an impacted entity application, you’ll see two calls to action: anomaly overview and entity overview. Anomaly overview opens the application's anomalies page. It is only available for applications that are set up for Proactive Detection.
Finally, the issue page contains deployment events.
APM’s Deployment page lists recent deployments and their impact on your end-user and app server Apdex scores, response times, throughput, and errors. This section only appears if New Relic has identified applications under the Impacted Entities that have deployments.
There are two types of deployment events: deployments and related deployments. Click “Show all deployments” to see all your deployment events when they arrive. Click a specific deployment to see its APM deployments page.
Root cause analysis automatically finds potential causes for an issue and its impacted entities. It shows you why open issues occurred, which deployments contributed, and relevant error logs and attributes. With this, you can investigate the problem and reduce your mean time to resolution (MTTR).
Note that root cause analysis is dependent on other New Relic data sources and features. This is why root cause analysis information may not always be present for every issue.
When you select an issue, you may see Root cause analysis information.
Root cause analysis includes three main UI sections:
- Deployment events: When you set up deployments, we provide the deployment nearest to the issue creation. Changes, such as deployments, account for a high percentage of the root causes of incidents, and having that information at hand can help you diagnose and resolve issues.
- Error logs: You can explore millions of log messages with a single click and use manual querying to help you find anomalous patterns and hard-to-find problems.
- Attributes to investigate: We scan the distribution of attributes and surface possible causes by finding significant changes in the distribution. For example, for every single transaction event, we can scan to see if an individual user starts to take up an anomalous share of the requests sent to your app. You can also query interesting attributes.
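To make the idea behind "Attributes to investigate" concrete, here is a deliberately simplified sketch of distribution-shift detection: it compares each attribute value's share of events in a baseline window against its share in the current window and flags large increases. This is not the algorithm Incident Intelligence uses. The function names and the `min_increase` threshold are assumptions chosen for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, List


def shares(values: Iterable[str]) -> Dict[str, float]:
    """Fraction of events carrying each attribute value."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {value: count / total for value, count in counts.items()}


def shifted_values(
    baseline: Iterable[str],
    current: Iterable[str],
    min_increase: float = 0.2,  # hypothetical threshold, in share of events
) -> List[str]:
    """Values whose share grew by at least min_increase, largest growth first."""
    before = shares(baseline)
    after = shares(current)
    growth = {v: s - before.get(v, 0.0) for v, s in after.items()}
    flagged = [v for v, g in growth.items() if g >= min_increase]
    return sorted(flagged, key=lambda v: growth[v], reverse=True)
```

For instance, if four users each account for a quarter of the requests in the baseline window and one of them accounts for 80% in the current window, that user would be flagged as worth investigating.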
To further reduce noise or get improved incident correlation, you can change or customize your decisions. Decisions determine how Incident Intelligence groups incidents together.
To get started, see Decisions.
If you’re using PagerDuty or New Relic alerts violations as your incident notification tools, Incident Intelligence suggests relevant team members who can help resolve your issues.
Incident Intelligence learns from your PagerDuty and alerts violations data to provide suggestions for each new incident. Once you receive a suggestion, you can contact the responder or search for relevant documentation that person may have written.
To get started, enable PagerDuty or alerts violations as a source for Incident Intelligence. Afterwards, you can view the suggestions in two places:
- The issue feed, where you can also provide feedback on the suggestions.
- Directly within PagerDuty (both UI and API). If you’re also using PagerDuty as a destination, the suggestions will appear in your issue notifications payload.
This feature doesn't account for on-call availability at the time of the incident.
In order to train the model, we use the information PagerDuty provides about individuals. We ingest incident information only, not users’ contact details.
New Relic's Incident Intelligence service is performed solely in the United States. By using New Relic Incident Intelligence, you agree that New Relic may move your data to, and process your data in, the US region. This applies whether you store your data in New Relic's US region data center or in our EU region data center.
If you elect to use the Suggested Responder feature and manage EU-based individuals, you may need to confirm that an appropriate data processing agreement is in place.