REST API for Applied Intelligence

The incident events API allows you to report any related activities from your own proprietary incident management systems for advanced correlation and reasoning. The API is designed in a generic way with minimal required attributes and flexibility to keep native keys and values of your incident events.

The more information your system can provide, the better our decision engine works to surface more relevant information to you. Use the attributes field to push your specific information in addition to the required fields.

Authentication

The REST API uses secure token-based authentication and accepts JSON content as input. To get your secure token, go to one.newrelic.com and click Alerts & AI. In the left nav, under Incident Intelligence, click Sources, then click REST API.

Your user account must have permissions to manage Sources to get a secure token.

Batching data

We currently support up to 10 events per API call. To batch events, add each one to the events list in the request body.

A batch must be POSTed as {"events": [{"event_source": "Snap", ...}]} with each event part in a list that is the value of "events", as the following sample shows.

Sample JSON

{
    "events": [{
        "application": "Name of my application",
        "attributes": {
            "alert/description": "Add a description about the alert itself",
            "state": "alarm",
            "application/state": "MAJOR",
            "environment": "prod"
        },
        "event_description": "Add a description about the incident",
        "event_source": "List the application that created the incident",
        "host": "host-name",
        "value": "medium"
    }]
}

Our collector first checks whether the root object contains an events key. If it does, the collector extracts the list and processes each event in turn.
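
The batching rule above can be sketched on the client side as follows. This is a minimal illustration, not an official client: the function name batch_bodies and the sample events are hypothetical, and the only facts taken from this page are the 10-event limit and the {"events": [...]} wrapper.

```python
import json
from typing import Iterator

MAX_EVENTS_PER_CALL = 10  # batch limit stated on this page


def batch_bodies(events: list[dict], size: int = MAX_EVENTS_PER_CALL) -> Iterator[str]:
    """Yield JSON request bodies, each wrapping at most `size` events."""
    for start in range(0, len(events), size):
        # Each body uses the {"events": [...]} shape described above.
        yield json.dumps({"events": events[start:start + size]})


# 23 hypothetical events, used only to show how the list is split.
sample = [
    {
        "event_source": "my-rest-api",
        "host": f"host-{i}",
        "value": "low",
        "event_description": f"Test event {i}",
    }
    for i in range(23)
]
bodies = list(batch_bodies(sample))
```

Each string in bodies would then be sent as the body of its own POST request.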

Default data size limits

  • Batch – up to 10 messages inside the body.

  • We support up to 20 attributes per metric or event.

  • Each string value field size is limited to 1024 characters.

  • Each attribute name field size is limited to 128 characters.

If you need different data size limits, contact us; these parameters can be tuned for you.
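
As a rough pre-flight check, the default limits above could be verified before sending an event. The sketch below is an assumption-laden illustration: limit_violations is a hypothetical helper, and it assumes the 1024-character limit applies to top-level string fields as well as attribute values, which this page does not state explicitly. If your limits have been tuned, adjust the constants.

```python
MAX_ATTRIBUTES = 20      # attributes per event
MAX_VALUE_CHARS = 1024   # characters per string value
MAX_NAME_CHARS = 128     # characters per attribute name


def limit_violations(event: dict) -> list[str]:
    """Return human-readable limit violations for one event (empty if none)."""
    problems = []
    attributes = event.get("attributes", {})
    if len(attributes) > MAX_ATTRIBUTES:
        problems.append(f"{len(attributes)} attributes (limit {MAX_ATTRIBUTES})")

    # Assumption: the string limit covers top-level fields and attribute values alike.
    top_level = [(k, v) for k, v in event.items() if k != "attributes"]
    for name, value in top_level + list(attributes.items()):
        if isinstance(value, str) and len(value) > MAX_VALUE_CHARS:
            problems.append(
                f"value of {name!r} has {len(value)} characters (limit {MAX_VALUE_CHARS})"
            )

    for name in attributes:
        if len(name) > MAX_NAME_CHARS:
            problems.append(
                f"attribute name starting {name[:16]!r} has {len(name)} characters "
                f"(limit {MAX_NAME_CHARS})"
            )
    return problems
```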

Make API calls

On the Incident Intelligence Sources page, click REST API. Choose the correct account and click the clipboard icon to copy the collector URL. Send the secure token in the Authorization: Bearer HTTP header.

Here is an example curl command using this interface:

curl -L -X POST 'https://collectors.signifai.io/v1/incidents' -H 'Authorization: Bearer XXXXXXXXX' -H 'Content-Type: application/json' --data-raw '{"application": "A Unique App Name","attributes": {"alert/description": "The health check end-point is failing for A Unique App Name","annotations/description": "Health check failure","cluster/name": "auan_001","datacenter/name": "US-EAST-1","health_check/entity_name": "a-unique-app-name","service/name": "a-unique-app-name","service/status": "down","state": "alarm","label/namespace": "use labels to control how events aggregate into incidents"},"event_description": "Something in the health check failed for A Unique App Name","event_source": "my-rest-api","value": "critical"}'
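
For readers who prefer Python, the same call can be sketched with the standard library. This is an illustrative equivalent of the curl command, not an official client: build_request is a hypothetical helper, the token is a placeholder, and the URL is the one shown in the curl example (use the collector URL you copied from the Sources page). The request is only constructed here; sending it requires a valid token and network access.

```python
import json
import urllib.request

# Collector URL as shown in the curl example above.
COLLECTOR_URL = "https://collectors.signifai.io/v1/incidents"


def build_request(token: str, event: dict) -> urllib.request.Request:
    """Build a POST request carrying one incident event as JSON."""
    return urllib.request.Request(
        COLLECTOR_URL,
        data=json.dumps(event).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


request = build_request(
    "XXXXXXXXX",  # placeholder token
    {
        "application": "A Unique App Name",
        "attributes": {"service/status": "down", "state": "alarm"},
        "event_description": "Something in the health check failed for A Unique App Name",
        "event_source": "my-rest-api",
        "value": "critical",
    },
)

# To actually send the event:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```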

API specifications

Method: POST

API endpoint: https://collectors.signifai.io/v1/incidents

Description: Sends incident events to Applied Intelligence for processing.

Field Description

event_source

String

REQUIRED. Your own representation for the system or application that is generating the event.

Example: {"event_source": "sensu"}

host|service|application

String

REQUIRED. What generated the event. Can be the associated host or, if a host isn't relevant, a service or an application. Note: only one is required.

Example: {"host": "payments001"}

value

String, Boolean (only true is supported)

REQUIRED. Incident priority. A string value must be one of: critical, high, medium, low.
Use the boolean true to indicate that an event happened.

Example: {"value": "high"}

timestamp

Long

Epoch time in seconds (UTC) at which the event occurred. Negative timestamps are not supported.

Default: the time our server receives the event

Example: {"timestamp": 1591302334}

event_description

String

REQUIRED. Free-text information that describes this event. We recommend describing what happened on which entity. This is used in various places in the UI.

Example: {"event_description": "Response time is > 2 seconds for the last 10min on [Production] Storefront"}

attributes

JSON Object

This is a flat key/value mapping of additional attributes about the incident event. We recommend putting as much information and metadata as you can here.

Any attributes added here will be used to perform better correlations.

Labels are defined within the attributes object. Labels allow you to control how the de-duplication step works for your events. See De-duplication and identification below.

Example:

{"attributes": {
    "affected_cluster": "cluster foo",
    "team": "SRE Infra Team",
    "environment": "production"
}}
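
The field rules in this table can be checked on the client before sending. The sketch below is a hypothetical helper (field_problems is not part of the API) that encodes only what the table states: the required fields, the allowed value strings, the non-negative timestamp, and the flat attributes object. The server may apply further rules not covered here.

```python
ALLOWED_PRIORITIES = {"critical", "high", "medium", "low"}
ENTITY_FIELDS = ("host", "service", "application")  # only one is required


def field_problems(event: dict) -> list[str]:
    """Return a list of problems found in one event (empty if none)."""
    problems = []
    for field in ("event_source", "event_description"):
        if not isinstance(event.get(field), str):
            problems.append(f"{field} is required and must be a string")
    if not any(isinstance(event.get(f), str) for f in ENTITY_FIELDS):
        problems.append("one of host, service, or application is required")

    value = event.get("value")
    if not (value is True or (isinstance(value, str) and value in ALLOWED_PRIORITIES)):
        problems.append("value must be critical, high, medium, low, or true")

    if "timestamp" in event:  # optional; the server defaults to receive time
        ts = event["timestamp"]
        if isinstance(ts, bool) or not isinstance(ts, int) or ts < 0:
            problems.append("timestamp must be non-negative epoch seconds")

    attributes = event.get("attributes", {})
    if not isinstance(attributes, dict) or any(
        isinstance(v, (dict, list)) for v in attributes.values()
    ):
        problems.append("attributes must be a flat key/value object")
    return problems


example = {
    "event_source": "sensu",
    "host": "payments001",
    "value": "high",
    "timestamp": 1591302334,
    "event_description": "Response time is > 2 seconds for the last 10min on [Production] Storefront",
    "attributes": {"team": "SRE Infra Team", "environment": "production"},
}
```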

De-duplication and identification

To better support our system’s advanced de-duplication and correlation capabilities, we recommend supplying a set of labels or an alert ID, as part of the incident’s attributes.

Any attribute prefixed with label/ is treated as a label.

The combination of the labels, event_source, host, application, and service is used to create unique incidents. If labels are not provided, provide an alert/id attribute to be used as a de-duplication key.
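
To reason about which of your events are likely to collapse into the same incident, it can help to model the rule above locally. The sketch below is only a mental model and not the service's actual algorithm: dedup_key is a hypothetical function, it assumes alert/id takes precedence over labels when both are present, and it assumes the identity fields are event_source, host, application, and service.

```python
def dedup_key(event: dict) -> tuple:
    """Approximate the identity of an event for local reasoning only."""
    attributes = event.get("attributes", {})

    # Assumption: an explicit alert/id replaces the label-based identity.
    alert_id = attributes.get("alert/id")
    if alert_id is not None:
        return ("alert/id", alert_id)

    # Otherwise combine every label/* attribute with the identity fields.
    labels = tuple(sorted(
        (name, value)
        for name, value in attributes.items()
        if name.startswith("label/")
    ))
    identity = tuple(
        event.get(field)
        for field in ("event_source", "host", "application", "service")
    )
    return ("labels", labels, identity)


first = {
    "event_source": "sensu",
    "host": "payments001",
    "event_description": "CPU above 90%",
    "attributes": {"label/namespace": "checkout", "team": "SRE"},
}
# Same labels and identity fields, different description and non-label attribute.
second = {
    "event_source": "sensu",
    "host": "payments001",
    "event_description": "CPU still above 90%",
    "attributes": {"label/namespace": "checkout", "team": "Payments"},
}
```

Under this model, first and second produce the same key, so they would be expected to land in the same incident, while changing label/namespace would produce a different key.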

Significant incident attributes (hints)

Some attributes have enhanced capabilities. Sending these allows us to process the data more intelligently.

Attribute name Description
alert/description

Additional details that describe what happened.

alert/id

Alert identification - used to deduplicate incidents.

Every incident that is received with the same alert/id will be aggregated to the same open incident.


Important: Specifying alert/id manually will deactivate New Relic's automatic creation of an alert ID using the defined labels.

alert/metric_name

Metric name that triggered the incident.

alert/policy_id

Policy ID of the alert.

alert/policy_name

Policy name of the alert.

annotations/description

The incident description that will be used for representation in the UI.

application/id

Instrumented application identifier.

application/name

Instrumented application name assigned from the application field in the payload.

availability_zone

Instrumented incident availability zone.

cloud/region

Region hosting the service (for example: us-east-1).

cluster/name

Identify the cluster name impacted.

datacenter/name

Identify the datacenter name impacted.

environment

Environment kind (dev, prod).

health_check/entity_name

The name of the health check that originally triggered the incident, if available.

health_check/id

Health check identification.

health_check/name

Health check name.

health_check/probe_id

ID of the probe used for the health check.

health_check/probe_name

Name of the probe used for the health check.

health_check/type

Health check type.

host/name

Instrumented host's name assigned from the host field in the payload.

instance/name

Instrumented instance name.

instance/type

Instrumented instance type.

label/*

Any attribute prefixed with label/ will be considered as part of the incident de-duplication key and will be used to identify the incident.

organization/name

Organization name (in case there is more than one).

runbook_url

Link to a runbook, if one exists.

service/name

Service name that reported the incident. Assigned from the service field in the payload.

service/status

Instrumented service status.
Can be either up or down.

state

Indicates what state the event is in. Acceptable values are: alarm and ok.

If state is not provided, the default value is alarm, which creates an incident or updates an existing incident with the same labels or alert/id.

If the value is ok, the open incident with the same labels or alert/id is closed.
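
The open and close behavior of state can be illustrated with a pair of events that share an alert/id. This is a minimal sketch under stated assumptions: make_event is a hypothetical helper, and the host, description, and alert/id values are made up for the example.

```python
def make_event(state: str, alert_id: str) -> dict:
    """Build an event that opens ("alarm") or closes ("ok") an incident."""
    if state not in ("alarm", "ok"):
        raise ValueError("state must be 'alarm' or 'ok'")
    return {
        "event_source": "my-rest-api",
        "host": "payments001",
        "value": "high",
        "event_description": f"Disk usage check on payments001 is in state {state}",
        # The shared alert/id ties the closing event to the open incident.
        "attributes": {"alert/id": alert_id, "state": state},
    }


# Hypothetical alert ID; sending the first event opens (or updates) the incident,
# and sending the second closes it.
open_event = make_event("alarm", "disk-full-payments001")
close_event = make_event("ok", "disk-full-payments001")
```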
