This feature is currently in beta.
After you create your set of SLIs and SLOs, New Relic will start generating SLI data. The first results will take a few minutes to appear in the New Relic One UI. Find service levels:
- At the top nav bar, under Service Levels in the More menu (which you can customize). Here you can filter the SLIs by entity tags.
- At the previews of those entities that have an SLI defined. You can find them all around the UI. For instance, click on an entity from the Explorer's Navigator view.
- In APM services, at the reports section.
- In any workload that contains the SLI entity, such as an APM service or browser application. If you want to group SLIs under a certain workload, make sure to add the APM service or browser app to an existing workload, or create a new one.
Click on any SLI to open the SLI card, which contains:
- The entity the SLI refers to, and the SLI name.
- Each row represents an SLO with:
- Target and time window.
- Compliance in the SLO period.
- Remaining requests error budget.
Request-based SLOs are determined from SLIs defined as the ratio of the number of good responses to the total number of requests. This means a request-based SLO is met when that ratio meets or exceeds the goal for the SLO compliance period.
- If the SLO row has a green background, you’re doing good for the period. You may have not served successfully 100% of the requests, but you still have some remaining error budget to consume.
- If the SLO row has a yellow background, your error budget is closer to being totally consumed, and you should be more cautious for the rest of the period.
- If the SLO row has a red background, you’ve not reached the target SLO in this period, and you’ve consumed all of your error budget. Be careful if you need to deploy, and plan some work to improve your SLIs. You can click on the SLO to see more data about the entity, such as the golden metrics, the latest deployments, anomalies, and ongoing issues. This data can help you understand when and why you missed the SLO targets.
You can define more than one SLO for the same SLI to check how fast you’re consuming the error budget. For instance, if you have an error budget of 0,1% requests for a whole week, you may not want to consume most of it in a single day, or the SLO will be at risk for the rest of the week.
We provide SLI details for two main purposes:
- For SLO analysis: See in which time ranges the SLO targets were missed.
- For SLI/SLO configuration and fine tuning: Learn how New Relic calculated SLO values.
The SLI card contains the following charts:
These are the key concepts to analyze service levels:
- A valid request is any request that you want to count as meaningful for your SLIs.
- A good response is any response that you consider to provide a good experience (for example, the service responded in less than 2 seconds, providing a good navigation experience for the end user).
- A bad response is any response that you consider to provide a bad experience (like the service responded with a server error, interrupting the user's flow).
This chart shows the total number of valid requests that your service received, broken down by good or bad.
This chart shows the actual throughput of your service, which you can use to see if there’s any correlation between the increase of throughput and bad responses.
It's the proportion of what you consider good responses over time. The line should stay close to 100%, meaning that most requests were served successfully.
It's the ratio of good events (responses) to total events (requests), measured over the SLO compliance period. The closer to 100%, the closer your service is to meet the SLO target over the period. When this percentage goes below the SLO target, the chart will turn red: You need to put more effort in reliability.
The error budget is an alternative way to read the SLO. It indicates what percentage of requests could still have a bad response over the SLO period, without compromising the objective.
As the total amount of tolerated bad responses will vary with the request throughput, New Relic shows the percentage of remaining error budget:
- As long as the remaining error budget is above 25%, you'll see green, and your SLO is good.
- When the error budget goes below 25%, it will turn yellow. This means you’re close to burning the whole budget for the period. You may want to be more careful with new deployments and changes, and plan for some reliability work.
- Once the error budget is completely spent, it will show in red.
The last chart shows two time series: the (SLI attainment over time)[#sli-over-time], and the SLO target. When the SLI value is below the SLO target,your service is missing the SLO. Use this chart to learn in which time ranges your service missed the SLO target.
When an SLO is not compliant, you need to analyze the original data to better understand what the impact is for your customer, with a special focus on what went wrong.
At the Service Levels page, click the ... menu in any SLI and select Analyze. You'll access the query that represents the original NRDB events used to determine the bad responses which calculate the SLI attainment. You can then use the query builder to facet and filter down the unsuccessful responses by user account, client id, requesting source, etc. to better understand the cause and impact of missing your SLIs.