When you monitor service level objectives (SLOs), a high-level compliance score can hide issues that occur only in a particular region, data center, or customer segment. Manually creating a separate service level for each segment is time-consuming and difficult to maintain as your infrastructure grows.
Faceted service levels solve this problem. Group your SLI results by specific attributes (also known as faceting) to break down compliance and error budget, identify root causes, and pinpoint where performance bottlenecks or outages are occurring, all within a single service level definition.
What you can do
With faceted service levels, you can:
- Get granular insights: Identify specific performance issues in particular cells, regions, or environments without creating separate service levels.
- Target resource allocation: Identify underperforming areas to focus engineering efforts.
- Improve troubleshooting: Check whether a compliance drop is global or localized to a specific attribute, such as a faulty data center.
- Compare segments: View compliance and error budget for different attribute combinations side by side.
Use cases
The following scenarios demonstrate how faceted service levels can help:
- Regional performance: An e-commerce platform groups latency by awsRegion to discover which region is underperforming despite global SLO compliance.
- Cell-level troubleshooting: A cloud provider groups availability by cell_id to identify that intermittent outages are confined to one cluster rather than the entire service.
- Environment comparison: A development team groups metrics by environment to compare the stability of production versus staging deployments within the same SLI definition.
How faceting works
When you create a service level, you define two queries: valid events (all meaningful requests) and good/bad events (successful or failed responses). When both queries use the same event type in the FROM clause (for example, both using FROM Transaction), the data shares common attributes like awsRegion, environment, or host. You can then enable faceting to group and break down compliance and error budget by these attribute values.
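The grouping idea can be illustrated with a small Python sketch. This is not how the platform computes service levels internally; it only shows the arithmetic of dividing good events by valid events, first globally and then per attribute value. The sample events, the 0.5-second latency threshold, and the helper names (is_good, compliance, compliance_by_facet) are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical sample events of one event type. Because valid and good
# events come from the same type, they share attributes such as awsRegion.
events = [
    {"awsRegion": "us-east-1", "duration": 0.10},
    {"awsRegion": "us-east-1", "duration": 0.20},
    {"awsRegion": "us-east-1", "duration": 0.15},
    {"awsRegion": "us-east-1", "duration": 0.30},
    {"awsRegion": "eu-west-1", "duration": 0.25},
    {"awsRegion": "eu-west-1", "duration": 0.90},
    {"awsRegion": "eu-west-1", "duration": 1.40},
    {"awsRegion": "eu-west-1", "duration": 0.35},
]


def is_good(event):
    """Assumed 'good event' condition: the request finished in under 0.5 s."""
    return event["duration"] < 0.5


def compliance(valid_events):
    """Percentage of valid events that are also good events."""
    good = sum(1 for e in valid_events if is_good(e))
    return 100.0 * good / len(valid_events)


def compliance_by_facet(valid_events, facet):
    """Group valid events by one attribute and compute compliance per value."""
    groups = defaultdict(list)
    for e in valid_events:
        if facet in e:  # skip events that lack the attribute
            groups[e[facet]].append(e)
    return {value: compliance(group) for value, group in groups.items()}


overall = compliance(events)
by_region = compliance_by_facet(events, "awsRegion")

print(overall)    # single global score
print(by_region)  # one score per awsRegion value
```

In this made-up data set, the global score blends a healthy region with a struggling one, while the per-region breakdown shows which attribute value is responsible for the drop.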