Distributed tracing data represents how the performance of entities across your distributed system impact each other. Insights from this data are surfaced with respect to individual entities you view, showing when their performance might be impacted by other traced entities.
The distributed trace insights component surfaces three types of performance signals from related traced entities.
- Call counts: Entities making a significantly increased number of calls through the entity you are viewing. This increase impacts the throughput on your entity.
- Exclusive duration: Entities called down to by the entity you are viewing, which are contributing significantly more latency. The latency contributed by an entity is the wall-clock time that it had one or more processes running but was not making any external or database calls.
- Error rates: Entities called down to by the entity you are viewing, which have more processes ending in errors.
The related trace entity signals view is focused on helping you identify significant performance changes that might be impacting the entity you are viewing. Traced entities are only listed if their impact on performance is relatively significant to the one you are viewing, and if that impact has increased with respect to the selected and previous time ranges. To maintain this focus, other entities that participate in traces with the one you are viewing are not shown here if their performance impact is relatively consistent or smaller.
This initial view shows you the number of traced entities that had a significant performance impact and change, broken down by performance signal type as described above. If no traced entities meet this condition for a given signal type, that signal is shown as green with a count of zero. The non-green badge colors in the related trace entity signals view are meant only to draw attention to signals of interest, and are not indicative of the health status of the entities listed.
When you click on a signal card, a list of entities is displayed with significant changes in performance impact for that signal. If an entity meets this condition for multiple signals it will be listed under each of the respective cards.
When you select a card from the initial view, you're shown a list of entities with a significant performance impact and change for the signal type you selected. Additionally, a control switch will appear above the list that allows you to toggle between the lists of each signal. In the event that you toggle to a signal with zero available insights, you will see an all clear message in place of the list.
Each item in the list contains:
- Call path direction:
Upstream | Downstreamfollowed by the average number of entities (hops) in the call path.
Upstream: Insight data is generated from call paths from the listed entity to your entity.
Downstream: Insight data is generated from call paths that follow calls from your entity and include the listed entity.
- Selected signal insights: Each item displays its performance for the selected signal at the top, including a trend arrow (up or down when data is available), signal value, and percent change of the signal value (when data is available).
- Percent change: The time window for this value is defined by a combination of the time picker and comparison with dropdown values. Percent change will not be shown when previous tracing data is unavailable; this generally occurs when the time window extends past 3 days before the present time.
- Entity health icon: This icon will change status color based on the
Alert Severitythat the listed entity reports.
- Entity: The name of the entity reporting insights that impact your entity for the selected signal.
- Non-selected signals: Each item will also report the signal value and percent change (when available) for the three signals that are not actively selected. Those signals are defined as:
- Count: The number of traced calls to or from the listed entity, when it's in a call path with your entity.
- Error rate: The percent of calls to the listed entity that returned errors (had span errors on their entry spans).
- Duration: The average duration of calls that include your entity and the listed entity in the call path.
- Exclusive: The average exclusive duration of calls that include your entity and the listed entity in the call path.
- Sparkline chart: Includes 2 timeseries of the Selected Signal value, this chart has mouse hover interactions that sync with all timeseries charts on the page. The timeseries in color covers the time window defined by the time picker while the gray timeseries presents previous data (covering the same duration) that is used for comparison.
- View Trace button: Clicking on this button will open a pane to the trace details UI. You will view a trace that contains your entity, the listed entity, and similar values for the Selected Signal. Additionally, the waterfall will render highlighted rows for spans connected to the listed entity.
Lists have an initial limit of 20 entities; in the event that there are more than 20 entities with a significant performance impact a Show More button appears at the bottom of the list.
By default, performance for the selected time in the time picker is compared to immediately preceding time, over the same duration. For example, when viewing the last 30 minutes in the time picker, performance is compared to the 30 minutes just before that time. This also allows you to select a spike in one of the other charts on the page, and see if this correlates with a significant performance impact from a related trace entity.
You may override the default compare with behavior by changing the value of the Compare with dropdown at the top of the APM Summary page. Modifying this selection will update the end of the comparison time window (without changing duration) used to calculate percent change of signal values in the following ways:
- None: The comparison time window will end at the beginning of the time picker window.
- Yesterday: The comparison time window will end one day before the beginning of the time picker window.
- Last week: The comparison time window will end 7 days before the beginning of the time picker window.
If there is no tracing data preserved in the comparison window, the distributed trace insights subheader will not contain a "compared with" clause.
If your entity is not reporting trace data, the entity insights component will present a "no data" view which includes links to our docs and a Guided Install setup.