Much of the data stored within New Relic relates to other data: Transaction
and TransactionError
, PageView
and PageAction
, Log
and infrastructure events, and more. You can perform analysis and calculate correlations between these events using subquery joins.
How to write a subquery join
A subquery is a query that is nested inside another query. With subquery joins, you can combine the result of a subquery with the result of its outer query based on a key, allowing for analysis and enrichment across datasets.
A subquery join requires three components: two datasets and a primary key to link the two together.
FROM Event [INNER|LEFT] JOIN (subquery) ON [key =] key SELECT ...
Subquery joins contain simple rules to the syntax:
- The
JOIN
clause must always follow immediately after theFROM
clause. - You can prefix the
JOIN
with the join type.INNER
orLEFT
is optional, and defaults toINNER
when omitted. - Parentheses containing a subquery must immediately follow the
JOIN
clause. - The
ON
clause must immediately follow the subquery and has two forms (more details below).
You can have multiple JOIN
clauses in one query, too. For example, this query uses two JOIN
s within subqueries:
FROM JavaScriptError JOIN ( FROM PageAction JOIN ( FROM PageView SELECT count(*) FACET session as pageViewSession, city LIMIT MAX ) ON session = pageViewSession SELECT count(*) FACET city, currentUrl, session as pageActionSession ) ON session = pageActionSessionSELECT count(*) FACET city, currentUrl, session, errorClass
The image below contains two datasets: the average CPU percentage of infrastructure containers (ProcessSample
) and the average duration of app transactions by container.
Often, data from different sources is correlated. In this case, you can determine if a container's higher CPU usage is causing slower transactions by using the following subquery join:
FROM Transaction JOIN (FROM ProcessSample SELECT average(cpuPercent) AS cpu FACET containerId LIMIT MAX) ON containerIdSELECT average(duration)/latest(cpu) FACET containerId, containerName
With this query, you can see the containers that have a higher average transaction duration given their CPU usage, and investigate outliers to see if there is a bug to fix or optimizations to be made.
Subquery join limitations
Subquery joins have the following limitations:
- The joined subquery will continue to have a default
LIMIT
of 10, with a maximumLIMIT
of 5,000. Note that the outer query'sLIMIT
does not affect the inner query. - The use of
TIMESERIES
in the joined subquery is not supported. If your outer query usesTIMESERIES
, keep in mind that the joined subquery will provide a single result for the full query timespan. - The use of
COMPARE WITH
in the joined subquery is not supported. If your outer query usesCOMPARE WITH
, keep in mind that the joined subquery will provide a single result based on the queries base timespan, and will not provide a separate value for the outer queries compare with timespan. - Like all subqueries, joined subqueries cannot be used in alert conditions.
- While
SELECT
* is supported in the parent query, it is not supported in the joined subquery. - The cardinality of the join is limited to 1:100, meaning a single join key cannot map to more than one hundred rows in the subquery result.
- The
ON
clause only supports equality conditions. - The
JOIN
key cannot be a complex attribute, like a metric value. - We do no coercion of attribute types in the
JOIN
condition. The left side of theJOIN
'sON
condition needs to be the same type as the right side of theON
condition. - Metric wildcards are not supported in the
JOIN
'sON
condition - The subquery cannot be a metric row-wise query.
- The right-hand side of the
JOIN
's ON condition must be an identifier projected by the query. It cannot use a function or mathematical operation. - The joined subquery cannot project a
uniques()
result.
Subquery join examples
Here are some example subquery joins: