This doc will give you a fairly technical explanation of our core data types, their structure, and how they're used in our features. You can use most of our features without needing to understand the underlying data structure. But having a better understanding of this can help you get data into New Relic, understand the data you see in our UI, and query your data.
First, we’ll explain the definition of metrics from a monitoring industry perspective, and then we’ll explain how New Relic handles metrics. For a list of the metrics we collect, see our documentation.
Metrics in the monitoring industry
In the software monitoring industry, a metric means a numeric measurement of an application or system. Metrics are typically reported on a regular schedule.
Two major types of metrics are:
- Aggregated data. For example: a count of events over one minute’s time, or the rate of some event per minute.
- A numeric status at a moment in time. For example: a CPU temperature reading, or a “CPU% used” status.
Metrics are relatively easy to report and store because a single record can represent a range of time. They can also be aggregated more and more over time. For example, per-minute data may be “rolled up” to per-hour aggregations after some amount of time, and eventually may be rolled up to a per-day aggregation. This approach is efficient for long-term data storage.
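The roll-up idea described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular monitoring system implements it: the `roll_up` function, the `(timestamp, count)` record shape, and the sample data are all assumptions made for the example.

```python
from collections import defaultdict

def roll_up(records, bucket_seconds=3600):
    """Aggregate (timestamp, count) records into larger time buckets.

    records: iterable of (unix_timestamp_seconds, count) pairs.
    Returns a dict mapping each bucket's start timestamp to the summed count.
    """
    buckets = defaultdict(int)
    for timestamp, count in records:
        # Align the timestamp to the start of the bucket that contains it.
        bucket_start = timestamp - (timestamp % bucket_seconds)
        buckets[bucket_start] += count
    return dict(buckets)

# Hypothetical data: 120 one-minute records of 5 events each, starting at timestamp 0.
per_minute = [(minute * 60, 5) for minute in range(120)]

per_hour = roll_up(per_minute)
print(per_hour)  # {0: 300, 3600: 300}
```

Two hourly records now stand in for 120 per-minute records. The storage saving is why this approach works well for long-term retention, and the loss of per-minute detail is the tradeoff.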
Metrics are a strong solution for storing data long-term, and understanding trends over time. One potential downside is that it can be difficult to do detailed analysis of older data that has been aggregated over time; when high detail is required about specific important actions, event data can be used.
Metrics at New Relic
Conceptually, "metrics" is a broad, general category. There are various ways New Relic measures and reports metrics but, in practice, when using the New Relic UI, you usually won't have to understand how exactly this happens. In our documentation, we typically will just refer to "metrics," regardless of how that data is reported, unless there's a reason you need to know more (like understanding how to query your data).
Here are some of the ways metrics are reported and stored across the New Relic platform:
In the monitoring industry, "dimensional" metrics refer to metric data that has a variety of attributes (dimensions) attached, such as duration-related attributes (start time, end time), entity ID, region, host, etc. This amount of detail allows for in-depth analysis and querying.
At New Relic, this metric data is attached to the Metric data type and is sent from several sources:
- Some open-source integrations, like our Prometheus and OpenCensus exporters
- Our Telemetry SDKs
- Infrastructure services
- The Metric API (the underlying API used by the above tools)
- The events-to-metrics service
To query this data and see its attributes ("dimensions"), you could use a NRQL query like:
Select * from Metric
As time passes, these metrics are increasingly aggregated into larger time buckets. This is done to optimize your ability to query data over a long period of time.
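To make the idea of dimensions more concrete, here is a small Python sketch of a metric data point that carries attributes, and of filtering on them. The `DimensionalMetric` class, its field names, and the sample values are hypothetical. They illustrate the concept and are not the schema of the Metric data type.

```python
from dataclasses import dataclass, field

@dataclass
class DimensionalMetric:
    name: str
    value: float
    timestamp: int          # start of the measured interval, Unix seconds
    interval_seconds: int   # length of the measured interval
    attributes: dict = field(default_factory=dict)  # the "dimensions"

def filter_by(metrics, **dimensions):
    """Return the metrics whose attributes match every given dimension."""
    return [
        m for m in metrics
        if all(m.attributes.get(key) == value for key, value in dimensions.items())
    ]

# Hypothetical data points reported by three hosts.
metrics = [
    DimensionalMetric("cpu.percent", 41.0, 1700000000, 60, {"host": "web-1", "region": "us-east"}),
    DimensionalMetric("cpu.percent", 77.5, 1700000000, 60, {"host": "web-2", "region": "us-east"}),
    DimensionalMetric("cpu.percent", 12.0, 1700000000, 60, {"host": "web-3", "region": "eu-west"}),
]

us_east = filter_by(metrics, region="us-east")
us_east_avg = sum(m.value for m in us_east) / len(us_east)
print(len(us_east), us_east_avg)  # 2 59.25
```

Because each data point carries its own attributes, the same data can be sliced by host, by region, or by any other dimension without being reported separately for each view.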
New Relic's APM, Browser, and Mobile report and display metrics in a simple data format that we refer to as metric timeslice data. A metric timeslice consists of three parts: a metric name, the segment of time the metric represents (the "timeslice"), and a numeric value (the measurement).
For example: an APM metric timeslice for time spent in a particular transaction is named WebTransaction/URI/foo, and might have a response time of 0.793 for a one-minute time slice from 10:20am to 10:21am. These metrics usually follow a pattern like
Our agents (APM, Browser, and Mobile) can collect thousands of metric timeslices per minute for a variety of performance metrics. For example: error rate, bandwidth usage, and garbage collection time. You also have the ability to create custom metrics.
Metric timeslice data is a lightweight data type and lacks the detail that dimensional metrics have.
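The three-part structure described above (name, timeslice, value) can be sketched as a small Python record. The `MetricTimeslice` class and its field names are assumptions for illustration, and the calendar date is arbitrary; only the metric name, the 10:20 to 10:21 window, and the 0.793 value come from the example above.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class MetricTimeslice:
    name: str        # metric name
    start: datetime  # beginning of the timeslice
    end: datetime    # end of the timeslice
    value: float     # the numeric measurement

timeslice = MetricTimeslice(
    name="WebTransaction/URI/foo",
    start=datetime(2024, 1, 1, 10, 20),  # the date is a placeholder
    end=datetime(2024, 1, 1, 10, 21),
    value=0.793,
)

timeslice_seconds = (timeslice.end - timeslice.start).total_seconds()
print(timeslice.name, timeslice_seconds, timeslice.value)
```

Note that the record has no attribute map: everything known about the measurement is encoded in the name, which is what makes this format lightweight compared with dimensional metrics.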
Ways to explore and query metric timeslice data:
- For APM: metric timeslice data is converted to dimensional metrics and can be queried via NRQL
- Use the REST API
If you want to learn more about the structure of metric timeslice data and see some examples, expand the collapser below.
- Metric timeslice examples
Here are some common metric timeslice data examples, with a focus on common ones used by Ruby applications.
New Relic tracks a variety of metrics on ActiveMerchant transactions which can be used for business analytics as well as performance monitoring. The metrics are summarized by operation as well as by gateway.
| regex | sample metric | legend name |
| --- | --- | --- |
| ActiveMerchant/.* | ActiveMerchant/PayJunctionGateway | |
| ActiveMerchant/gateway/.* | ActiveMerchant/gateway/PayJunctionGateway/purchase | PayJunctionGateway |
| ActiveMerchant/operation/.* | ActiveMerchant/operation/purchase | purchase |
For more information, see the ActiveMerchant website.
ActiveRecord is the Object-Relational Mapping API used by Ruby on Rails applications. The metrics shown here measure the performance of ActiveRecord's
| regex | sample metric | legend name |
| --- | --- | --- |
| ActiveRecord/.*/find | ActiveRecord/User/find | User#find |
| ActiveRecord/.*/save | ActiveRecord/Product/save | Product#save |
For more information, see the API documentation for ActiveRecord.
Apdex is a measure of user satisfaction with page load times.
In Ruby on Rails applications, HTTP requests are handled by Controller actions. A Rails application has many controllers, each of which has one or more actions. When your Rails application receives an HTTP request, that request is routed to the appropriate controller and action, based on the URL of that request. That action then does whatever processing is necessary to generate an HTTP response, which is most often a web page, but could also be a page fragment, an XML document, or any other kind of data that is requested by the client.
The following metrics track the performance of controller actions, regardless of routing, and without taking into account any network or web server effects.
| regex | sample metric | legend name |
| --- | --- | --- |
| Controller/.* | Controller/Users/show | /Users/show |
| Controller/.*/(?!\(other\)).* | Controller/Users/show | /Users/show |
| Controller$ | Controller | All Controller Actions |
| ControllerCPU/ | ControllerCPU/Users/Show | /Users/show |
For more information, see the API documentation for ActionController.
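To see how the Controller patterns listed above select metric names, here is a short Python sketch. It assumes the patterns are applied from the start of the metric name (Python's `re.match`), which may differ from how they are actually evaluated. The `matching_patterns` helper is hypothetical.

```python
import re

# Patterns copied from the Controller metrics listed above.
PATTERNS = [
    r"Controller/.*",
    r"Controller/.*/(?!\(other\)).*",
    r"Controller$",
    r"ControllerCPU/",
]

def matching_patterns(metric_name):
    """Return every pattern that matches from the start of metric_name."""
    return [pattern for pattern in PATTERNS if re.match(pattern, metric_name)]

for name in ("Controller/Users/show", "Controller", "ControllerCPU/Users/Show"):
    print(name, "->", matching_patterns(name))
```

Under this assumption, `Controller/Users/show` matches the first two patterns, the bare `Controller` rollup matches only `Controller$`, and `ControllerCPU/Users/Show` matches only `ControllerCPU/`. The negative lookahead in the second pattern excludes names whose final segment is `(other)`.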
This metric tracks the number of errors or exceptions raised while processing requests.
| regex | sample metric | legend name |
| --- | --- | --- |
| Errors/all | Errors/all | |
External service instrumentation captures calls to out-of-process services such as web services, resources in the cloud, and any other network calls. It does not include other first-class back-end components such as MemCache and the database.
In Ruby applications, we instrument the Net::HTTP library to capture all HTTP services.
| regex | sample metric | legend name |
| --- | --- | --- |
| External/[^/]+/all$ | External/service.example.com/all | All service.example.com calls |
| External/ | External/host.aws.com/Net::Http::POST | Net::Http::POST[host.aws.com] |
| External/all$ | External/all | External Services |
| External/[^/]+/(?!all)/ | External/service.example.com/all | All service.example.com calls |
This metric represents a summary of the throughput and response time of all web requests.
| regex | sample metric | legend name |
| --- | --- | --- |
| ^HttpDispatcher$ | HttpDispatcher | HttpDispatcher |
MemCache is a popular technology that enables applications to access shared memory provided by any number of physical machines as a global cache. Applications that heavily use the database often use MemCache for performance and scalability benefits.
These metrics measure the frequency and response time of calls to MemCache to read and write data from the cache. Response times should be low (less than 5 ms) for a well-performing MemCache deployment.
| regex | sample metric | legend name |
| --- | --- | --- |
| MemCache/.* | MemCache/read | MemCache read operations |
| MemCache/read | MemCache/read | MemCache read operations |
| MemCache/write | MemCache/write | MemCache write operations |
This metric measures the length of the Mongrel queue, which holds pending HTTP requests to be processed by Mongrel. The HTTP Activity graph overlays the maximum queue length for a given period. The value is zero if Mongrel is processing a request but has no other requests waiting in its queue.
When looking at this value across an aggregate cluster of Mongrels, the queue lengths of all Mongrels are added together, showing the sum of all queue lengths.
A Mongrel queue length should be at or near zero; if it is consistently higher, your Rails application is having trouble keeping up with its load.
| regex | sample metric | legend name |
| --- | --- | --- |
| Mongrel/Queue Length | Mongrel/Queue Length | Queue Length |
ActionView is a package in Rails that is used to render the output that is the response to an HTTP request, such as an HTML page or an XML document. The View is rendered by the controller that is handling the request.
If View metrics represent a large portion of your controller's response time, it could mean you are doing a lot of database operations inside the view template itself.
| regex | sample metric | legend name |
| --- | --- | --- |
| View/.* | View/Users/_child.html.erb/Partial | Users/_child.html.erb |
| View/.*/Partial | View/Users/_child.html.erb/Partial | Users/_child.html.erb |
| View/.*/Rendering | View/Users/show.html.erb/Rendering | Users/show.html.erb |
For more information, see the API documentation for ActionView.
A couple examples of this at New Relic:
- Our infrastructure monitoring reports many metrics that are attached to events. For example, we report a ProcessSample event, which has various sample-based metrics attached to it, like CPU percentage. To learn more about infrastructure monitoring data, see Infrastructure data.
- In APM, the Transaction event has several metrics attached to it, including
To learn more about this data and how to query it, see Events.
Metrics can be formed by counting New Relic events, or doing some other mathematical calculation on those events. For example, if you wanted to measure the total number of Transaction events over the last half hour, you might run this NRQL query:
Select count(*) from Transaction since 30 minutes ago
Another example: if you wanted to compute the average response time for your service, you might run a query like:
FROM Transaction SELECT average(duration) SINCE 30 minutes ago
Some New Relic charts are generated with these kinds of queries. The downside of this approach is that there are limits on how many events a monitoring system (including ours) can report. This means that sometimes, for high-throughput systems, the count may not accurately represent the total activity on that system. To learn more about how this can be addressed, see Event limits and sampling.
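The two calculations above, a count and an average over a time window, can be mimicked over a plain list of event records. This Python sketch only illustrates how a metric is derived from events. The event dictionaries, the `events_since` helper, and the sample values are hypothetical, and this is not how NRQL is evaluated.

```python
from statistics import mean

# Hypothetical Transaction events: timestamps in Unix seconds, durations in seconds.
events = [
    {"name": "WebTransaction/URI/foo", "timestamp": 9_900, "duration": 0.5},
    {"name": "WebTransaction/URI/foo", "timestamp": 9_000, "duration": 1.0},
    {"name": "WebTransaction/URI/bar", "timestamp": 8_500, "duration": 1.5},
    {"name": "WebTransaction/URI/bar", "timestamp": 5_000, "duration": 4.0},  # older than the window
]

def events_since(events, now, window_seconds):
    """Return the events whose timestamp falls within the last window_seconds."""
    cutoff = now - window_seconds
    return [event for event in events if event["timestamp"] >= cutoff]

now = 10_000
recent = events_since(events, now, window_seconds=30 * 60)

transaction_count = len(recent)                                # like count(*)
average_duration = mean(event["duration"] for event in recent)  # like average(duration)
print(transaction_count, average_duration)  # 3 1.0
```

The sketch also shows why sampling matters: if some events in the window were never collected, both numbers would be computed from an incomplete list.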
Want to report custom metrics? See Get data into New Relic.
First, we’ll explain the definition of events from a monitoring industry perspective, and then we’ll explain some specifics about how New Relic handles event data.
Events in the monitoring industry
In the software industry, events can be thought of as simply “things that occur in a system.” For example, a server setting being changed would be an event. Another example: a website user clicking a mouse.
Some events will generate a stored record, and that record is typically also called an event.
Event data represents discrete occurrences and typically will have a high level of detail, so event data is suited for detailed analysis and querying. The downside to the use of event data is that there are typically so many events reported that it can become difficult to query that large dataset over longer time ranges.
Events at New Relic
At New Relic, we report events to data objects also called events. These events have multiple attributes (key-value pairs) attached. Event data is used in some UI charts and tables, and you can also query it. How long event data remains available is determined by data retention rules.
One example of an event: APM reports an event type named Transaction, which represents a logical unit of work in an application. To see the attributes attached to this event, you could use a NRQL query like:
Select * from Transaction
For examples of querying event data, see Introduction to NRQL.
Other details about New Relic event data:
- Events can have any type of attributes attached. Some events have attributes that report metric data.
- You can report custom events.
- To increase the availability of your event data for querying/charting, you can turn events into metrics.
- Some systems generate a large number of events that exceeds collection limits and results in incomplete query results. For more on this, see Event sampling.
- Because event is a general term, in some New Relic contexts it will refer to any data type that can be queried via NRQL. For example, when you run a NRQL query, it returns a count of inspected events: this is a count of all data types queried.
First, we’ll explain the definition of logs from a monitoring industry perspective, and then we’ll explain some specifics about how New Relic handles log reporting.
Logs in the monitoring industry
A log is a message about a system used to understand the activity of the system and to diagnose problems.
Logs at New Relic
In New Relic, log data is reported with multiple attributes (key-value data) attached. To query your log data, you could use a NRQL query like:
Select * from Log
To report custom log data, see the Log API.
First, we’ll explain the definition of traces from a monitoring industry perspective, and then we’ll explain some specifics about how New Relic handles tracing.
Tracing in the monitoring industry
In the application/infrastructure-monitoring world, tracing is a general term used to refer to various ways to report information about how a program or system is operating. For example, a stack trace provides in-depth information about a program’s subroutines.
For large modern systems, which are often distributed across many services and microservices, “tracing” often refers to distributed tracing, which is a way to monitor requests as they propagate through a complex, distributed environment.
Tracing at New Relic
New Relic offers a distributed tracing feature that tracks requests across a distributed system, and provides a dedicated UI for understanding and analyzing your traces. In New Relic, trace data is reported as Span objects, with multiple attributes (key-value pairs) attached.
To query your tracing data, you could use a NRQL query like:
Select * from Span
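To illustrate how individual spans describe a request's path through a distributed system, the Python sketch below groups spans by their parent and prints the resulting tree. The attribute names (`trace_id`, `span_id`, `parent_id`, `duration_ms`) and the sample spans are assumptions for the example and are not necessarily the attribute names used on Span data.

```python
from collections import defaultdict

# Hypothetical spans for one request; the root span has no parent.
spans = [
    {"trace_id": "t1", "span_id": "a", "parent_id": None, "name": "GET /checkout", "duration_ms": 120},
    {"trace_id": "t1", "span_id": "b", "parent_id": "a", "name": "inventory-service", "duration_ms": 40},
    {"trace_id": "t1", "span_id": "c", "parent_id": "a", "name": "payment-service", "duration_ms": 65},
    {"trace_id": "t1", "span_id": "d", "parent_id": "c", "name": "card-processor", "duration_ms": 50},
]

def group_by_parent(spans):
    """Map each parent span ID to the list of its child spans."""
    children = defaultdict(list)
    for span in spans:
        children[span["parent_id"]].append(span)
    return children

def render(children, parent_id=None, depth=0):
    """Return indented lines, one per span, walking the tree depth-first."""
    lines = []
    for span in children.get(parent_id, []):
        lines.append("  " * depth + f'{span["name"]} ({span["duration_ms"]} ms)')
        lines.extend(render(children, span["span_id"], depth + 1))
    return lines

trace_lines = render(group_by_parent(spans))
print("\n".join(trace_lines))
```

Each span is a flat record with key-value attributes. The tree emerges only when the parent references are followed, which is roughly what a trace view does when it presents a request as nested calls.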
To learn more about how distributed tracing works, see Understand distributed tracing.
To report custom distributed tracing data, see the Trace API.
Query and send data
Understanding New Relic data types can help you:
For a simpler explanation of these data types using real-world examples, see Introduction to essential telemetry data types.