We use the data dictionary to provide information about data types (for example, Transaction, Metric, Log) and attached attributes. This is exposed in the New Relic query experience, on hovering over the data types and attributes. And we also expose the dictionary to the public. This doc contains info on how to add data to the dictionary.
Note that currently we don't have infrastructure integration data in the dictionary. We also have very few attributes for the Metric data type, just a few basic default ones, despite there being many potential attributes attached to that data type depending on what the data source is.
The data in the data dictionary is structured like this:
- Data types: sometimes called "events" for historical reasons, these are the NRDB data objects, like Transaction, Metric, Log, Span, etc.
- Attributes: these are key:value pairs attached to data types. One attribute can have multiple data types listed for it. For example, appName is on multiple data types.
- Data source: The New Relic product from which the data originates. With the current implementation, a data type must have a single data source assigned. This isn't ideal, as theoretically a data type can come from multiple sources.
To add a new data type:
- Make a new folder in this directory, alongside the other directories:src/data-dictionary/events
- Duplicate a data type file (for example, this Metric data type), place it in the new folder, and fill it out with the new info.
- For how to add attributes on a new data type, keep reading.
To add attributes:
- Check if there's an existing attribute that has the same name and same general definition as the one you're trying to add.
- If it exists, edit that file to include the new data type (under
- If an existing attribute is fairly close in definition but not exact, try to edit the definition so it works for all associated data types.
- If there is no existing attribute, create a new one.
- The easiest way is to copy an existing attribute file, edit all relevant fields to be accurate, and place it in the folder of the data type it's attached to (for example, Transaction).
- When writing an attribute, try to keep it as concise as possible, and avoid using links. For more on that, see Style.
- If the attribute has one of the units of measurement we use (like
units: percentage (%)), add that to the front matter. For info on units, see Units.
When adding or editing attributes, some things to keep in mind:
- Aim for concise descriptions. Notes on this:
- Feel free to use sentence fragments. Feel free to edit definitions submitted from internal teams as sometimes their definitions will be overly long or have unneeded links.
- Some attributes are only present in some situations (for example, when a specific APM agent is used, or when a specific config option is true). We should avoid documenting most of these, with the idea that most customers will only care about the data because it's being reported for them, and won't care about the things that factor into the reasons why it wouldn't be reported. If it's thought that the addition would help customers who already have the data reporting understand the data better, you can include it.
- Write so it reads well as plain text. Details on this:
- Note that the docs site data dictionary has our usual docs styles available, but the query UI definitions have no styling available. This means that you should write definitions so that they are understandable without any styling, as plain text. You can often avoid any need for styling (for example, data types like Transaction are easy to understand as plain text, as are most phrasings that use attribute names or values. When plain text makes something too ambiguous and you need some styling, use back ticks to indicate attributes or values (for example, a definition like: Reported when `category` is `http`). Using this also sets us up for success if we implement styling in the UI definitions in the future.
- We should avoid links because those aren't visible in the query UI; the query UI only displays plain text. However, every attribute entry in the query UI does display a 'See attribute in docs' link that links to the complete and normally formatted docs site entry. This means that we can use links provided that they will display in an easy to understand way in the query UI. For example, here's an example of an attribute that would read clearly in plain text despite not having a link. We should avoid using a link format like "For more information, see Create an alert condition", because it wouldn't read well in plain text, and would choose something like "For more information, see our alert condition docs."
Attribute entries have an optional unit of measurement field in the front matter. For example, here the front matter for an attribute with a percent unit of measurement:
---name: cpuPercenttype: attributeunits: percentage (%)events:- ProcessSample---
What unit of measurement to select for an attribute is sometimes obvious, like if the attribute value is measured in milliseconds (
units: milliseconds (ms)) or seconds (
units: seconds (s)) or a percentage (
units: percentage (%)).
We also have several units of measurement that are not obvious, like
ID. (Technically, these aren't actually "units of measurement" and are more just conceptual data types but we're doing it this way as a workaround so you can ignore the fact that units of measurement isn't actually accurate.)
The main reason we want to specify this information is that this will control what kinds of queries or charts can be created or auto-suggested by New Relic. For example, the New Relic UI wouldn't want to auto-suggest a chart graphing the average of ID values because that wouldn't make any sense. So attributes with accurate units, such as data types, will help product provide more practical help/suggestions to customers in future.
Here are some tips for the non-obvious unit types:
- count: This is a count of something, though not a count of time-based units. For a number to be a
count, it must (a) only be capable of increasing during a given time/sampling period, and (b) have a theoretically uncapped range. This wouldn't be used for a count of time units; if it was a count of seconds, for example, you would just use 'seconds' as the unit of measurement. A couple of examples of a count:
enumis short for enumerated list. In other words, it is a specific range of numbers that represent other non-numeric elements. For example, an attribute that had HTTP error codes (404, 505, etc.) as possible values would be an
enum. A range of numbers that represent color codes would be another example of an
enum. (Theoretically, an
enumcan represent lists without numeric values but we have no need to categorize strings so we only care about numeric-value lists.) Example: httpResponsecode.
- rate: Use this for any
rate(for example, count per second). These are typically for averaged rates over small units of time, like second or millisecond. We previously have used the unit of time for these attributes (for example, using seconds as the unit of measurement for a count per second rate), but now we want to use
ratefor these. This is necessary because the types of displays used for rates would be different than the types of displays used for a simpler duration/count measurement. Example: MySQL integration attribute
db.innodb.dataReadBytesPerSecond, which has the definition "Rate at which data is read from InnoDB tables in bytes per second."
- id: Use
IDfor any identification number attribute. Example: appId.