Organize data with partitions

Data partitions are a way to group or organize log data for faster and more efficient querying. When a query targets a single partition, our logs UI:

Scans less unrelated data
Returns results faster

Accounts can have multiple partitions, and multiple partitions can be queried at the same time.

Data partitions also let you map data to an alternative, “secondary” namespace with a fixed 30-day retention. This is useful for maintaining compliance with privacy-centric regulations and standards like the General Data Protection Regulation (GDPR).

Plan your partition

Before you start creating partitions, make sure you have the required permissions and a plan for how to implement the partitions.

Important

Logs are routed to partitions during the ingestion process, before data is written to NRDB. Partition rules won't affect logs that were ingested before the rule was created.

Sizing and organizing a partition

You can gain significant performance improvements with proper use of data partitions. Organizing your data into discrete partitions lets you query just the data you need. You can query a single partition or a comma-separated list of partitions. The goals of partitioning your data should be:

Create data partitions that align with categories in your environment or organization that are static or change infrequently (for example, by business unit, team, environment, service, etc.).
Create partitions to optimize the number of events that must be scanned for your most common queries. There is no hard and fast rule, but once your common queries scan more than 500 million log events (and especially more than 1 billion), consider adjusting your partitioning.

The key drivers of scanned events:

Partition size (number of events)
Default retention for a given partition (impacts the total potential events that can be scanned in a query)
Time window for the NRQL query
Chart and dashboard structure
- Default time window duration
- Number of charts in a dashboard (increases number of queries)
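
As a rough illustration of how these drivers combine, the sketch below estimates how many events a single query and a whole dashboard might scan. The model (daily event volume times the days in the query window, capped by retention, times the number of charts) and all function and parameter names are assumptions made for illustration, not how New Relic accounts for scanned events. The 500 million figure is the rule of thumb mentioned above.

```python
# Back-of-envelope estimate of scanned log events (illustrative model only).
GUIDELINE_EVENTS = 500_000_000  # rule-of-thumb threshold from this page


def scanned_per_query(events_per_day: int, window_days: float,
                      retention_days: int) -> int:
    """Events one query scans: daily volume times the window, capped by retention."""
    effective_days = min(window_days, retention_days)
    return int(events_per_day * effective_days)


def scanned_per_dashboard(events_per_day: int, window_days: float,
                          retention_days: int, charts: int) -> int:
    """Each chart runs its own query, so the total scales with the chart count."""
    return charts * scanned_per_query(events_per_day, window_days, retention_days)


if __name__ == "__main__":
    # Hypothetical partition: 40 million events per day, 30-day retention.
    per_query = scanned_per_query(40_000_000, window_days=7, retention_days=30)
    per_dashboard = scanned_per_dashboard(40_000_000, 7, 30, charts=6)
    print(per_query, per_query > GUIDELINE_EVENTS)
    print(per_dashboard)
```

Under these assumptions, shortening the default time window or splitting a busy partition reduces the per-query figure, while removing charts only reduces the dashboard total.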

What's the right partition size?

Having more partitions allows for more targeted searches, but creating too many partitions can make logs hard to find and increase administrative overhead. Finding the right balance is important.

We support a maximum of 100 partitions per account. The optimal number for most accounts depends on how well organized your partition scheme is and how well you can coordinate teams and team members in their use of the platform. Some customers find managing more than a dozen partitions in an account challenging, but we believe that well-organized partitions with logical groupings of data and mnemonic naming conventions can scale well beyond that.

Tips for estimating your partition size

To get a sense of how many events go into a given partition per day, run a query like the following (here against a partition named Log_nginx):

FROM Log_nginx
SELECT count(*)
SINCE 1 day ago

In addition, you can understand the actual query usage and performance against a partition by analyzing the NrdbQuery event. It has a variety of useful attributes, including the duration of the query, the NRQL statement itself, and the time window used in the query.

For example, the following query returns a histogram of the time window sizes used in your log queries for a given partition:

FROM NrdbQuery 
SELECT histogram(timeWindowMinutes, 5, 5) 
WHERE query LIKE '%Log_nginx%'

Another useful technique is to zero in on the slowest queries against a given partition and understand the time windows being used on those queries:

FROM NrdbQuery
SELECT percentile(durationMs, 90)
WHERE query LIKE '%Log_nginx%'
FACET query, timeWindowMinutes

Choosing a namespace

A partition's namespace determines its retention period. We offer two retention options:

Standard: The account's default retention determined by your New Relic subscription. This is the maximum retention period available in your account and is the namespace you'll select for most of your partitions.
Secondary: 30-day retention. All logs sent to a partition in the Secondary namespace are purged on a rolling basis 30 days after ingestion.

Secondary retention is not a cost control mechanism. Data is billed on ingest.

Use NerdGraph API to manage data partitions

If you want to manage your data partitions programmatically, you can use the NerdGraph API explorer: one.newrelic.com > All capabilities > Apps > NerdGraph API explorer. The NerdGraph data partitions tutorial shows how to query, create, and delete data partitions using this API.

Create partition rules via UI

To the left of the logs query bar, click Data partitions, then create a partition name with the retention namespace, optional description, and matching criteria.

To create a new partition rule:

Go to one.newrelic.com > All capabilities > Logs.
To the left of the logs query bar, click Partition, then click Create new.
Define a Partition name as an alphanumeric string that begins with Log_.
Add an optional description.
Select the retention namespace for the partition.
Set your rule's Matching criteria: Enter a valid NRQL WHERE clause to match the logs to store in this partition.

To view your partitions: click the Partition dropdown.
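
If you generate partition names in scripts before entering them in the UI, a quick local check can catch names that miss the Log_ prefix. The sketch below is a minimal example that assumes "alphanumeric" means only ASCII letters and digits after the prefix. The exact character rules enforced by the platform may differ, and the function name is hypothetical.

```python
import re

# Assumption: only ASCII letters and digits are allowed after the Log_ prefix.
PARTITION_NAME = re.compile(r"Log_[A-Za-z0-9]+")


def is_valid_partition_name(name: str) -> bool:
    """Return True if name is Log_ followed by one or more letters or digits."""
    return PARTITION_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("Log_nginx", "nginx", "Log_", "log_nginx"):
        print(candidate, is_valid_partition_name(candidate))
```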

Search data partitions

The default partition is Log. Any log that doesn't match a partition rule is stored in the Log partition.

You can query multiple partitions at the same time. For the best performance, select the smallest number of partitions possible.

To search partitions: To the left of the logs query bar, click Partition, and use the partition search bar.
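
Outside the UI, querying several partitions at once means listing them, comma separated, in the FROM clause of an NRQL query. The helper below is a small sketch that builds such a query string. The function name and its parameters are hypothetical, and Log_nginx is the example partition used earlier on this page.

```python
def count_query(partitions: list[str], since: str = "1 day ago") -> str:
    """Build an NRQL count query over a comma-separated list of partitions."""
    if not partitions:
        raise ValueError("at least one partition is required")
    return f"FROM {', '.join(partitions)} SELECT count(*) SINCE {since}"


if __name__ == "__main__":
    # Query the default partition together with one custom partition.
    print(count_query(["Log", "Log_nginx"]))
```

Keeping the list short follows the guidance above: each partition you add increases the number of events the query has to scan.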
