CPU and memory limits and requests can vary according to the number of targets monitored and the number of metrics each target exposes. For example, a Prometheus OpenMetrics integration that scrapes 800 targets, each exposing 1000 timeseries, with a latency of 150ms and a scrape_duration of 30 seconds, consumes 700MB of RAM.
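As a rough back-of-the-envelope aid, the reference figures above (800 targets, 1000 timeseries each, 700MB of RAM) can be scaled to other environment sizes. The sketch below assumes memory grows linearly with the total number of timeseries, which the documentation does not state; the function name `estimate_memory_mb` is hypothetical. Treat the result as a starting point for setting requests and limits, not a guarantee.

```python
# Reference point taken from the example above:
# 800 targets x 1000 timeseries each consumes about 700 MB of RAM.
REF_TARGETS = 800
REF_SERIES_PER_TARGET = 1000
REF_MEMORY_MB = 700


def estimate_memory_mb(targets: int, series_per_target: int) -> float:
    """Scale the reference memory figure by total timeseries.

    ASSUMPTION: memory grows linearly with targets * series_per_target.
    This is an illustrative simplification, not documented behavior.
    """
    ref_series = REF_TARGETS * REF_SERIES_PER_TARGET
    return REF_MEMORY_MB * (targets * series_per_target) / ref_series


if __name__ == "__main__":
    print(estimate_memory_mb(800, 1000))  # the reference point itself
    print(estimate_memory_mb(400, 1000))  # half the targets
```

Measure actual usage after deployment and adjust, since latency and scrape duration also affect resource consumption.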
To estimate the size of the environment you are monitoring, run the following query to see how many targets are being scraped:
SELECT latest(nr_stats_targets) FROM Metric WHERE clusterName='clusterName' SINCE 30 MINUTES AGO TIMESERIES
In large environments with hundreds of targets to scrape, the latency of the /metrics endpoints must be below 1 second. Run this query to check the latency of the different targets. It retrieves the data exposed by the Prometheus OpenMetrics integration and shows the time required to fetch each endpoint.
SELECT average(nr_stats_integration_fetch_target_duration_seconds) FROM Metric WHERE clusterName='clusterName' SINCE 30 MINUTES AGO FACET target LIMIT 30
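Once you have the per-target averages, you can flag the endpoints that miss the 1-second requirement. The sketch below assumes the query results have already been exported into a plain mapping of target name to average fetch duration in seconds; that input shape, the sample target names, and the function name `slow_targets` are hypothetical and not part of any New Relic API.

```python
from typing import Dict, List

# Threshold stated above: /metrics latency must be below 1 second.
LATENCY_LIMIT_SECONDS = 1.0


def slow_targets(
    avg_fetch_seconds: Dict[str, float],
    limit: float = LATENCY_LIMIT_SECONDS,
) -> List[str]:
    """Return targets whose average fetch time is not below the limit,
    ordered from slowest to fastest."""
    offenders = [t for t, secs in avg_fetch_seconds.items() if secs >= limit]
    return sorted(offenders, key=lambda t: avg_fetch_seconds[t], reverse=True)


if __name__ == "__main__":
    # Hypothetical sample data standing in for exported query results.
    sample = {
        "http://svc-a:8080/metrics": 0.15,
        "http://svc-b:8080/metrics": 1.40,
        "http://svc-c:8080/metrics": 2.00,
    }
    for target in slow_targets(sample):
        print(target, sample[target])
```

Because the query uses LIMIT 30, it returns at most 30 targets per run, so a check like this only covers the targets included in that result.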
To keep the time needed to scrape all targets below 30 seconds, use the following configurations:
Targets < 400, with 1000 metrics each
No modification is required. CPU ranges roughly between
400 < targets < 1000, with 1000 metrics each
The number of workers should be increased to
Targets > 1000, with 1000 metrics each
The number of workers should be increased to 10 or more. CPU is over
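The tiers above can be sketched as a small lookup that maps a target count, such as the value returned by the nr_stats_targets query, to the matching recommendation. This is only an illustration: the function name `sizing_tier` is hypothetical, the handling of exactly 400 and exactly 1000 targets is an assumption (the tiers above leave those boundaries unspecified), and the guidance applies to targets exposing about 1000 metrics each.

```python
def sizing_tier(targets: int) -> str:
    """Map a scraped-target count to the worker recommendation above.

    ASSUMPTION: counts of exactly 400 and exactly 1000 fall into the
    middle tier; the source tiers do not say where they belong.
    """
    if targets < 0:
        raise ValueError("targets must be non-negative")
    if targets < 400:
        return "no modification required"
    if targets <= 1000:
        # The source does not give a worker count for this tier here,
        # so none is invented.
        return "increase the number of workers"
    return "increase the number of workers to 10 or more"


if __name__ == "__main__":
    for count in (250, 800, 1500):
        print(count, "->", sizing_tier(count))
```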