This resource contains information about common errors and error messages that may alert you to issues with data visibility and availability, as well as information about how to respond.
Common errors and issues
If you receive an integration error message from New Relic or error messages in your Prometheus server logs after restarting your Prometheus server, there are several actions you can take to troubleshoot and get data flowing properly. Below are a few tips regarding common issues and error messages. For specific information on how to query for
NrIntegrationError events, see Investigate error messages below.
- Configuration errors
Missing or incorrect characters in the remote write URL in the config file (for example the endpoint, license key, or
prometheus_servername) or incorrect placement of the information in the file will result in the Prometheus server not starting, remote write not working properly, or errors appearing in Prometheus server logs.
- 400: bad request error
If no data appears with a bad request error, check your configuration file to confirm that the placement of the remote write information is correct, and that there are no missing or incorrect characters.
- 413: request entity too large error
This means you have sent a request in which one or more fields, or the entire payload, has exceeded our limits.
- 429: rate limit error
This means you have hit a rate limit on the amount of data being sent at one time (for example cardinality or data points per minute). You can troubleshoot by reducing the amount of Prometheus or general metric data you are sending, or by requesting a rate-limit increase.
Investigate error messages
You can investigate error messages in New Relic by doing either or both of the following.
- Run a
NrIntegrationErrorquery on the error message using NRQL, then look at the Message field in the UI to see a description of what went wrong. For example:
SELECT * FROM NrIntegrationError WHERE newRelicFeature = 'Metrics'
- Investigate individual errors in time to see when and where they occur and any simultaneously occurring issues, and perform targeted troubleshooting based on what you find out. For example:
SELECT count(*) FROM NrIntegrationError WHERE newRelicFeature = 'Metrics' TIMESERIES
If you’ve validated that you can send data successfully but are unable to query it, you may be running into other kinds of limits, like the inspected count limit. This may manifest itself as an error message during the integration process that says:
Unable to retrieve data for Prometheus data source <name>.