When you use synthetic monitoring's private locations with New Relic's alerts, you can be notified if a location is under-provisioned, misconfigured, or otherwise misbehaving. This guide helps you answer these questions:
- Are my private minions online?
- Does my private location need more minions?
- Can I check the status of a specific minion directly?
Before following the instructions in this guide, ensure you have:
- A synthetic private location
- At least one private minion installed at that location
- Checks scheduled to run at that location
- An alert policy for the private location, with a configured notification channel to notify your team when a violation occurs
The Private Minion dashboard example can be imported to your account using the Dashboard API with the following JSON:
To check whether your private minions are online, you can rely on attributes from the SyntheticsPrivateMinion event. Private minions send this event to New Relic every 30 seconds. A simple way to check if your minions are online is to compare the unique count of minion IDs with the number of minions you expect to be online.
To understand how many minions are reporting, run this example NRQL query:
SELECT uniqueCount(minionId) FROM SyntheticsPrivateMinion WHERE minionLocation = '1-acme_okc_dc-309'
Using this query, you can create an alert condition to notify your team when fewer minions are reporting than expected. This condition is configured with a static threshold of 2 units, which means you will receive an alert if any of your minions are offline.
You can verify that the alert policy works as expected by manually stopping one of your minions. Then, when the alert violation occurs, you will be notified by any notification channels that have been set up. Once the minion is restarted and it comes back online, the alert will recover.
There are more robust ways to check whether minions are functioning correctly, but this query and condition simply and successfully handle the case where a machine fails, is accidentally decommissioned, or the minion process crashes. It also ensures that the minion can communicate with New Relic.
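The comparison behind this condition can be sketched in Python. This is a minimal illustration, not New Relic's implementation: the sample event dictionaries, the minion IDs, and the helper names count_reporting_minions and missing_minions are hypothetical. Only the minionId and minionLocation attribute names and the location value come from the query above.

```python
def count_reporting_minions(events, location):
    """Count distinct minion IDs for one location, like uniqueCount(minionId)."""
    return len({
        event["minionId"]
        for event in events
        if event.get("minionLocation") == location and "minionId" in event
    })


def missing_minions(events, location, expected):
    """Return how many expected minions are not reporting (never negative)."""
    return max(0, expected - count_reporting_minions(events, location))


# Hypothetical sample: two events from one minion, none from the second.
sample_events = [
    {"minionId": "minion-a", "minionLocation": "1-acme_okc_dc-309"},
    {"minionId": "minion-a", "minionLocation": "1-acme_okc_dc-309"},
    {"minionId": "minion-z", "minionLocation": "some-other-location"},
]

if missing_minions(sample_events, "1-acme_okc_dc-309", expected=2) > 0:
    print("Fewer minions are reporting than expected")
```

Repeated events from the same minion count once, which is why the check still works even though each minion reports every 30 seconds.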
To find out whether your private location needs more minions, you can use the checksPending attribute of the SyntheticsPrivateLocationStatus event. The checksPending attribute reflects the number of monitor checks that are scheduled (or "queued") but have yet to be accepted by a minion in the designated location. For a location with scheduled checks and no minions, a graph of this value would grow linearly up and to the right.
This metric is more complicated to monitor than uniqueCount(minionId) because a high value does not necessarily mean the location is in a bad state. As long as the metric is not growing linearly up and to the right (and checks are being run on schedule), the location is in a good state.
This use case is perfect for baseline NRQL alert conditions, which allow you to monitor the deviation of a metric rather than its static value. For example:
SELECT average(checksPending) FROM SyntheticsPrivateLocationStatus WHERE name = '1-acme_tokyo_dc-512'
To test this alert condition, schedule one-minute, browser-based monitors to run from your location. Browser-based jobs consume more resources than ping jobs, which is why they are a better fit for load simulation. New Relic will quickly notify you of a growing number of pending checks.
After doubling the number of minions to handle the load, the alert recovers. For example, using the Synthetics private location dashboard example, notice the growth and decline of pending checks over the course of the incident and recovery. By using the NRQL condition, New Relic will notify you if and when the location needs more minion capacity.
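To illustrate why the trend of checksPending matters more than its absolute value, the sketch below fits a least-squares line to a series of samples and flags sustained growth. It is an assumption-laden illustration only: baseline alert conditions use New Relic's own prediction logic, which is not shown here, and the function names, the sample values, and the min_slope threshold are hypothetical.

```python
def least_squares_slope(values):
    """Slope of the best-fit line through evenly spaced samples."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples to estimate a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator


def queue_is_growing(pending_samples, min_slope=1.0):
    """True when pending checks rise by at least min_slope per sample."""
    return least_squares_slope(pending_samples) >= min_slope


# Hypothetical samples: a high but stable queue versus a steadily growing one.
stable_queue = [40, 38, 42, 39, 41]
growing_queue = [0, 5, 10, 15, 20]

print(queue_is_growing(stable_queue))
print(queue_is_growing(growing_queue))
```

The stable series has the higher values but a nearly flat trend, so only the growing series would be flagged. A static threshold on the raw value would get this case backwards.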
You can also check how a minion is operating by contacting it directly. The minion exposes a set of HTTP endpoints that show what the application is doing. To access these endpoints, bind ports 8080 and 8180 to ports on the host (for example, for Docker, use docker run -p 80:8080 -p 81:8180 ...):
- :8080/status/check: Details about internal health checks the minion performs; HTTP 200 means "healthy."
- :8080/status: Details about a minion's status; the same data is then published to Insights as an event.
- :8180/: JVM application admin endpoints; an advanced view of a minion's internal state.
This approach is not as automated or flexible as the checksPending example. However, if you have total network connectivity failure, this manual approach can help troubleshoot the situation.
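A direct health check against the /status/check endpoint might be scripted as follows. This is a rough sketch using only the Python standard library. The helper names, the host name minion-host.example, and the default host port 80 (taken from the Docker port mapping above) are assumptions for illustration. The only behavior relied on from the minion is that HTTP 200 means "healthy."

```python
import urllib.error
import urllib.request


def fetch_status_code(url, timeout=5.0):
    """Return the HTTP status code for url, or None if it is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 500 or 503.
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar problems.
        return None


def minion_is_healthy(host, port=80, fetch=fetch_status_code):
    """True only when the minion's /status/check endpoint returns HTTP 200."""
    url = f"http://{host}:{port}/status/check"
    return fetch(url) == 200


if __name__ == "__main__":
    # Hypothetical host name; replace it with the machine running the minion.
    print(minion_is_healthy("minion-host.example"))
```

The fetch parameter exists so the check can be exercised without a live minion, for example by passing a function that returns a fixed status code.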