Alert on Infrastructure processes

Use New Relic Infrastructure's Process running alert condition to be notified when a set of processes on your filtered hosts stops running for a configurable number of minutes. This is useful, for example, when:

  • Any of the processes on the hosts stop reporting to New Relic
  • A process you expected to start on a host (such as a new program) is not actually running

This feature lets you filter which hosts and processes to monitor and choose when to notify selected individuals or teams. In addition, the email notification includes links to help you quickly troubleshoot the situation.

If you apply host filters from a New Relic Infrastructure page and then select the bell icon to create an alert condition, the Settings > Alerts page automatically applies your current host filter.

Examples

By applying filters to the hosts and processes that are important to your business, you can define alerting thresholds to decide when the alerting event triggers and New Relic sends an email notification to you. These examples illustrate how to use Infrastructure's Process running condition to monitor your processes.

Ensuring enough processes are running to satisfy load

Problem: Some load balancers and application servers work by running many worker processes in parallel. Here, for example, you may want an alert notification when fewer than eight processes are running for a service like gunicorn.

Solution: Depending on the situation, use any of these Process running threshold options as needed:

  • More than the defined number of processes are running
  • Exactly the defined number of processes are running
  • Fewer than the defined number of processes are running
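
The threshold options above can be modeled as a comparison applied to a per-minute process count, with the alert opening only after the comparison has held for the configured number of minutes. The following is a minimal sketch of that idea, not New Relic's implementation: the names `COMPARISONS` and `should_alert`, and the one-sample-per-minute input, are assumptions made for illustration.

```python
from typing import Sequence

# Hypothetical names for the three threshold options listed above.
COMPARISONS = {
    "more_than": lambda count, value: count > value,
    "exactly": lambda count, value: count == value,
    "fewer_than": lambda count, value: count < value,
}


def should_alert(counts_per_minute: Sequence[int], comparison: str,
                 value: int, duration_minutes: int) -> bool:
    """Return True if every one of the most recent `duration_minutes`
    samples satisfies the comparison (assumes one sample per minute)."""
    if duration_minutes < 1:
        raise ValueError("duration_minutes must be at least 1")
    if len(counts_per_minute) < duration_minutes:
        return False  # not enough history yet to decide
    check = COMPARISONS[comparison]
    recent = counts_per_minute[-duration_minutes:]
    return all(check(count, value) for count in recent)


# gunicorn example: fewer than eight workers for five consecutive minutes.
gunicorn_counts = [8, 8, 6, 6, 5, 6, 7]
print(should_alert(gunicorn_counts, "fewer_than", 8, 5))
```

In this model a single minute with eight or more workers inside the window resets the decision, which is why a longer duration reduces noise from brief restarts.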

Ensuring that critical services run constantly

Problem: A service, such as a database or application server, is expected to be running constantly on certain hosts, and you need to know when it has stopped.

Solution: Use the No processes are running (default) threshold.

Monitoring startup for critical processes that require special attention

Problem: You have processes requiring special attention due to security or potential performance impact.

Solution: Use the At least one process is running threshold with condition filters set to a username and specific executable so that Infrastructure can notify you when the process is running.

Making sure a job doesn't take too long

Problem: You have a job that runs periodically, and you want to be notified when it has been running longer than an expected number of minutes.

Solution: Use the At least one process is running threshold, with the number of minutes set to the longest time you expect the job to run.

Validating that services started successfully

Problem: When provisioning new hosts, you want to be alerted if a required service has not successfully started up.

Solution: Use the No processes are running (default) threshold.

Watching for runaway processes or configuration problems

Problem: Sometimes problems with processes can be solved with changes to your configuration. For example, you have more than one Chef process running, and you may need to address an issue with how that service is configured.

Solution: Depending on the situation, use any of these Process running threshold options as needed:

  • More than the defined number of processes are running
  • Exactly the defined number of processes are running
  • Fewer than the defined number of processes are running

Create Infrastructure process running condition

New Relic Infrastructure: Process running alert
infrastructure.newrelic.com > Settings > Alerts: The Process running alert condition allows you to filter by host and type of process. After selecting a target process, choose whether to alert on too many processes, too few processes, or the existence of the process, as well as the number of minutes to wait before triggering the alert notification.

Owner or Admins

To define the Process running alert criteria:

  1. Follow standard procedures to create a New Relic Infrastructure alert condition.
  2. Select Process running as the Alert type.
  3. Filter which hosts and processes you want the alert condition to apply to.
  4. Define the Critical threshold for triggering the alert notification: minimum 1 minute, default 5 minutes, maximum 60 minutes.
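
The choices made in these steps (host filter, process filter, threshold option, and duration) can be captured in a small data structure that enforces the 1 to 60 minute range and the 5 minute default stated in step 4. This is only an illustrative sketch: the class `ProcessRunningCondition`, its field names, and the filter strings are hypothetical and do not represent a New Relic API or payload format.

```python
from dataclasses import dataclass

# Limits taken from step 4 above.
MIN_MINUTES = 1
DEFAULT_MINUTES = 5
MAX_MINUTES = 60


@dataclass(frozen=True)
class ProcessRunningCondition:
    """Hypothetical container for a Process running alert condition."""
    name: str
    host_filter: str          # illustrative filter text, not a real query syntax
    process_filter: str       # illustrative filter text, not a real query syntax
    threshold: str = "no_processes_running"  # the default option in the UI
    duration_minutes: int = DEFAULT_MINUTES

    def __post_init__(self) -> None:
        # Reject durations outside the documented range.
        if not MIN_MINUTES <= self.duration_minutes <= MAX_MINUTES:
            raise ValueError(
                f"duration_minutes must be between {MIN_MINUTES} "
                f"and {MAX_MINUTES}, got {self.duration_minutes}"
            )


condition = ProcessRunningCondition(
    name="Database must stay up",
    host_filter="environment = production",
    process_filter="command = postgres",
)
print(condition.duration_minutes)
```

Validating the duration when the object is created keeps an out-of-range value from reaching whatever tooling you use to record or review your alert settings.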

If you create the alert condition directly with Infrastructure, New Relic will send an email notification when the condition's defined threshold is breached. Your alert policy defines which personnel or teams are notified and which notification channels New Relic uses.

For more help

Recommendations for learning more: