Circuit breaker for Java custom instrumentation

The New Relic Java agent includes a circuit breaker that protects applications from the effects of over-instrumentation. When the circuit breaker detects early symptoms of memory exhaustion, it automatically "trips" and limits instrumentation. The agent stops collecting transaction data until the circuit breaker determines that it is safe to resume, at which point it resets automatically.

The circuit breaker takes two parameters into account (heap usage and time spent in garbage collection) to determine when it should trip. The default values for these thresholds are percentages:

memory_threshold: 20
gc_cpu_threshold: 10

When the percentage of free heap memory is less than memory_threshold, and the percentage of CPU time spent doing garbage collection is greater than gc_cpu_threshold, the circuit breaker trips. When the circuit breaker trips, the agent stops collecting transaction data. As a result, the New Relic APM UI underreports throughput, and no transaction traces appear until the circuit breaker resets.
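The trip condition described above can be sketched as a small predicate. This is only an illustration of the documented rule, not the agent's actual implementation: the class name CircuitBreakerSketch, the method shouldTrip, and the sample readings are hypothetical, and the agent's sampling window and internals are not covered here.

```java
// Illustrative sketch only; not the New Relic agent's actual implementation.
final class CircuitBreakerSketch {
    // Documented default thresholds, both expressed as percentages.
    static final double DEFAULT_MEMORY_THRESHOLD = 20.0;
    static final double DEFAULT_GC_CPU_THRESHOLD = 10.0;

    // Trips only when BOTH conditions hold: free heap is below memory_threshold
    // AND garbage collection CPU time is above gc_cpu_threshold.
    static boolean shouldTrip(double freeHeapPercent, double gcCpuPercent,
                              double memoryThreshold, double gcCpuThreshold) {
        return freeHeapPercent < memoryThreshold && gcCpuPercent > gcCpuThreshold;
    }

    public static void main(String[] args) {
        // Hypothetical readings: 15% free heap, 12% of CPU time in GC.
        System.out.println(shouldTrip(15.0, 12.0,
                DEFAULT_MEMORY_THRESHOLD, DEFAULT_GC_CPU_THRESHOLD));
        // Low free heap alone is not enough when GC time stays under its threshold.
        System.out.println(shouldTrip(15.0, 5.0,
                DEFAULT_MEMORY_THRESHOLD, DEFAULT_GC_CPU_THRESHOLD));
    }
}
```

With the default thresholds, the first hypothetical reading satisfies both conditions and the second satisfies only one, so only the first would trip the circuit breaker.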

Reasons for memory exhaustion

The circuit breaker trips when it detects signs of memory exhaustion. This can happen for several reasons:

Your application is over-instrumented.

Your application shows early signs of memory exhaustion due to recently deployed custom instrumentation (using XML, API calls, trace annotations, or the Java agent's custom instrumentation editor) or due to built-in instrumentation.

Your application has experienced a load spike.

Your application experienced a load spike and showed signs of memory exhaustion. In this case, the agent isn’t contributing to the spike, but the circuit breaker can help conserve resources and ensure that the agent doesn’t contribute to OutOfMemoryErrors.

Your application runs close to its memory limit by design.

Your application is tuned to run close to its memory limit.

Troubleshooting

If the circuit breaker trips, try these troubleshooting tips.

Identify and disable instrumentation.

Use the Top methods by call count table on the circuit breaker Events page to find methods that might be over-instrumented, then disable the custom instrumentation on those methods.

In general, agent memory usage is proportional to the call count of a method. Custom instrumentation should be used on methods that are called no more than ten or so times per transaction. If the instrumentation is built into the agent, review New Relic's custom instrumentation documentation for Java. If you need additional help, get support at support.newrelic.com.

Increase maximum Java heap size.

Carefully review your application's memory usage history and determine whether increasing the maximum Java heap size is necessary.
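As a rough aid when reviewing memory usage, the following sketch reads the current heap figures from inside the JVM using the standard java.lang.Runtime methods. The class name HeapHeadroomSketch and the helper freeHeapPercent are hypothetical, and the way free heap is computed here is an assumption that may differ from how the agent measures it.

```java
// Illustrative sketch; the agent may measure free heap differently.
final class HeapHeadroomSketch {
    // Free heap as a percentage of the maximum heap.
    // used = total (currently allocated by the JVM) - free (unused part of total).
    static double freeHeapPercent(long maxBytes, long totalBytes, long freeBytes) {
        long usedBytes = totalBytes - freeBytes;
        return 100.0 * (maxBytes - usedBytes) / maxBytes;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long max = rt.maxMemory(); // upper bound the heap may grow to (see -Xmx)
        double pct = freeHeapPercent(max, rt.totalMemory(), rt.freeMemory());
        System.out.printf("max heap: %d MiB, free heap: %.1f%%%n",
                max / (1024 * 1024), pct);
    }
}
```

A single reading says little on its own. Compare readings taken over time and under typical load before deciding whether to raise the maximum heap size.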

Disable the circuit breaker.

If your application is behaving as expected, you may want to disable the circuit breaker. To disable the circuit breaker, add the following configuration under the common stanza in your newrelic.yml configuration file.

  circuitbreaker:
    enabled: false

These settings are not included in your newrelic.yml by default. This change does not require a JVM restart.

Adjust memory and garbage collection CPU time thresholds.

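To detect early signs of memory exhaustion, the circuit breaker uses two thresholds: memory_threshold and gc_cpu_threshold. The circuit breaker trips when the percentage of free heap memory falls below memory_threshold and the percentage of CPU time spent in garbage collection rises above gc_cpu_threshold.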

The default values for these two thresholds are percentages:

    memory_threshold: 20
    gc_cpu_threshold: 10

memory_threshold applies to the percentage of free heap memory, and gc_cpu_threshold applies to the percentage of CPU time spent performing garbage collection. Both thresholds range from 0 to 100 and can be adjusted in your newrelic.yml configuration file by adding the following configuration under the common stanza. Use the values that best suit your application.

The newrelic.yml file requires exact spacing. There must be two spaces before circuitbreaker, and four spaces before memory_threshold and gc_cpu_threshold.

  circuitbreaker:
    memory_threshold: 10
    gc_cpu_threshold: 20

These settings are not included in your newrelic.yml by default. This change does not require a JVM restart.

To make the circuit breaker less likely to trip, decrease memory_threshold, increase gc_cpu_threshold, or both. Adjust these values as needed, based on your application's operating performance and behavior.
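When choosing a gc_cpu_threshold value, it can help to know roughly how much time your application already spends in garbage collection. The sketch below uses the standard java.lang.management beans to estimate this. It is an approximation under stated assumptions: it reports cumulative garbage collection time as a share of JVM uptime (wall-clock time), which is not necessarily the same measure or sampling window that the agent uses. The class name GcTimeSketch and its helper methods are hypothetical.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Illustrative sketch; approximates GC overhead as a share of JVM uptime.
final class GcTimeSketch {
    // Sums the cumulative collection time (in milliseconds) of all collectors.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 when the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    // GC time as a percentage of the elapsed time; 0 when no time has elapsed.
    static double gcPercent(long gcMillis, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            return 0.0;
        }
        return 100.0 * gcMillis / elapsedMillis;
    }

    public static void main(String[] args) {
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.printf("approximate GC share of uptime: %.2f%%%n",
                gcPercent(totalGcMillis(), uptimeMillis));
    }
}
```

If the value observed under normal load regularly sits near the configured gc_cpu_threshold, that may indicate the threshold is too tight for your application.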

For more help


Join the discussion about Java monitoring in the New Relic Online Technical Community! The Technical Community is a public platform to discuss and troubleshoot your New Relic toolset.

If you need additional help, get support at support.newrelic.com.