Varnish Cache monitoring integration

The Varnish Cache integration reports data from your Varnish environment to New Relic Infrastructure. This document explains how to install and configure the Varnish Cache integration and describes the data collected.

Access to this feature depends on your subscription level. Requires Infrastructure Pro.

Features

The Varnish Cache on-host integration collects and sends inventory and metrics from your Varnish Cache environment to New Relic Infrastructure so you can monitor its health. Metric data is collected at the instance, lock, memory pool, storage, and backend levels.

Compatibility and requirements

To use the Varnish Cache integration, ensure your system meets these requirements:

Install

On-host integrations do not automatically update. For best results, you should occasionally update the integration and update the Infrastructure agent.

To install the Varnish Cache integration:

  1. Follow the instructions for installing an integration, using the file name nri-varnish.
  2. Via the command line, change directory to the integrations folder:
    cd /etc/newrelic-infra/integrations.d
  3. Create a copy of the sample configuration file by running:
    sudo cp varnish-config.yml.sample varnish-config.yml
  4. Edit the varnish-config.yml configuration file using the configuration settings.
  5. Restart the infrastructure agent.

Configure

Use the Varnish Cache integration's varnish-config.yml configuration file to provide any required login credentials and to configure how data is collected. For an example configuration, see the example config file.

Commands

The varnish-config.yml file provides one command:

  • all_data: collects both inventory and metrics for the Varnish Cache environment.

Arguments

The varnish-config.yml commands accept the following arguments:

  • params_config_file: The location of the varnish.params configuration file. If omitted, the integration looks in /etc/default/varnish/varnish.params and /etc/sysconfig/varnish/varnish.params.

  • instance_name: User-defined name to identify data from this instance in New Relic. Required.

Example varnish-config.yml file configuration:

integration_name: com.newrelic.varnish

instances:
  - name: varnish_all
    command: all_data
    arguments:
      params_config_file: /etc/varnish/varnish.params
      instance_name: varnish-0.localnet
    labels:
      env: production
      role: varnish
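
As a rough illustration of how the two arguments behave, the following Python sketch mimics the documented lookup: it requires instance_name and falls back to the two default varnish.params locations when params_config_file is omitted. The function name resolve_arguments and its exists parameter are hypothetical helpers for this sketch, not part of the integration, and the integration's real lookup logic may differ.

```python
import os

# Default locations listed in the argument description above.
DEFAULT_PARAMS_PATHS = (
    "/etc/default/varnish/varnish.params",
    "/etc/sysconfig/varnish/varnish.params",
)


def resolve_arguments(arguments, exists=os.path.isfile):
    """Hypothetical helper: validate arguments and pick a varnish.params path."""
    instance_name = arguments.get("instance_name")
    if not instance_name:
        # instance_name is documented as required.
        raise ValueError("instance_name is required")

    explicit = arguments.get("params_config_file")
    if explicit:
        return instance_name, explicit

    # No explicit path: try the documented defaults in order.
    for candidate in DEFAULT_PARAMS_PATHS:
        if exists(candidate):
            return instance_name, candidate
    return instance_name, None
```

Passing the arguments mapping from the example above would return the instance name together with /etc/varnish/varnish.params, because an explicit path takes precedence over the defaults.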

Find and use data

To find your integration data in Infrastructure, go to infrastructure.newrelic.com > Integrations > On-host integrations and select one of the Varnish Cache integration links.

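In New Relic Insights, Varnish Cache data is attached to the following Insights event types: VarnishSample, VarnishLockSample, VarnishStorageSample, VarnishMempoolSample, and VarnishBackendSample.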

For more on how to find and use your data, see Understand integration data.

Metric data

The Varnish Cache integration collects the following metric data attributes. Each metric name is prefixed with a category indicator and a period, as in bans.added or main.threads.

These attributes can be found by querying the VarnishSample event type in Insights.

Metric Description

backend.connectionBusy

Number of times the maximum number of connections has been reached.

backend.connectionFails

Number of failed connections to the backend.

backend.connectionRecycles

Number of backend connections that have been recycled.

backend.connectionRetries

Number of backend connections that have been retried.

backend.connectionReuses

Number of backend connection reuses.

backend.connectionSuccess

Number of successful backend connections.

backend.connectionUnHealthy

Number of backend connections that were not attempted due to ‘unhealthy’ backend status.

backend.fetches

Total number of backend fetches initiated.

backend.requests

Total number of backend connection requests made.

bans.added

Counter of bans added to ban list.

bans.completed

Number of bans marked ‘completed’.

bans.cutoffLurkerKilled

Number of objects killed by bans for cutoff (lurker).

bans.deleted

Counter of bans deleted from ban list.

bans.dups

Count of bans replaced by later identical bans.

bans.fragmentationInBytes

Extra bytes in persisted ban lists due to fragmentation.

bans.lookupKilled

Number of objects killed by bans during object lookup.

bans.lookupTestsTested

Count of how many tests and objects have been tested against each other during lookup.

bans.lurkerCon

Number of times the ban-lurker had to wait for lookups.

bans.lurkerKilled

Number of objects killed by the ban-lurker.

bans.lurkerTested

Count of how many bans and objects have been tested against each other by the ban-lurker.

bans.lurkerTestsTested

Count of how many tests and objects have been tested against each other by the ban-lurker.

bans.obj

Number of bans using obj.* variables. These bans can possibly be washed by the ban-lurker.

bans.persistedInBytes

Bytes used by the persisted ban lists.

bans.req

Number of bans that use req.* variables. These bans cannot be washed by the ban-lurker.

bans.tested

Count of how many bans and objects have been tested against each other during hash lookup.

cache.graceHits

Count of cache hits with grace. A cache hit with grace is a cache hit where the object is expired. These hits are also included in the cache_hit counter.

cache.hits

Number of times an object has been delivered to a client without fetching it from a backend server.

cache.misses

Number of times the object was fetched from the backend before delivering it to the client.

cache.missHits

Number of times a hit object was returned for a miss response.

cache.passHits

Number of times a hit object was returned for a pass response.

esi.errors

Edge Side Includes (ESI) parsing errors (unlock).

esi.warnings

Edge Side Includes (ESI) parse warnings (unlock).

fetch.bad

The beresp.body length/fetch could not be determined.

fetch.chuncked

The beresp.body chunked.

fetch.contentLength

The beresp.body with content-length.

fetch.eof

The beresp.body with EOF.

fetch.failed

The beresp failed.

fetch.head

The beresp with no body because the request is HEAD.

fetch.noBody

The beresp with no body.

fetch.noBody1xx

The beresp with no body because of 1XX response.

fetch.noBody204

The beresp with no body because of 204 response.

fetch.noBody304

The beresp with no body because of 304 response.

fetch.noThreadFail

The beresp fetch failed, no thread available.

hcb.inserts

Number of critical bit tree-based hash (HCB) inserts.

hcb.lock

Number of HCB lookups with lock.

hcb.noLock

Number of HCB lookups without lock.

lru.limited

Number of times more storage space was needed, but limit was reached.

lru.moved

Number of move operations done on the LRU list.

lru.nuked

Number of least recently used (LRU) objects forcefully evicted from storage to make room for a new object.

main.backends

Number of backends.

main.bans

Count of bans.

main.busyKilled

Number of requests killed after sleep on busy objhdr.

main.busySleep

Number of requests sent to sleep on busy objhdr.

main.busyWakeup

Number of requests woken after sleep on busy objhdr.

main.expired

Number of expired objects.

main.expiredMailed

Number of objects mailed to expiry thread.

main.expiredReceived

Number of objects received by expiry thread.

main.gunzip

Number of gunzip operations.

main.gunzipTest

Number of test gunzip operations.

main.gzip

Number of gzip operations.

main.objectcores

Number of objectcore structs made.

main.objectheads

Number of objecthead structs made.

main.objects

Number of object structs made.

main.passedRequests

Total passed requests seen.

main.pipeSessions

Total pipe sessions seen.

main.pools

Number of thread pools.

main.purgeObjects

Number of purged objects.

main.purgeOperations

Number of purge operations executed.

main.reqDropped

Number of requests dropped.

main.sessions

Total number of sessions seen.

main.sessQueueLength

Length of session queue waiting for threads.

main.summs

Number of times per-thread statistics were summed into the global counters.

main.syntheticResponses

Total synthetic responses made.

main.threads

Total number of threads.

main.threadsCreated

Total number of threads created in all pools.

main.threadsDestroyed

Total number of threads destroyed in all pools.

main.threadsFailed

Number of times creating a thread failed.

main.threadsLimited

Number of times more threads were needed, but limit was reached in a thread pool.

main.unresurrectedObjects

Number of unresurrected objects.

main.uptimeInMilliseconds

The child process uptime, in milliseconds.

main.vclAvailable

Number of Varnish Configuration Language (VCL) configurations available.

main.vclDiscarded

Number of discarded VCLs.

main.vclFails

Number of VCL failures.

main.vclLoaded

Number of loaded VCLs in total.

main.vmodsLoaded

Number of loaded Varnish modules (VMOD).

mgt.childDied

Number of times the child process has died due to signals.

mgt.childDump

Number of times the child process has produced core dumps.

mgt.childExit

Number of times the child process has been cleanly stopped.

mgt.childPanic

Number of times the management process has caught a child panic.

mgt.childStart

Number of times the child process has been started.

mgt.childStop

Number of times the child process has been cleanly stopped.

mgt.uptimeInMilliseconds

The management process uptime, in milliseconds.

net.400Errors

Number of client requests received, subject to 400 errors.

net.417Errors

Number of client requests received, subject to 417 errors.

net.httpOverflow

Number of HTTP header overflows.

net.pipe.inInBytes

Total number of bytes forwarded from clients in pipe sessions.

net.pipe.outInBytes

Total number of bytes forwarded to clients in pipe sessions.

net.pipereq.headerInBytes

Total request bytes received for piped sessions.

net.request.bodyInBytes

Total request body transmitted, in bytes.

net.request.headerInBytes

Total request headers transmitted, in bytes.

net.requests

Number of good client requests received.

net.response.bodyInBytes

Total response body transmitted, in bytes.

net.response.headerInBytes

Total response headers transmitted, in bytes.

sess.backendClose

Number of session closes with the error RESP_CLOSE, (Backend/VCL requested close).

sess.badClose

Number of session closes with the error RX_BAD, (Received bad req/resp).

sess.bodyFailClose

Number of session closes with the error RX_BODY, (Failure receiving req.body).

sess.clientClose

Number of session closes with the error REM_CLOSE, (Client closed).

sess.clientReqClose

Number of session closes with the error REQ_CLOSE, (Client requested close).

sess.closed

Total number of sessions closed.

sess.closedError

Total number of sessions closed with errors.

sess.dropped

Number of sessions dropped for thread.

sess.eofTxnClose

Number of session closes with the error TX_EOF, (EOF transmission).

sess.errorTxnClose

Number of session closes with the error TX_ERROR, (Error transaction).

sess.herd

Number of times the timeout_linger triggered.

sess.junkClose

Number of session closes with the error RX_JUNK, (Received junk data).

sess.overflowClose

Number of session closes with the error RX_OVERFLOW, (Received buffer overflow).

sess.overloadClose

Number of session closes with the error OVERLOAD, (Out of some resource).

sess.pipeOverflowClose

Number of session closes with the error PIPE_OVERFLOW, (Session pipe overflow).

sess.pipeTxnClose

Number of session closes with the error TX_PIPE, (Piped transaction).

sess.queued

Number of sessions queued for thread.

sess.readAhead

Session Read Ahead.

sess.requestHTTP10Close

Number of session closes with the error REQ_HTTP10, (Proto < HTTP/1.1).

sess.requestHTTP20Close

Number of session closes with the error REQ_HTTP20, (HTTP2 not accepted).

sess.shortRangeClose

Number of session closes with the error RANGE_SHORT, (Insufficient data for range).

sess.timeoutClose

Number of session closes with the error RX_TIMEOUT, (Receive timeout).

sess.vclFailClose

Number of session closes with the error VCL_FAILURE, (VCL failure).

session.connections

Count of sessions successfully accepted.

session.drops

Count of sessions silently dropped due to lack of worker thread.

session.fail

Count of failures to accept TCP connection.

shm.contentions

Number of shared memory (SHM) MTX contentions.

shm.cycles

Number of SHM cycles through buffer.

shm.flushes

Number of SHM flushes due to overflow.

shm.records

Number of SHM records.

shm.writes

Number of SHM writes.

workspace.backendOverflow

Number of times we ran out of space in workspace_backend.

workspace.clientOverflow

Number of times we ran out of space in workspace_client.

workspace.deliveryFail

Delivery failed due to insufficient workspace.

workspace.sessionOverflow

Number of times we ran out of space in workspace_session.

workspace.threadOverflow

Number of times we ran out of space in workspace_thread.
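
One common use of the cache counters listed above is a cache hit ratio. The sketch below assumes cache.hits and cache.misses are cumulative counters and that you have two samples taken at different times; the function name hit_ratio and the sample dictionaries are hypothetical. The integration may already report some values as rates, in which case the subtraction step would not apply.

```python
def hit_ratio(previous, current):
    """Return hits / (hits + misses) for the interval between two samples.

    Both arguments are dicts keyed by metric name, holding cumulative counts.
    Returns None when no lookups happened or a counter went backwards
    (for example, after a Varnish restart).
    """
    hits = current["cache.hits"] - previous["cache.hits"]
    misses = current["cache.misses"] - previous["cache.misses"]
    if hits < 0 or misses < 0:
        return None
    total = hits + misses
    if total == 0:
        return None
    return hits / total


# Illustrative numbers only, not real measurements.
earlier = {"cache.hits": 900, "cache.misses": 100}
later = {"cache.hits": 1200, "cache.misses": 200}
print(hit_ratio(earlier, later))  # 300 hits out of 400 lookups, prints 0.75
```

Working from the difference between samples keeps the ratio tied to recent traffic instead of the whole lifetime of the process.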

These attributes can be found by querying the VarnishLockSample event type in Insights.

Metric Description

lock.created

Count of created locks.

lock.destroyed

Count of destroyed locks.

lock.locks

Count of lock operations.

These attributes can be found by querying the VarnishStorageSample event type in Insights.

Metric Description

storage.allocFails

Number of times the storage has failed to provide a storage segment.

storage.allocInBytes

Number of total bytes allocated by this storage.

storage.allocOustanding

Number of storage allocations outstanding.

storage.allocReqs

Number of times the storage has been asked to provide a storage segment.

storage.availableInBytes

Number of bytes left in the storage.

storage.freeInBytes

Number of total bytes returned to this storage.

storage.outstandingInBytes

Number of bytes allocated from the storage.

These attributes can be found by querying the VarnishMempoolSample event type in Insights.

Metric Description

mempool.allocatedSizeInBytes

Allocated size of memory pool, in bytes.

mempool.allocs

Memory pool allocations.

mempool.frees

Number of memory pools free.

mempool.live

Number of memory pools in use.

mempool.pool

Count in memory pool.

mempool.ranDry

Pool ran dry.

mempool.recycles

Recycled from pool.

mempool.requestSizeInBytes

Request size of memory pool, in bytes.

mempool.surplus

Too many for pool.

mempool.timeouts

Timed out from pool.

mempool.tooSmall

Too small to recycle.

These attributes can be found by querying the VarnishBackendSample event type in Insights.

Metric Description

backend.busyFetches

Fetches not attempted due to backend being busy.

backend.connections

Number of concurrent connections to the backend.

backend.connectionsFailed

Number of backend connections failed.

backend.connectionsNotAttempted

Number of backend connection opens not attempted.

backend.happy

Happy health probes.

backend.unhealtyFetches

Fetches not attempted due to backend being unhealthy.

net.backend.pipeHeaderInBytes

Total request bytes sent for piped sessions.

net.backend.pipeInInBytes

Total number of bytes forwarded from backend in pipe sessions.

net.backend.pipeOutInBytes

Total number of bytes forwarded to backend in pipe sessions.

net.backend.requestBodyInBytes

Total backend request body bytes sent.

net.backend.requestHeaderInBytes

Total backend request header bytes sent.

net.backend.requests

Number of backend requests sent.

net.backend.responseBodyInBytes

Total backend response body bytes received.

net.backend.responseHeaderInBytes

Total backend response header bytes received.

Inventory data

The Varnish Cache integration captures the configuration parameters. It parses the varnish.params configuration file for all parameters that are active.

The data is available on the Infrastructure Inventory page, under the config/varnish source. For more about inventory data, see Understand integration data.
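
To make the parsing step more concrete, here is a minimal sketch of reading active parameters from a varnish.params-style file. It assumes the file uses shell-style KEY=value lines with # marking commented-out (inactive) entries; parse_params is a hypothetical name, the sample values are illustrative, and the integration's actual parser may behave differently.

```python
def parse_params(text):
    """Collect KEY=value pairs from uncommented lines (hypothetical helper)."""
    params = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments (inactive parameters), and non-assignments.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Drop surrounding quotes so values compare cleanly.
        params[key.strip()] = value.strip().strip('"').strip("'")
    return params


# Illustrative file content, assumed for this sketch.
sample = """
# VARNISH_ADMIN_LISTEN_PORT=6082
VARNISH_LISTEN_PORT=6081
VARNISH_STORAGE="malloc,256M"
"""
print(parse_params(sample))
```

In this sample only the two uncommented parameters are returned, which matches the idea of reporting parameters that are active.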

For more help

Recommendations for learning more: