Google Cloud Dataflow monitoring integration

New Relic offers an integration for reporting your Google Cloud Dataflow data to our products. Here we explain how to activate the integration and what data it collects.

Activate integration

To enable the integration, follow our standard procedures to connect your GCP service to New Relic.

Configuration and polling

You can change the polling frequency and filter data using configuration options.

Default polling information for the GCP Dataflow integration:

  • New Relic polling interval: 5 minutes

Find and use data

To find your integration data, go to one.newrelic.com > Infrastructure > GCP and select an integration.

Data is attached to the following event type:

Entity | Event type           | Provider
Job    | GcpDataflowJobSample | GcpDataflowJob

For more on how to use your data, see Understand and use integration data.
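Because the integration reports events of type GcpDataflowJobSample, you can also query the data with NRQL. As a minimal sketch, the hypothetical helper below (not part of any New Relic SDK) builds an NRQL query string for one of the metrics in this document; the time window is an illustrative choice:

```python
def dataflow_nrql(metric, minutes=30):
    """Build an NRQL query averaging one GCP Dataflow job metric.

    `metric` is an attribute name such as "job.SystemLag";
    GcpDataflowJobSample is the event type this integration reports.
    """
    return (
        f"SELECT average(`{metric}`) FROM GcpDataflowJobSample "
        f"SINCE {minutes} minutes ago"
    )

# Example: average system lag over the last 30 minutes.
print(dataflow_nrql("job.SystemLag"))
```

You can paste the resulting query into the query builder in one.newrelic.com to chart the metric.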

Metric data

This integration collects GCP Dataflow data for Job.

Dataflow Job data

Metric | Unit | Description
job.BillableShuffleDataProcessed | Bytes | The billable bytes of shuffle data processed by this Dataflow job.
job.CurrentNumVcpus | Count | The number of vCPUs currently being used by this Dataflow job: the current number of workers times the number of vCPUs per worker.
job.CurrentShuffleSlots | Count | The current shuffle slots used by this Dataflow job.
job.DataWatermarkAge | Seconds | The age (time since the event timestamp) up to which all data has been processed by the pipeline.
job.ElapsedTime | Seconds | Duration that the current run of this pipeline has been in the Running state so far, in seconds. When a run completes, this stays at the duration of that run until the next run starts.
job.Elements | Count | Number of elements added to the PCollection so far.
job.EstimatedBytes | Bytes | An estimate of the number of bytes added to the PCollection so far. Dataflow calculates the average encoded size of elements in a PCollection and multiplies it by the number of elements.
job.IsFailed | Count | Whether this job has failed.
job.PerStageDataWatermarkAge | Seconds | The age (time since the event timestamp) up to which all data has been processed by this stage of the pipeline.
job.PerStageSystemLag | Seconds | The current maximum duration that an item of data has been processing or awaiting processing, in seconds, per pipeline stage.
job.SystemLag | Seconds | The current maximum duration that an item of data has been processing or awaiting processing, in seconds.
job.TotalMemoryUsageTime | Other | The total GB-seconds of memory allocated to this Dataflow job.
job.TotalPdUsageTime | Other | The total GB-seconds for all persistent disk used by all workers associated with this Dataflow job.
job.TotalShuffleDataProcessed | Bytes | The total bytes of shuffle data processed by this Dataflow job.
job.TotalStreamingDataProcessed | Bytes | The total bytes of streaming data processed by this Dataflow job.
job.TotalVcpuTime | Seconds | The total vCPU-seconds used by this Dataflow job.
job.UserCounter | Count | A user-defined counter metric.
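Two of the metrics above are derived values. Per their descriptions, job.CurrentNumVcpus is the current worker count times the vCPUs per worker, and job.EstimatedBytes is the average encoded element size times the element count. A small sketch with illustrative numbers (not from a real job):

```python
# job.CurrentNumVcpus: current workers x vCPUs per worker (example values).
workers = 4
vcpus_per_worker = 2
current_num_vcpus = workers * vcpus_per_worker

# job.EstimatedBytes: average encoded element size x elements in the
# PCollection (example values).
elements = 1_000_000
avg_encoded_size_bytes = 48
estimated_bytes = elements * avg_encoded_size_bytes

print(current_num_vcpus, estimated_bytes)
```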

For more help

If you need more help, check out these support and learning resources: