Google Cloud Dataflow monitoring integration

New Relic offers an integration that reports your GCP Dataflow data to our products. This page explains how to activate the integration and what data it collects.

Activate integration

To enable the integration, follow the standard procedures to connect your GCP service to New Relic.

Configuration and polling

You can change the polling frequency and filter data using configuration options.

Default polling information for the GCP Dataflow integration:

New Relic polling interval: 5 minutes

Find and use data

To find your integration data, go to one.newrelic.com > All capabilities > Infrastructure > GCP and select an integration.

Data is attached to the following event type:

Entity	Event Type	Provider
Job	`GcpDataflowJobSample`	`GcpDataflowJob`

For more on how to use your data, see Understand and use integration data.
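As a starting point for exploring this data, the following minimal sketch builds an NRQL query string that selects recent `GcpDataflowJobSample` events for the `GcpDataflowJob` provider. The helper name `build_dataflow_query` and its parameters are hypothetical, and the sketch assumes that the samples carry a `provider` attribute you can filter on; verify this against your own account before relying on it.

```python
# Event type and provider values are taken from the table above.
EVENT_TYPE = "GcpDataflowJobSample"
PROVIDER = "GcpDataflowJob"


def build_dataflow_query(since_minutes: int = 30, limit: int = 10) -> str:
    """Return an NRQL query string for recent Dataflow job samples.

    Hypothetical helper: it only formats a string and does not call any API.
    Assumes the samples expose a `provider` attribute.
    """
    if since_minutes <= 0 or limit <= 0:
        raise ValueError("since_minutes and limit must be positive")
    return (
        f"SELECT * FROM {EVENT_TYPE} "
        f"WHERE provider = '{PROVIDER}' "
        f"SINCE {since_minutes} minutes ago LIMIT {limit}"
    )


if __name__ == "__main__":
    print(build_dataflow_query())
```

You can paste the resulting string into the query builder. Because the default polling interval is 5 minutes, a time window shorter than that may return no samples.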

Metric data

This integration collects GCP Dataflow data for the Job entity.

Dataflow Job data

Metric	Unit	Description
`job.BillableShuffleDataProcessed`	Bytes	The billable bytes of shuffle data processed by this Dataflow job.
`job.CurrentNumVcpus`	Count	The number of vCPUs currently being used by this Dataflow job. This is the current number of workers times the number of vCPUs per worker.
`job.CurrentShuffleSlots`	Count	The current shuffle slots used by this Dataflow job.
`job.DataWatermarkAge`	Seconds	The age (time since event timestamp) up to which all data has been processed by the pipeline.
`job.ElapsedTime`	Seconds	Duration that the current run of this pipeline has been in the Running state so far, in seconds. When a run completes, this stays at the duration of that run until the next run starts.
`job.Elements`	Count	The number of elements added to the PCollection so far.
`job.EstimatedBytes`	Bytes	An estimated number of bytes added to the PCollection so far. Dataflow calculates the average encoded size of elements in a PCollection and multiplies it by the number of elements.
`job.IsFailed`	Count	Whether this job has failed.
`job.PerStageDataWatermarkAge`	Seconds	The age (time since event timestamp) up to which all data has been processed by this stage of the pipeline.
`job.PerStageSystemLag`	Seconds	The current maximum duration, in seconds, that an item of data has been processing or awaiting processing, per pipeline stage.
`job.SystemLag`	Seconds	The current maximum duration that an item of data has been processing or awaiting processing, in seconds.
`job.TotalMemoryUsageTime`	Other	The total GB seconds of memory allocated to this Dataflow job.
`job.TotalPdUsageTime`	Other	The total GB seconds for all persistent disk used by all workers associated with this Dataflow job.
`job.TotalShuffleDataProcessed`	Bytes	The total bytes of shuffle data processed by this Dataflow job.
`job.TotalStreamingDataProcessed`	Bytes	The total bytes of streaming data processed by this Dataflow job.
`job.TotalVcpuTime`	Seconds	The total vCPU seconds used by this Dataflow job.
`job.UserCounter`	Count	A user-defined counter metric.
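Two of the descriptions above define a metric as a product of other quantities. The sketch below works through that arithmetic with made-up inputs. The function names and sample values are illustrative assumptions, not part of the integration, and the integration reports these metrics directly, so you do not need to compute them yourself.

```python
def current_num_vcpus(workers: int, vcpus_per_worker: int) -> int:
    """job.CurrentNumVcpus: current workers times vCPUs per worker."""
    return workers * vcpus_per_worker


def estimated_bytes(avg_encoded_size_bytes: float, element_count: int) -> float:
    """job.EstimatedBytes: average encoded element size times element count."""
    return avg_encoded_size_bytes * element_count


if __name__ == "__main__":
    # Illustrative values only: 3 workers with 4 vCPUs each.
    print(current_num_vcpus(3, 4))
    # Illustrative values only: 1,000 elements averaging 128 bytes each.
    print(estimated_bytes(128.0, 1000))
```

A sanity check of this kind can help when a reported value looks surprising, for example when `job.CurrentNumVcpus` changes because the number of workers changed.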