
Google Cloud Dataproc monitoring integration

New Relic offers an integration for reporting your GCP Dataproc data to our products. Here we explain how to activate the integration and what data it collects.

Activate integration

To enable the integration, follow standard procedures to connect your GCP service to New Relic.

Configuration and polling

You can change the polling frequency and filter data using configuration options.

Default polling information for the GCP Dataproc integration:

  • New Relic polling interval: 5 minutes

Find and use data

To find your integration data, go to one.newrelic.com > All capabilities > Infrastructure > GCP and select an integration.

Data is attached to the following event type:

| Entity | Event Type | Provider |
| --- | --- | --- |
| Cluster | GcpDataprocClusterSample | GcpDataprocCluster |

For more on how to use your data, see Understand and use integration data.
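As a rough illustration of querying this event type, the sketch below builds an NRQL query string for one metric, bucketed to match the 5-minute polling interval. The helper name `build_nrql` and its parameters are hypothetical, and the exact attribute name stored on the event is an assumption: it may differ from the metric name shown in this document (for example, it may carry a prefix), so check the attributes on your own data first, for instance with `SELECT keyset() FROM GcpDataprocClusterSample`.

```python
def build_nrql(
    metric: str,
    event_type: str = "GcpDataprocClusterSample",
    since_minutes: int = 30,
    bucket_minutes: int = 5,
) -> str:
    """Return an NRQL query that averages one metric over time.

    `metric` is assumed to be the attribute name as it appears on the
    event; verify it against your own data before relying on it.
    """
    if since_minutes <= 0 or bucket_minutes <= 0:
        raise ValueError("time windows must be positive")
    # Backticks quote the attribute name, which contains dots.
    return (
        f"SELECT average(`{metric}`) FROM {event_type} "
        f"SINCE {since_minutes} minutes ago "
        f"TIMESERIES {bucket_minutes} minutes"
    )


if __name__ == "__main__":
    print(build_nrql("cluster.hdfs.StorageUtilization"))
```

The resulting string can be pasted into the query builder. A bucket smaller than the polling interval would mostly produce empty buckets, which is why the default matches the 5 minutes listed above.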

Metric data

This integration collects GCP Dataproc data for Cluster.

Dataproc Cluster data

| Metric | Unit | Description |
| --- | --- | --- |
| cluster.hdfs.Datanodes | Count | Indicates the number of HDFS DataNodes that are running inside a cluster. |
| cluster.hdfs.StorageCapacity | Gibibytes | Indicates the capacity, in GiB, of the HDFS system running on the cluster. |
| cluster.hdfs.StorageUtilization | Percent | The percentage of HDFS storage currently used. |
| cluster.hdfs.UnhealthyBlocks | Count | Indicates the number of unhealthy blocks inside the cluster. |
| cluster.job.CompletionTime | Seconds | The time jobs took to complete, measured from when the user submits a job to when Dataproc reports it as completed. |
| cluster.job.Duration | Seconds | The time jobs have spent in a given state. |
| cluster.job.Failures | Count | Indicates the number of jobs that have failed on a cluster. |
| cluster.job.Running | Count | Indicates the number of jobs that are running on a cluster. |
| cluster.job.Submitted | Count | Indicates the number of jobs that have been submitted to a cluster. |
| cluster.operation.CompletionTime | Seconds | The time operations took to complete, measured from when the user submits an operation to when Dataproc reports it as completed. |
| cluster.operation.Duration | Seconds | The time operations have spent in a given state. |
| cluster.operation.Failures | Count | Indicates the number of operations that have failed on a cluster. |
| cluster.operation.Running | Count | Indicates the number of operations that are running on a cluster. |
| cluster.operation.Submitted | Count | Indicates the number of operations that have been submitted to a cluster. |
| cluster.yarn.AllocatedMemoryPercentage | Percent | The percentage of YARN memory that is allocated. |
| cluster.yarn.Apps | Count | Indicates the number of active YARN applications. |
| cluster.yarn.Containers | Count | Indicates the number of YARN containers. |
| cluster.yarn.MemorySize | Gibibytes | Indicates the YARN memory size in GiB. |
| cluster.yarn.Nodemanagers | Count | Indicates the number of YARN NodeManagers running inside the cluster. |
| cluster.yarn.PendingMemorySize | Gibibytes | The current memory request, in GiB, that is pending to be fulfilled by the scheduler. |
| cluster.yarn.VirtualCores | Count | Indicates the number of virtual cores in YARN. |
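Some of these metrics can be combined into derived values for dashboards or capacity checks. The sketch below is one possible way to do that, under two assumptions that this document does not state explicitly: that `cluster.hdfs.StorageUtilization` is a percentage of `cluster.hdfs.StorageCapacity`, and that `cluster.yarn.AllocatedMemoryPercentage` is a percentage of `cluster.yarn.MemorySize`. The function names are hypothetical.

```python
def _check_percent(value: float) -> None:
    """Reject percentages outside the 0 to 100 range."""
    if not 0.0 <= value <= 100.0:
        raise ValueError(f"percentage out of range: {value}")


def hdfs_used_gib(capacity_gib: float, utilization_percent: float) -> float:
    """Estimate HDFS storage in use, in GiB.

    Assumes utilization is expressed as a percentage of capacity.
    """
    _check_percent(utilization_percent)
    return capacity_gib * utilization_percent / 100.0


def yarn_unallocated_gib(memory_size_gib: float, allocated_percent: float) -> float:
    """Estimate YARN memory not yet allocated, in GiB.

    Assumes the allocated percentage is relative to the YARN memory size.
    """
    _check_percent(allocated_percent)
    return memory_size_gib * (100.0 - allocated_percent) / 100.0


if __name__ == "__main__":
    # Example values only; substitute figures from your own cluster.
    print(hdfs_used_gib(200.0, 25.0))
    print(yarn_unallocated_gib(64.0, 75.0))
```

Comparing the unallocated YARN memory with `cluster.yarn.PendingMemorySize` gives a quick indication of whether pending requests could fit without scaling the cluster, again assuming the relationships described above hold for your data.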

Copyright © 2024 New Relic Inc.
