AWS Glue monitoring integration

BETA

Access to this feature depends on your subscription level. Requires Infrastructure Pro.

New Relic Infrastructure integrations include one that reports your AWS Glue data to New Relic products. This document explains how to activate the integration and describes the data it can report.

Activate integration

To enable this integration, follow the standard procedures to Connect AWS services to Infrastructure.

Configuration and polling

You can change the polling frequency and filter data using configuration options.

Default polling information for the AWS Glue integration:

  • New Relic polling interval: 5 minutes
  • Amazon CloudWatch data interval: 1 minute
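To make the relationship between the two default intervals concrete, here is a minimal sketch of the arithmetic. It assumes (this document does not state it) that each poll covers exactly the time elapsed since the previous poll; the function and constant names are illustrative only.

```python
# Default intervals from the list above, expressed in seconds.
NEW_RELIC_POLLING_INTERVAL_S = 5 * 60
CLOUDWATCH_DATA_INTERVAL_S = 60


def datapoints_per_poll(polling_interval_s: int, data_interval_s: int) -> int:
    """Return how many CloudWatch datapoints fall inside one polling window.

    Assumption (not stated in this document): each poll covers exactly the
    time elapsed since the previous poll.
    """
    if polling_interval_s <= 0 or data_interval_s <= 0:
        raise ValueError("intervals must be positive")
    return polling_interval_s // data_interval_s


print(datapoints_per_poll(NEW_RELIC_POLLING_INTERVAL_S, CLOUDWATCH_DATA_INTERVAL_S))
```

Under that assumption, one polling window at the default settings spans five one-minute CloudWatch datapoints.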

Find and use data

To find your integration data in Infrastructure, go to infrastructure.newrelic.com > AWS and select an integration.

In New Relic Insights, data is attached to the following event type:

Entity    Event Type          Provider
Job       AwsGlueJobSample    AwsGlueJob

For more on how to use your data, see Understand and use integration data.
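As a rough illustration of querying this event type, the sketch below assembles an NRQL query string in Python. The event type and provider values come from the table above. The attribute name in the usage line is a hypothetical placeholder, not a documented attribute, so check the attributes that actually appear on your AwsGlueJobSample events before running a query like this.

```python
EVENT_TYPE = "AwsGlueJobSample"  # event type listed in this document
PROVIDER = "AwsGlueJob"          # provider listed in this document


def build_nrql(attribute: str, aggregator: str = "average",
               since: str = "1 hour ago") -> str:
    """Build an NRQL query string for one attribute of the Glue Job events.

    `attribute` is supplied by the caller; this helper does not know which
    attribute names exist on your events.
    """
    allowed = {"average", "max", "min", "sum", "latest"}
    if aggregator not in allowed:
        raise ValueError(f"unsupported aggregator: {aggregator}")
    if not attribute or "`" in attribute:
        raise ValueError("attribute must be non-empty and contain no backticks")
    # Backticks let NRQL accept attribute names that contain dots.
    return (
        f"SELECT {aggregator}(`{attribute}`) "
        f"FROM {EVENT_TYPE} "
        f"WHERE provider = '{PROVIDER}' "
        f"SINCE {since}"
    )


# Hypothetical attribute name, used only to show the shape of the query.
print(build_nrql("provider.exampleMetric.Average"))
```

The helper only builds text. You would paste the resulting string into the Insights query bar or pass it to whatever query client you already use.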

Metric data

This integration collects AWS Glue data for the Job entity.

Glue Job data

Metric Unit Description

glue.driver.aggregate.bytesRead

Bytes The number of bytes read from all data sources by all completed Spark tasks running in all executors.

glue.driver.aggregate.elapsedTime

Milliseconds The ETL elapsed time in milliseconds (does not include the job bootstrap times).

glue.driver.aggregate.numCompletedStages

Count The number of completed stages in the job.

glue.driver.aggregate.numCompletedTasks

Count The number of completed tasks in the job.

glue.driver.aggregate.numFailedTasks

Count The number of failed tasks.

glue.driver.aggregate.numKilledTasks

Count The number of tasks killed.

glue.driver.aggregate.recordsRead

Count The number of records read from all data sources by all completed Spark tasks running in all executors.

glue.driver.aggregate.shuffleBytesWritten

Bytes The number of bytes written by all executors to shuffle data between them since the previous report (aggregated by the AWS Glue Metrics Dashboard as the number of bytes written for this purpose during the previous minute).

glue.driver.aggregate.shuffleLocalBytesRead

Bytes The number of bytes read by all executors to shuffle data between them since the previous report (aggregated by the AWS Glue Metrics Dashboard as the number of bytes read for this purpose during the previous minute).

glue.driver.BlockManager.disk.diskSpaceUsed_MB

Megabytes The number of megabytes of disk space used across all executors.

glue.driver.ExecutorAllocationManager.executors.numberAllExecutors

Count The number of actively running job executors.

glue.driver.ExecutorAllocationManager.executors.numberMaxNeededExecutors

Count The maximum number of job executors (actively running and pending) needed to satisfy the current load.

glue.driver.jvm.heap.usage

Percent The fraction of memory used by the JVM heap (scale: 0-1) for the driver.

glue.ALL.jvm.heap.usage

Percent The fraction of memory used by the JVM heap (scale: 0-1) for ALL executors.

glue.driver.jvm.heap.used

Bytes The number of memory bytes used by the JVM heap for the driver.

glue.ALL.jvm.heap.used

Bytes The number of memory bytes used by the JVM heap for ALL executors.

glue.driver.s3.filesystem.read_bytes

Bytes The number of bytes read from Amazon S3 by the driver since the previous report (aggregated by the AWS Glue Metrics Dashboard as the number of bytes read during the previous minute).

glue.ALL.s3.filesystem.read_bytes

Bytes The number of bytes read from Amazon S3 by ALL executors since the previous report (aggregated by the AWS Glue Metrics Dashboard as the number of bytes read during the previous minute).

glue.driver.s3.filesystem.write_bytes

Bytes The number of bytes written to Amazon S3 by the driver since the previous report (aggregated by the AWS Glue Metrics Dashboard as the number of bytes written during the previous minute).

glue.ALL.s3.filesystem.write_bytes

Bytes The number of bytes written to Amazon S3 by ALL executors since the previous report (aggregated by the AWS Glue Metrics Dashboard as the number of bytes written during the previous minute).

glue.driver.system.cpuSystemLoad

Percent The fraction of CPU system load used (scale: 0-1) by the driver.

glue.ALL.system.cpuSystemLoad

Percent The fraction of CPU system load used (scale: 0-1) by ALL executors.
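Two conventions in the table are easy to misread. Metrics with the unit Percent are reported on a 0-1 scale, and several byte counters are reported "since the previous report" rather than as running totals. The sketch below shows one way to handle both in Python. The helper names and sample values are made up for illustration and are not part of the integration.

```python
from typing import Iterable


def fraction_to_percent(fraction: float) -> float:
    """Convert a 0-1 fraction (for example, heap usage) to a 0-100 percentage."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("expected a value on the 0-1 scale")
    return fraction * 100.0


def total_from_deltas(deltas: Iterable[float]) -> float:
    """Sum 'since the previous report' values into a total for the window."""
    return float(sum(deltas))


# Made-up sample values, only to show how the helpers are called.
heap_usage = 0.5                      # e.g. a glue.driver.jvm.heap.usage reading
s3_read_deltas = [1024, 2048, 512]    # e.g. three consecutive read_bytes reports

print(fraction_to_percent(heap_usage))
print(total_from_deltas(s3_read_deltas))
```

Summing the per-report values gives the bytes transferred over the window they cover. Averaging them instead would give a per-report rate, which is a different quantity.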
