AWS RDS Enhanced Monitoring integration

Access to this feature depends on your subscription level. Requires Infrastructure Pro.

New Relic Infrastructure integrations include an integration for reporting enhanced AWS RDS data. It supplements the basic New Relic RDS integration with real-time metrics about the operating system the database instance runs on.

Features

The New Relic Infrastructure integration allows you to monitor and alert on RDS Enhanced Monitoring data. You can use integration data and alerts to monitor DB processes and identify potential trouble spots, and to profile the DB so that you can improve and optimize its response and cost.

Enable enhanced monitoring

Enabling this integration will incur some additional charges to your Amazon CloudWatch account. In addition, there are some limitations and CPU metric data collection differences, which are explained in Amazon's enhanced monitoring documentation.

You must first have the New Relic AWS RDS monitoring integration enabled before enabling RDS Enhanced Monitoring. Be sure that you have completed the steps in Connect AWS services to Infrastructure.

New Relic uses AWS Lambda with RDS to provide enhanced metrics with quick-reporting data. Follow these steps to enable the RDS Enhanced Monitoring integration.

Once enabled, a log group called RDSOSMetrics will be created in AWS CloudWatch Logs. Enhanced monitoring metrics will be available, via this log group, to the Lambda function used to obtain the data.

  1. Enable RDS Enhanced Monitoring for each of your RDS instances. When creating or modifying an instance, under Monitoring, set Enable Enhanced Monitoring to Yes, and select the data Granularity. Recommendation: 15 seconds granularity.
  2. From the Encryption keys section of the AWS Identity and Access Management (IAM) console, create a new AWS Key Management Service (KMS) encryption key that will be used to encrypt and decrypt your New Relic license key. Optional: Add tags and permissions to the encryption key. Recommendation: Use newrelic-integrations as the alias. Use the ID of the new encryption key, which is the last part of the ARN, in step 4.

  3. Create a new AWS Lambda function: Select Lambda > Functions > AWS Serverless Application Repository and use the application called NewRelic-log-ingestion.
  4. Enter the key ID that was created in step 2, and select Deploy. This action creates a new CloudFormation stack, which creates a new function called newrelic-log-ingestion and the required role.
  5. Go to the newrelic-log-ingestion function, add the LICENSE_KEY environment variable, and set it to your New Relic account license key. Then, encrypt it as follows:
    • Under Encryption configuration, check the Enable helpers for encryption in transit option and select the newly created newrelic-integrations KMS key.
    • Click the Encrypt button for the LICENSE_KEY environment variable.
  6. (Optional) If you are using New Relic's EU region, add the environment variable NR_REGION: EU to your function.
  7. Once the Lambda function is created, link the RDSOSMetrics log group to the function (JSON format):

    • From AWS Console > CloudWatch > Logs, select RDSOSMetrics log group and apply Actions > Stream to AWS Lambda.
    • For the Lambda function, select newrelic-log-ingestion.
    • From Configure Log Format and Filters, select JSON as the Log Format.
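
Steps 1 and 2 can also be expressed programmatically. The following is a minimal sketch, not New Relic or AWS tooling: the helper names enhanced_monitoring_params and kms_key_id_from_arn are hypothetical, and the instance identifier, role ARN, and key ARN are placeholder values. The parameter names DBInstanceIdentifier, MonitoringInterval, MonitoringRoleArn, and ApplyImmediately follow the AWS RDS ModifyDBInstance API. The sketch only builds the request and does not call AWS.

```python
# Granularities (in seconds) accepted by RDS Enhanced Monitoring; 0 disables it.
VALID_INTERVALS = {1, 5, 10, 15, 30, 60}


def enhanced_monitoring_params(instance_id, monitoring_role_arn, interval=15):
    """Build ModifyDBInstance arguments that enable Enhanced Monitoring (step 1)."""
    if interval not in VALID_INTERVALS:
        raise ValueError(f"unsupported granularity: {interval} seconds")
    return {
        "DBInstanceIdentifier": instance_id,
        "MonitoringInterval": interval,
        "MonitoringRoleArn": monitoring_role_arn,
        "ApplyImmediately": True,
    }


def kms_key_id_from_arn(key_arn):
    """Return the key ID, which is the last part of a KMS key ARN (step 2)."""
    resource = key_arn.rsplit(":", 1)[-1]  # for example "key/1234abcd-..."
    prefix, _, key_id = resource.partition("/")
    if prefix != "key" or not key_id:
        raise ValueError(f"not a KMS key ARN: {key_arn}")
    return key_id


# Placeholder values, for illustration only.
params = enhanced_monitoring_params(
    "my-db-instance",
    "arn:aws:iam::123456789012:role/rds-monitoring-role",
)
key_id = kms_key_id_from_arn(
    "arn:aws:kms:us-east-1:123456789012:key/1234abcd-12ab-34cd-56ef-1234567890ab"
)
```

With boto3 installed and credentials configured, the dictionary could be passed as boto3.client("rds").modify_db_instance(**params). The default of 15 seconds mirrors the granularity recommended in step 1.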

Once completed, the Lambda function will send all the log lines from RDSOSMetrics to New Relic's ingest services.
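
To illustrate what a subscribed function receives, the sketch below decodes a CloudWatch Logs subscription event the way any Lambda subscriber could. It is not the source code of newrelic-log-ingestion; the function name decode_log_events and the sample payload are invented for this example. It relies on the CloudWatch Logs delivery format, in which event["awslogs"]["data"] holds a Base64-encoded, gzip-compressed JSON document containing a logEvents list.

```python
import base64
import gzip
import json


def decode_log_events(event):
    """Return (log_group, messages) from a CloudWatch Logs subscription event."""
    compressed = base64.b64decode(event["awslogs"]["data"])
    payload = json.loads(gzip.decompress(compressed))
    # Each RDSOSMetrics message is itself a JSON document of OS metrics.
    messages = [json.loads(e["message"]) for e in payload["logEvents"]]
    return payload["logGroup"], messages


# Build a fake event so the sketch runs without AWS (all values are invented).
sample_payload = {
    "logGroup": "RDSOSMetrics",
    "logStream": "db-EXAMPLE",
    "logEvents": [
        {
            "id": "1",
            "timestamp": 0,
            "message": json.dumps({"engine": "MYSQL", "numVCPUs": 2}),
        }
    ],
}
encoded = base64.b64encode(gzip.compress(json.dumps(sample_payload).encode()))
sample_event = {"awslogs": {"data": encoded.decode()}}

log_group, messages = decode_log_events(sample_event)
print(log_group, messages[0]["engine"])
```

Selecting JSON as the log format in step 7 matters here: each message can then be parsed as a structured document rather than treated as an opaque string.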

Configuration and polling

You can change the polling frequency and filter data using configuration options.

Default polling information for the AWS RDS Enhanced Monitoring integration:

  • New Relic polling interval:
    • 30 seconds on average (collected via CloudWatch Logs)
    • Configurable when setting up AWS Lambda
  • Amazon CloudWatch data interval: 1 minute

Find and use data

To find your integration data in Infrastructure, go to infrastructure.newrelic.com > Integrations > Amazon Web Services and select the RDS > Enhanced monitoring dashboard link.

In New Relic Insights, data is attached to the DatastoreSample event type, with a provider value of RdsDbInstance.

For more on how to use your data, see Understand and use integration data.
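
As an illustration, the event type and provider value above are enough to write a basic NRQL query. The sketch below only builds the query string. The helper name rds_enhanced_query is hypothetical, and no attribute names other than provider are assumed.

```python
def rds_enhanced_query(limit=10, since="30 minutes ago"):
    """Build an NRQL query string for RDS samples in DatastoreSample."""
    return (
        "SELECT * FROM DatastoreSample "
        "WHERE provider = 'RdsDbInstance' "
        f"SINCE {since} LIMIT {int(limit)}"
    )


query = rds_enhanced_query()
print(query)
```

The resulting string can be pasted into the Insights query bar. Replacing SELECT * with specific attributes requires checking which attribute names your account actually reports.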

Metric data

New Relic collects the following enhanced RDS data:

Metric data for all DB engines (except MS SQL Server)

Group Metrics Description

General

engine

The database engine for the DB instance.

instanceID

The DB instance identifier.

instanceResourceID

A region-unique, immutable identifier for the DB instance, also used as the log stream identifier.

numVCPUs

The number of virtual CPUs for the DB instance.

timestamp

The time at which the metrics were taken.

uptime

The amount of time that the DB instance has been active.

version

The version of the OS metrics' stream JSON format.

cpuUtilization

guest

The percentage of CPU in use by guest programs.

idle

The percentage of CPU that is idle.

irq

The percentage of CPU in use by software interrupts.

nice

The percentage of CPU in use by programs running at lowest priority.

steal

The percentage of CPU in use by other virtual machines.

system

The percentage of CPU in use by the kernel.

total

The total percentage of the CPU in use. This value excludes the nice value.

user

The percentage of CPU in use by user programs.

wait

The percentage of CPU unused while waiting for I/O access.

diskIO (not available for Amazon Aurora)

avgQueueLen

The number of requests waiting in the I/O device's queue.

avgReqSz

The average request size, in kilobytes.

await

The number of milliseconds required to respond to requests, including queue time and service time.

device

The identifier of the disk device in use.

readIOsPS

The number of read operations per second.

readKb

The total number of kilobytes read.

readKbPS

The number of kilobytes read per second.

rrqmPS

The number of merged read requests queued per second.

tps

The number of I/O transactions per second.

util

The percentage of CPU time during which requests were issued.

writeIOsPS

The number of write operations per second.

writeKb

The total number of kilobytes written.

writeKbPS

The number of kilobytes written per second.

wrqmPS

The number of merged write requests queued per second.

fileSys

maxFiles

The maximum number of files that can be created for the file system.

total

The total amount of disk space available for the file system, in kilobytes.

used

The amount of disk space used by files in the file system, in kilobytes.

usedFilePercent

The percentage of available files in use.

usedFiles

The number of files in the file system.

usedPercent

The percentage of the file-system disk space in use.

loadAverageMinute

fifteen

The number of processes requesting CPU time over the last 15 minutes.

five

The number of processes requesting CPU time over the last 5 minutes.

one

The number of processes requesting CPU time over the last minute.

memory

active

The amount of assigned memory, in kilobytes.

buffers

The amount of memory used for buffering I/O requests prior to writing to the storage device, in kilobytes.

cached

The amount of memory used for caching file system–based I/O.

dirty

The amount of memory pages in RAM that have been modified but not written to their related data block in storage, in kilobytes.

free

The amount of unassigned memory, in kilobytes.

hugePagesFree

The number of free huge pages. Huge pages are a feature of the Linux kernel.

hugePagesRsvd

The number of committed huge pages.

hugePagesSize

The size for each huge pages unit, in kilobytes.

hugePagesSurp

The number of available surplus huge pages over the total.

hugePagesTotal

The total number of huge pages for the system.

inactive

The amount of least-frequently used memory pages, in kilobytes.

mapped

The total amount of file-system contents that is memory mapped inside a process address space, in kilobytes.

pageTables

The amount of memory used by page tables, in kilobytes.

slab

The amount of reusable kernel data structures, in kilobytes.

total

The total amount of memory, in kilobytes.

writeback

The amount of dirty pages in RAM that are still being written to the backing storage, in kilobytes.

network

rx

The number of bytes received per second.

tx

The number of bytes uploaded per second.

process

cpuUsedPc

The percentage of CPU used by the process.

rss

The amount of RAM allocated to the process, in kilobytes.

memoryUsedPc

The percentage of memory used by the process.

processName

The name of the process.

swap

cached

The amount of swap memory, in kilobytes, used as cache memory.

free

The total amount of swap memory free, in kilobytes.

total

The total amount of swap memory available, in kilobytes.

tasks

blocked

The number of tasks that are blocked.

running

The number of tasks that are running.

sleeping

The number of tasks that are sleeping.

stopped

The number of tasks that are stopped.

total

The total number of tasks.

zombie

The number of child tasks that are inactive with an active parent task.
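
As a sketch of how these fields might be combined, the code below derives two common indicators from a single sample: memory in use as a percentage, and the one-minute load average per virtual CPU. The sample values are invented, the function name summarize_sample is hypothetical, and the nesting (each group as a JSON object keyed by metric name) is an assumption based on the table above.

```python
def summarize_sample(sample):
    """Derive memory-used percentage and load per vCPU from one OS metrics sample."""
    memory = sample["memory"]
    # "free" is unassigned memory, so buffers and cache count as used here.
    used_kb = memory["total"] - memory["free"]
    load_one = sample["loadAverageMinute"]["one"]
    return {
        "instanceID": sample["instanceID"],
        "cpuTotalPercent": sample["cpuUtilization"]["total"],
        "memoryUsedPercent": round(100.0 * used_kb / memory["total"], 2),
        "loadPerVCPU": round(load_one / sample["numVCPUs"], 2),
    }


# Invented values, for illustration only.
sample = {
    "instanceID": "my-db-instance",
    "numVCPUs": 2,
    "cpuUtilization": {"total": 12.5, "idle": 87.5},
    "memory": {"total": 4000000, "free": 1000000},
    "loadAverageMinute": {"one": 0.5, "five": 0.4, "fifteen": 0.3},
}

summary = summarize_sample(sample)
print(summary)
```

Dividing the load average by numVCPUs makes instances of different sizes comparable: a value near or above 1 suggests that processes are queuing for CPU time.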

Metric data for MS SQL

Group Metrics Description

disks

totalKb

The total space of the disk, in kilobytes.

usedKb

The amount of space used on the disk, in kilobytes.

usedPc

The percentage of space used on the disk.

availKb

The space available on the disk, in kilobytes.

availPc

The percentage of space available on the disk.

rdCountPS

The number of read operations per second.

rdBytesPS

The number of bytes read per second.

wrCountPS

The number of write operations per second.

wrBytesPS

The number of bytes written per second.

memory

commitTotKb

The amount of pagefile-backed virtual address space in use, that is, the current commit charge. This value is composed of main memory (RAM) and disk (pagefiles).

commitLimitKb

The maximum possible value for the commitTotKb metric. This value is the sum of the current pagefile size plus the physical memory available for pageable contents, excluding RAM that is assigned to non-pageable areas.

commitPeakKb

The largest value of the commitTotKb metric since the operating system was last started.

kernTotKb

The sum of the memory in the paged and non-paged kernel pools, in kilobytes.

kernPagedKb

The amount of memory in the paged kernel pool, in kilobytes.

kernNonpagedKb

The amount of memory in the non-paged kernel pool, in kilobytes.

pageSize

The size of a page, in bytes.

physTotKb

The amount of physical memory, in kilobytes.

physAvailKb

The amount of available physical memory, in kilobytes.

sqlServerTotKb

The amount of memory committed to Microsoft SQL Server, in kilobytes.

sysCacheKb

The amount of system cache memory, in kilobytes.

network

rdBytesPS

The number of bytes received per second.

wrBytesPS

The number of bytes sent per second.

process

cpuUsedPc

The percentage of CPU used by the process.

memUsedPc

The percentage of total memory used by the process.

name

The name of the process.

workingSetKb

The amount of memory in the private working set plus the amount of memory that is in use by the process and can be shared with other processes, in kilobytes.

workingSetPrivKb

The amount of memory that is in use by a process, but can't be shared with other processes, in kilobytes.

workingSetShareableKb

The amount of memory that is in use by a process and can be shared with other processes, in kilobytes.

virtKb

The amount of virtual address space the process is using, in kilobytes. Use of virtual address space does not necessarily imply corresponding use of either disk or main memory pages.

system

handles

The number of handles that the system is using.

processes

The number of processes running on the system.

threads

The number of threads running on the system.
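
The disk metrics above lend themselves to a simple capacity check. The following sketch flags disks whose usedPc is at or above a threshold. It assumes that disks arrives as a list of per-disk objects carrying the fields in the table; that shape, the function name disks_over_threshold, and the sample values are assumptions made for this example.

```python
def disks_over_threshold(sample, threshold_pc=80.0):
    """Return the entries of sample["disks"] whose usedPc is at or above threshold_pc."""
    return [d for d in sample.get("disks", []) if d["usedPc"] >= threshold_pc]


# Invented values, for illustration only.
mssql_sample = {
    "disks": [
        {"totalKb": 1000, "usedKb": 900, "usedPc": 90.0, "availKb": 100, "availPc": 10.0},
        {"totalKb": 1000, "usedKb": 200, "usedPc": 20.0, "availKb": 800, "availPc": 80.0},
    ]
}

full_disks = disks_over_threshold(mssql_sample)
print(len(full_disks))
```

The 80 percent default is arbitrary. In practice the threshold would match whatever alert condition you define on the integration data.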

Definitions

Term Description
Event type DatastoreSample
Provider RdsDbInstance

Processes

Enhanced Monitoring allows you to monitor the following processes associated with your RDS instances:

  • RDS Process: Shows a summary of the resources used by the RDS management agent, diagnostics monitoring processes, and other AWS processes that are required to support RDS DB instances.
  • RDS Child Process: Nested under RDS Processes, shows a summary of the RDS processes that support the DB instance, for example, aurora for Amazon Aurora DB clusters and mysqld for MySQL DB instances.
  • OS Processes: Shows a summary of the kernel and system processes, which generally have minimal impact on performance.

For more help

Recommendations for learning more: