VMware vSphere monitoring integration

New Relic's VMware vSphere integration helps you understand the health and performance of your vSphere environment. You can:

  • Query data to get insights into the performance of your hypervisors, virtual machines, and more.
  • Go from high level views down to the most granular data.
Sample dashboard: vSphere data visualized in a New Relic One dashboard, including operating systems, status, average CPU and memory consumption, and more.

Our integration uses the vSphere API to collect metrics and events generated by all vSphere components, and forwards the data to our platform via the infrastructure agent.

Why it matters

With our vSphere integration you can:

  • Instrument and monitor multiple vSphere instances using the same account.

  • Collect data on snapshots, VMs, hosts, resource pools, clusters, and datastores, including tags.

  • Monitor the health of your hypervisors and VMs using our charts and dashboards.

  • Use the retrieved data to monitor key performance and capacity-scaling indicators.

  • Set alerts based on any metrics collected from vCenter.

  • Create workloads to group resources and focus on key data (see Workloads in New Relic).

    You can create workloads in New Relic using data collected via the vSphere integration.

Compatibility and requirements

Our integration is compatible with VMware vSphere 6.5 or higher.

Before installing the integration, make sure that you meet the following requirements:

Install and activate

To install the vSphere integration, choose your setup:

Linux installation
  1. Follow the instructions for installing an integration, using the file name nri-vsphere.
  2. Change the directory to the integrations folder:
    cd /etc/newrelic-infra/integrations.d
  3. Create a copy of the sample configuration file:
    sudo cp vsphere-config.yml.sample vsphere-config.yml
  4. Edit the vsphere-config.yml file as described in the configuration settings.
  5. Restart the infrastructure agent.
Windows installation
  1. Download the nri-vsphere MSI installer image from:

    https://download.newrelic.com/infrastructure_agent/windows/integrations/nri-vsphere/nri-vsphere-amd64.msi

  2. To install from the Windows command prompt, run:
    msiexec.exe /qn /i PATH\TO\nri-vsphere-amd64.msi
  3. In the Integrations directory, C:\Program Files\New Relic\newrelic-infra\integrations.d\, create a copy of the sample configuration file by running:

    copy vsphere-config.yml.sample vsphere-config.yml
  4. Edit the vsphere-config.yml file as described in the configuration settings.
  5. Restart the infrastructure agent.
Tarball installation (advanced)

You can also install the integration from a tarball file. This gives you full control over the installation and configuration process.

Configure the integration

An integration's YAML-format configuration is where you can place required login credentials and configure how data is collected. Which options you change depends on your setup and preference.

To configure the vSphere integration, you must define the URL of the vSphere API endpoint, and your vSphere username and password. For configuration examples, see the sample configuration files.

Some features of the vSphere integration are optional and can be enabled via configuration settings.

With secrets management, you can configure on-host integrations with New Relic Infrastructure's agent to use sensitive data (such as passwords) without writing it as plain text in the integration's configuration file. For more information, see Secrets management.

To collect vSphere performance metrics, use the ENABLE_VSPHERE_PERF_METRICS environment variable.

Data is collected according to the settings in the vsphere-performance.metrics configuration file. You can override the location of the performance metrics config file using the PERF_METRIC_FILE environment variable. Note that the integration follows VMware's data collection levels (1 to 4).

When ENABLE_VSPHERE_PERF_METRICS is set, all level 1 metrics are collected. You can change the data collection level of the performance metrics using PERF_LEVEL. You can also comment out any metric in the config file and add new ones if needed.

Collecting performance data can increase the load on vCenter and the time needed to collect data. We recommend including only the metrics you need in the configuration file.

To fine-tune data collection, you can change the number of entities and metrics retrieved per request using BATCH_SIZE_PERF_ENTITIES and BATCH_SIZE_PERF_METRICS.
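As a rough illustration of what a batch size controls, the Python sketch below splits a list of entities into request-sized groups. The chunked helper and the sample values are hypothetical and only stand in for BATCH_SIZE_PERF_ENTITIES; the integration's actual batching logic may differ.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunked(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of `items` holding at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# Hypothetical values: a small batch size and a made-up VM inventory.
batch_size_perf_entities = 2
vms = ["vm-1", "vm-2", "vm-3", "vm-4", "vm-5"]

batches = list(chunked(vms, batch_size_perf_entities))
print(batches)
```

Smaller batches mean more requests with less data each; larger batches mean fewer, heavier requests.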

All metrics collected are included in the corresponding sample with the perf. prefix attached to the name. For example, net.packetsRx.summation is collected and sent as perf.net.packetsRx.summation.
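The naming rule above can be expressed in a few lines of Python. This is only a sketch of the rule as described; the to_sample_attribute function is a hypothetical name, not part of the integration.

```python
PERF_PREFIX = "perf."


def to_sample_attribute(metric_name: str) -> str:
    """Return the attribute name under which a performance metric is reported."""
    return f"{PERF_PREFIX}{metric_name}"


print(to_sample_attribute("net.packetsRx.summation"))
# prints: perf.net.packetsRx.summation
```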

For more information on vSphere performance metrics, see the VMware documentation.

To collect vSphere events, use the ENABLE_VSPHERE_EVENTS environment variable.

The integration collects events between the current time and the last fetched event for each datacenter. It stores the information regarding the last fetched event in a cache that is updated after each execution. Events are only available if the integration is connected to a vCenter and not directly to an ESXi host.

The number of events collected per request can be tuned by modifying EVENTS_PAGE_SIZE, which is set to 100 by default.
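The collection window and paging described above can be sketched in Python as follows. This is a simplified model, not the integration's code: the Event tuple, the in-memory cache dictionary, and the collect_new_events function are all assumptions made for illustration.

```python
from datetime import datetime, timezone
from typing import Dict, List, Tuple

EVENTS_PAGE_SIZE = 100  # default value stated above

Event = Tuple[datetime, str]  # (timestamp, summary): a simplified stand-in


def collect_new_events(
    all_events: List[Event],
    cache: Dict[str, datetime],
    datacenter: str,
    now: datetime,
    page_size: int = EVENTS_PAGE_SIZE,
) -> List[Event]:
    """Return events newer than the cached timestamp, read page by page."""
    oldest = datetime.min.replace(tzinfo=timezone.utc)
    last_seen = cache.get(datacenter, oldest)
    pending = sorted(e for e in all_events if last_seen < e[0] <= now)
    collected: List[Event] = []
    for start in range(0, len(pending), page_size):
        # Each slice models one request of at most `page_size` events.
        collected.extend(pending[start:start + page_size])
    if collected:
        # Remember the newest event so the next run starts after it.
        cache[datacenter] = collected[-1][0]
    return collected


def at(minute: int) -> datetime:
    return datetime(2020, 7, 14, 9, minute, tzinfo=timezone.utc)


cache: Dict[str, datetime] = {}
events = [(at(1), "user logged in"), (at(3), "user logged out")]
first = collect_new_events(events, cache, "Prod Datacenter", now=at(5))
second = collect_new_events(events, cache, "Prod Datacenter", now=at(6))
print(len(first), len(second))
```

The second run returns nothing because the cache already points at the newest event for that datacenter.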

Events are available on the Events page and can be queried via NRQL as InfrastructureEvent under vSphereEvent. Here is an example of vSphere event data:

"summary": "User dcui@127.0.0.1 logged out (login time: Tuesday, 14 July, 2020 08:32:09 AM, number of API invocations: 0, user agent: VMware-client/6.5.0)",
"vSphereEvent.computeResource": "cluster1",
"vSphereEvent.datacenter": "Prod Datacenter",
"vSphereEvent.date": "Tue, 14 Jul 2020 09:03:51 UTC",
"vSphereEvent.host": "192.168.0.230",
"vSphereEvent.userName": "dcui"

To collect snapshot data, use the ENABLE_VSPHERE_SNAPSHOTS environment variable.

Snapshot data can be found in VSphereSnapshotVmSample. Collected data covers total and unique space occupied by disk and memory files, snapshot tree, and creation time.

You can use this information to create NRQL queries, dashboards, and alerts, since it's linked to the corresponding virtual machine entity.

To collect vSphere tags, use the ENABLE_VSPHERE_TAGS environment variable.

Tags are available as attributes in the corresponding entity sample as label.tagCategory:tagName.

If two tags of the same category are assigned to a resource, they are combined into a single attribute, separated by a pipe character. For example: label.tagCategory:tagName|tagName2.
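The grouping rule can be sketched in Python as shown below. The sketch assumes that label.tagCategory is the attribute key and the tag name is its value; the tags_to_attributes function and the sample tags are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def tags_to_attributes(tags: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Group (category, name) pairs into label.<category> attributes joined by '|'."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for category, name in tags:
        grouped[category].append(name)
    return {f"label.{category}": "|".join(names) for category, names in grouped.items()}


attributes = tags_to_attributes([("region", "us"), ("region", "eu"), ("env", "prod")])
print(attributes)
# prints: {'label.region': 'us|eu', 'label.env': 'prod'}
```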

Tags can be used to run NRQL queries, filter entities in the entity explorer, and to create dashboards and alerts.

Resource filtering allows you to specify which resources you want to monitor by declaring a set of tags that resources must have in order to be monitored.

Resources require a match on any (one or more) of the filter tags in order to be included. If none of the resource tags match any of the filter tags, no information about that resource is sent to New Relic.

To filter resources by tag, you must enable the ENABLE_VSPHERE_TAGS environment variable.

A tag filter expression is a space-separated list of pairs of strings with the format category=name.

For example, to retrieve only resources whose region tag is us or eu, use the filter expression region=us region=eu:

INCLUDE_TAGS: >
  region=us
  region=eu

To enable resource filtering by tag, edit your integration configuration file and add the option INCLUDE_TAGS with the filter expression you want.

Note that datacenter resources acting as the root of the resource tree MUST have tags attached AND match the filter expression in order for other child resources to be fetched.
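The matching rule for filter expressions can be sketched in Python as follows. It models only the "match any filter tag" behavior described above, not the datacenter root requirement; parse_filter and is_included are hypothetical helpers, not functions provided by the integration.

```python
from typing import List, Set, Tuple

Tag = Tuple[str, str]  # (category, name)


def parse_filter(expression: str) -> Set[Tag]:
    """Parse a whitespace-separated list of category=name pairs."""
    pairs: Set[Tag] = set()
    for token in expression.split():
        category, sep, name = token.partition("=")
        if not sep or not category or not name:
            raise ValueError(f"invalid filter pair: {token!r}")
        pairs.add((category, name))
    return pairs


def is_included(resource_tags: List[Tag], filter_tags: Set[Tag]) -> bool:
    """Keep a resource when at least one of its tags matches a filter tag."""
    return any(tag in filter_tags for tag in resource_tags)


filters = parse_filter("region=us region=eu")
print(is_included([("region", "us"), ("env", "prod")], filters))  # True
print(is_included([("region", "asia")], filters))                 # False
print(is_included([], filters))                                   # False
```

Because the parser splits on any whitespace, the same expression written across several lines in a YAML folded block (>) parses to the same set of pairs.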

If you connect the integration directly to the ESXi host, vCenter data is not available (for example, events, tags, or datacenter metadata).

Here are examples of the vSphere integration configuration, including performance metrics:

For more information, see our documentation about the general structure of on-host integration configurations.

The configuration option inventory_source is not compatible with this integration.

Update your integration

On-host integrations do not automatically update.

For best results, regularly update the integration package and the Infrastructure agent.

View and use data

Data from this service is reported to an integration dashboard. You can query this data for troubleshooting purposes or to create charts and dashboards.

vSphere data is attached to these event types:

  • VSphereHostSample
  • VSphereClusterSample
  • VSphereVmSample
  • VSphereDatastoreSample
  • VSphereDatacenterSample
  • VSphereResourcePoolSample
  • VSphereSnapshotVmSample

Performance data is enabled and configured separately (see Enable and configure performance metrics).

For more on how to view and use your data, see Understand integration data.

Metric data

The vSphere integration provides the following metric data:

VSphereHostSample:

  • mem.size: Physical memory size (Mebibytes)
  • mem.usage: Used physical memory (Mebibytes)
  • mem.free: Free physical memory (Mebibytes)
  • cpu.cores: Number of CPU cores
  • cpu.threads: Number of physical CPU threads
  • cpu.coreMHz: Speed of the CPU cores (Megahertz)
  • cpu.totalMHz: Total MHz for all the individual cores (Megahertz)
  • cpu.percent: Aggregated % of CPU usage across all cores
  • cpu.overallUsage: Aggregated CPU usage across all cores (Megahertz)
  • cpu.available: Aggregated available CPU across all cores (Megahertz)
  • disk.totalMiB: Disk usage (Mebibytes)
  • vmCount: Number of VMs on this host
  • clusterName: Cluster name
  • overallStatus: Overall status
  • datacenterName: Datacenter name
  • resourcePoolNameList: Resource pool name list
  • datastoreNameList: Datastore name list
  • datacenterLocation: Datacenter location
  • hypervisorHostname: Hypervisor hostname
  • inQuarantineMode: In quarantine mode
  • bootTime: Boot time
  • connectionState: Connection state
  • inMaintenanceMode: In maintenance mode
  • powerState: Power state
  • standbyMode: Standby mode
  • cryptoState: Crypto state
  • networkNameList: Network name list
  • uuid: Unique identifier

VSphereVmSample:

  • mem.size: Memory size (Mebibytes)
  • mem.usage: Used memory (Mebibytes)
  • mem.free: Free memory (Mebibytes)
  • mem.hostUsage: Host memory usage (Megabytes)
  • mem.balloned: Size of the balloon driver (Megabytes)
  • mem.swapped: Amount of memory granted from the host's swap space (Megabytes)
  • mem.swappedSsd: Amount of memory swapped to a fast disk device, such as an SSD (Megabytes)
  • cpu.cores: Number of CPU cores
  • cpu.overallUsage: Aggregated CPU usage across all cores (Megahertz)
  • cpu.allocationLimit: Maximum CPU available to the VM, or -1 if there is no limit (Megahertz)
  • cpu.hostUsagePercent: CPU usage (Percent)
  • disk.totalUncommittedMiB: Uncommitted disk capacity (Mebibytes)
  • disk.totalMiB: Disk size (Mebibytes)
  • disk.totalUnsharedMiB: Unshared disk capacity (Mebibytes)
  • disk.suspendMemory: Size of the snapshot file (bytes)
  • disk.suspendMemoryUnique: Size of the snapshot file, unique blocks (bytes)
  • overallStatus: Overall status
  • datacenterLocation: Datacenter location
  • clusterName: Cluster name
  • datacenterName: Datacenter name
  • hypervisorHostname: Hypervisor hostname
  • resourcePoolName: Resource pool name
  • datastoreNameList: Datastore name list
  • vmHostname: VM hostname
  • vmConfigName: VM config name
  • instanceUuid: Instance UUID
  • networkNameList: Network name list
  • operatingSystem: Operating system
  • guestFullName: Guest full name
  • ipAddress: IP address
  • connectionState: Connection state
  • powerState: Power state

VSphereDatastoreSample:

  • capacity: Storage size (Gibibytes)
  • freeSpace: Available space (Gibibytes)
  • hostCount: Number of hosts using this datastore
  • uncommitted: Uncommitted space (Gibibytes)
  • vmCount: Number of VMs using this datastore
  • datacenterLocation: Datacenter location
  • datacenterName: Datacenter name
  • name: Name of the datastore
  • fileSystemType: File system type
  • overallStatus: Overall status
  • accessible: Accessible
  • url: URL of the datastore
  • nas.remoteHost: NAS remote host
  • nas.remotePath: NAS remote path

VSphereDatacenterSample:

  • cpu.overallUsagePercentage: CPU usage (Percent)
  • mem.usagePercentage: Memory usage (Percent)
  • mem.size: Memory size (Mebibytes)
  • mem.usage: Used memory (Mebibytes)
  • cpu.cores: Number of CPU cores
  • cpu.overallUsage: Aggregated CPU usage across all cores (Megahertz)
  • cpu.totalMHz: Aggregated CPU resources of all hosts (Megahertz)
  • datastore.totalGiB: Capacity (Gibibytes)
  • datastore.totalFreeGiB: Available space (Gibibytes)
  • datastore.totalUsedGiB: Used space (Gibibytes)
  • datastores: Number of datastores
  • hostCount: Number of hosts in this datacenter
  • vmCount: Number of VMs in this datacenter
  • networks: Number of networks in this datacenter
  • resourcePools: Number of resource pools in this datacenter
  • clusters: Number of clusters in this datacenter
  • overallStatus: Overall status

VSphereResourcePoolSample:

  • mem.size: Size of the memory (Mebibytes)
  • mem.usage: Used memory (Mebibytes)
  • mem.free: Free memory (Mebibytes)
  • mem.ballooned: Ballooned memory (Mebibytes)
  • mem.swapped: Amount of memory dedicated to swap (Mebibytes)
  • cpu.overallUsage: Aggregated CPU usage (Megahertz)
  • cpu.totalMHz: Aggregated CPU capacity (Megahertz)
  • vmCount: Number of VMs in this resource pool
  • resourcePoolName: Name of the resource pool
  • datacenterLocation: Location of the datacenter
  • datacenterName: Name of the datacenter
  • clusterName: Name of the cluster
  • overallStatus: Overall status

VSphereClusterSample:

  • cpu.core: Number of CPU cores
  • cpu.threads: Aggregated number of CPU threads
  • cpu.totalEffectiveMHz: Effective CPU resources available to run virtual machines (Megahertz)
  • cpu.totalMHz: Aggregated CPU resources of all hosts (Megahertz)
  • mem.size: Aggregated memory resources of all hosts (Megabytes)
  • mem.effectiveSize: Effective memory resources available to run virtual machines (Megabytes)
  • effectiveHosts: Number of effective hosts
  • hosts: Number of hosts
  • drsConfig.vmotionRate: Threshold for generated VcClusterRecommendations
  • dasConfig.restartPriorityTimeout: Maximum time the lower priority VMs should wait for the higher priority VMs to be ready (Seconds)
  • datacenterLocation: Datacenter location
  • datacenterName: Datacenter name
  • networkList: Network list
  • hostList: Host list
  • datastoreList: Datastore list
  • overallStatus: Overall status
  • drsConfig.enabled: DRS config enabled
  • drsConfig.enableVmBehaviorOverrides: Enable overrides for DRS config
  • drsConfig.defaultVmBehavior: DRS config default VM behavior
  • dasConfig.enabled: DAS config enabled
  • dasConfig.admissionControlEnabled: DAS config admission control enabled
  • dasConfig.isolationResponse: DAS config isolation response
  • dasConfig.restartPriority: DAS config restart priority
  • dasConfig.hostMonitoring: DAS config host monitoring
  • dasConfig.vmMonitoring: DAS config VM monitoring
  • dasConfig.vmComponentProtecting: DAS config VM component protecting
  • dasConfig.hbDatastoreCandidatePolicy: DAS config heartbeat datastore candidate policy

VSphereSnapshotVmSample:

  • snapshotTreeInfo: Tree info for the snapshot. For example: Cluster:Vm:Snapshot1:Snapshot2
  • name: Snapshot name
  • creationTime: Snapshot creation time
  • powerState: The power state of the virtual machine when this snapshot was taken
  • snapshotId: The unique identifier that distinguishes this snapshot from other snapshots of the virtual machine
  • quiesced: Flag to indicate whether or not the snapshot was created with the "quiesce" option, ensuring a consistent state of the file system
  • backupManifest: The relative path from the snapshotDirectory pointing to the backup manifest. Available for certain quiesced snapshots only
  • description: Description of the snapshot
  • replaySupported: Flag to indicate whether this snapshot is associated with a recording session on the virtual machine that can be replayed
  • totalMemoryInDisk: Total size of memory in disk
  • totalUniqueMemoryInDisk: Total size of the file blocks that were allocated uniquely to store memory. If the underlying storage supports sharing of file blocks across disk files, this is the size of the blocks allocated only in the context of this file; it does not include shared blocks allocated in other files. This property is unset if the underlying implementation cannot compute this information.
  • totalDisk: Total size of snapshot files in disk
  • totalUniqueDisk: Total size of the file blocks that were allocated uniquely to store snapshot data in disk. If the underlying storage supports sharing of file blocks across disk files, this is the size of the blocks allocated only in the context of this file; it does not include shared blocks allocated in other files. This property is unset if the underlying implementation cannot compute this information.
  • datastorePathDisk: Disk file path in the datastore
  • datastorePathMemory: Memory file path in the datastore

Check the source code

This integration is open source software. That means you can browse its source code and send improvements, or create your own fork and build it.

For more help

If you need more help, check out these support and learning resources: