
NVML integration

Integrating NVML with New Relic gives you insight into the GPU utilization and performance of your applications and systems. Use this data to optimize resources, find performance bottlenecks, and keep your environment stable and efficient.

After setting up the NVML integration with New Relic, see your data in a dashboard right out of the box.

Set up the NVML integration

Complete the following steps to set up the NVML integration:

Install the infrastructure agent

To use the NVML integration, you first need to install the infrastructure agent on the same host. The infrastructure agent monitors the host itself, while the NVML integration extends your monitoring with data specific to your GPU clusters.

Use NRI-Flex to capture metrics

Flex comes bundled with the New Relic infrastructure agent. To capture NVML metrics, create a Flex configuration file for NRI-Flex. Follow these steps:

  1. Create a file named nvml-config.yml in the following directory:

    • For Linux: /etc/newrelic-infra/integrations.d
    • For Windows: C:\Program Files\New Relic\newrelic-infra\integrations.d\
  2. Add the following snippet to nvml-config.yml, replacing <PATH_TO_METRIC_CSV_FILE> with the path to your metrics CSV file:

    integrations:
      - name: nri-flex
        # interval: 30s
        config:
          name: NVMLexample
          apis:
            - name: nvml
              file: <PATH_TO_METRIC_CSV_FILE>
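The configuration above points Flex at a CSV file but does not say how that file is produced. As a minimal sketch, assuming the CSV comes from nvidia-smi's `--query-gpu` output, the Python below rewrites the header row into underscore-style names such as `temperature_gpu`. The field selection, the function names, and the header normalization rule are illustrative assumptions, not part of the integration itself.

```python
import csv
import io
import re
import subprocess

# Fields passed to nvidia-smi's --query-gpu option (an illustrative selection).
QUERY_FIELDS = ["name", "temperature.gpu", "utilization.gpu", "memory.used"]


def normalize_header(header: str) -> str:
    """Drop a trailing unit such as ' [%]' and replace dots with underscores."""
    without_unit = re.sub(r"\s*\[.*?\]\s*$", "", header.strip())
    return without_unit.replace(".", "_")


def convert_nvidia_smi_csv(raw: str) -> str:
    """Return the CSV text with a normalized header row and trimmed cells."""
    reader = csv.reader(io.StringIO(raw), skipinitialspace=True)
    rows = [row for row in reader if row]
    if not rows:
        return ""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([normalize_header(h) for h in rows[0]])
    for row in rows[1:]:
        writer.writerow([cell.strip() for cell in row])
    return out.getvalue()


def write_metrics_csv(path: str) -> None:
    """Query nvidia-smi once and write the converted CSV to `path`.

    Requires nvidia-smi on the PATH; not called in the demo below.
    """
    raw = subprocess.run(
        ["nvidia-smi", "--query-gpu=" + ",".join(QUERY_FIELDS), "--format=csv,nounits"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    with open(path, "w", newline="") as f:
        f.write(convert_nvidia_smi_csv(raw))


if __name__ == "__main__":
    # Hand-written sample in the shape of nvidia-smi CSV output (not real data).
    sample = "name, temperature.gpu, utilization.gpu [%]\nExample GPU, 41, 7\n"
    print(convert_nvidia_smi_csv(sample), end="")
```

To refresh the file on a schedule, you could call `write_metrics_csv` from a cron job or scheduled task, using the same path that you set in the `file` field of nvml-config.yml.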

Restart the infrastructure agent

Use the instructions in our infrastructure agent docs to restart your infrastructure agent. On most Linux hosts that use systemd, this command works:

    sudo systemctl restart newrelic-infra.service

View your NVML metrics in New Relic

Once you've completed the setup above, you can view your metrics using our pre-built dashboard template. To access this dashboard:

  1. Go to one.newrelic.com > + Add data.
  2. Click on the Dashboards tab.
  3. In the search box, type nvml.
  4. Select it and click Install.

To install the NVML quickstart and see metrics and alerts, you can also go to our NVML quickstart page and click the Install now button.

Here's an example query that charts the latest GPU temperature over time:

SELECT latest(temperature_gpu) FROM nvmlSample TIMESERIES
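If you want to run the same query from a script rather than the query builder, one option is New Relic's NerdGraph GraphQL API. The sketch below only builds the JSON request body. The helper name and the account ID are hypothetical, and the endpoint shown is assumed to be the US-region one.

```python
import json

# Assumed US-region NerdGraph endpoint; accounts in other regions use a different host.
NERDGRAPH_URL = "https://api.newrelic.com/graphql"

NVML_QUERY = "SELECT latest(temperature_gpu) FROM nvmlSample TIMESERIES"


def build_nrql_payload(account_id: int, nrql: str) -> str:
    """Return a JSON request body that asks NerdGraph to run one NRQL query."""
    # json.dumps(nrql) yields a double-quoted, escaped string literal
    # that can be embedded in the GraphQL document.
    graphql = "{ actor { account(id: %d) { nrql(query: %s) { results } } } }" % (
        account_id,
        json.dumps(nrql),
    )
    return json.dumps({"query": graphql})


if __name__ == "__main__":
    # 1234567 is a placeholder account ID, not a real account.
    print(build_nrql_payload(1234567, NVML_QUERY))
```

To send the request, POST this body to the endpoint with a `Content-Type: application/json` header and your user key in the `API-Key` header.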

What's next?

To learn more about building NRQL queries and generating dashboards, check out these docs:

  • Introduction to the query builder to create basic and advanced queries.
  • Introduction to dashboards to customize your dashboard and carry out different actions.
  • Manage your dashboard to adjust your display mode, or to add more content to your dashboard.
Copyright © 2024 New Relic Inc.
