Application Monitoring

NVIDIA SMI application monitoring

What is Nvidia SMI?

Nvidia SMI (nvidia-smi) is the NVIDIA System Management Interface, a command-line utility built on top of the NVIDIA Management Library (NVML). It can be used to monitor NVIDIA GPU devices. The nvidia-smi CLI utility targets Tesla, GRID, Quadro and Titan X products, though limited support is also available on other NVIDIA GPUs.

How to monitor Nvidia SMI?

ServicePilot makes it easy to monitor Nvidia GPUs, requiring only the installation of a ServicePilot Agent on the target server. A resource of the appmon-nvidia-smi package then needs to be added via the ServicePilot web interface.

The statistics gathered in this way include:

  • Fan speed
  • Memory usage
  • Power usage
  • Temperature
  • Encoder stats
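For readers who want to see where such statistics come from, here is a minimal sketch in Python that reads comparable values directly from the nvidia-smi CLI. It is not ServicePilot's implementation. The helper names `parse_gpu_csv` and `query_gpus` and the chosen field list are assumptions made for illustration. The field names themselves are standard `--query-gpu` properties, and `nvidia-smi --help-query-gpu` lists what a given driver version supports.

```python
import csv
import io
import subprocess

# Properties passed to `nvidia-smi --query-gpu`. This selection is an
# illustrative assumption; run `nvidia-smi --help-query-gpu` for the full list.
FIELDS = [
    "index",
    "name",
    "fan.speed",
    "memory.used",
    "memory.total",
    "power.draw",
    "temperature.gpu",
    "encoder.stats.sessionCount",
]


def parse_gpu_csv(text, fields=FIELDS):
    """Turn nvidia-smi CSV output (no header, no units) into a list of dicts.

    Values are kept as strings because unsupported metrics are reported
    as markers such as "[N/A]" rather than numbers.
    """
    rows = []
    for record in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not record:
            continue  # skip blank lines
        rows.append(dict(zip(fields, (value.strip() for value in record))))
    return rows


def query_gpus(fields=FIELDS):
    """Run nvidia-smi once and return one dict per GPU (hypothetical helper)."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(fields),
        "--format=csv,noheader,nounits",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_gpu_csv(result.stdout, fields)
```

On a host with the NVIDIA driver installed, calling `query_gpus()` returns one dictionary per GPU. The parser is kept separate from the subprocess call so it can be exercised on captured output without a GPU present.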

How to install an nvidia-smi resource?

  1. Use your ServicePilot OnPremise installation or a SaaS account.
  2. Add a new nvidia-smi resource via the web interface (/prmviews or /prmresources) or via API (/prmpackages page). The default ServicePilot agent or another agent will be provisioned automatically.

Details of the nvidia-smi package are available on the /prmpackages page of the software.

Benefits

ServicePilot enables you to deliver IT services faster and more securely with automated discovery and advanced monitoring features.

By correlating NVIDIA SMI data with APM and infrastructure monitoring, ServicePilot can provide a more comprehensive view of an organization's IT environment.

This allows IT teams to quickly identify and diagnose issues that may be impacting application performance, and take corrective action before end-users are affected.

Start with a free trial of our SaaS solution. Explore our plans or contact us to find what works best for you.

Free installation in a few clicks

SaaS Platform

Flexible deployment according to your needs (SaaS, hybrid, on-premise) to speed up supervision implementation.
  • No on-premise software setup, maintenance or configuration complexity
  • Instant setup, complete and pre-configured to ensure robust monitoring

OnPremise Platform

  • Contracts and commitments over time (> 1 year)
  • Performance, Data Storage and Infrastructure Management
  • 2 additional solutions: VoIP and Mainframe monitoring