Fault Management

 

One of the primary objectives of a CIO is to ensure non-stop IT service availability while improving IT staff efficiency and increasing customer satisfaction. Decreasing costly IT downtime is the best and most effective way to reach this goal.

Events are the key element of real-time IT infrastructure monitoring. As IT environments evolve and become increasingly complex, the number of events to manage multiplies: every status change or threshold overrun generates an event. Millions of alerts could be received on the event management console every day, making it humanly impossible for IT operations to analyze and act on all of them.
To compound the issue, event information may be disparate and isolated in silos by technology or application.
The main difficulty presented by this deluge of data is twofold: assessing the real impact of an event generated at the component level, and picking out the pertinent events from all those reported on a console. This lack of correlated and consolidated event information limits insight into how events affect IT services and decreases the IT staff’s efficiency.

Fault management requires a solution able to:

  • Exploit all sources of information
  • Correlate a multitude of events generated at the component level
  • Sort relevant events by assessing their impact on IT services
  • Solve incidents to decrease IT downtime

Why should you use ServicePilot ISM for fault management?

  • Automatic resource discovery (networks, servers, applications, etc.)
  • Multi-format event collection (traps, syslogs, logs, etc.)
  • Advanced event correlation
  • Alarm notification and automatic action triggering
  • Cost-effective solution

Multi-format event collection

To understand the real causes of a problem and quickly assess its consequences, it is important to handle multiple sources and formats of information. ServicePilot ISM supports a large number of agents and collectors in order to deliver information as precisely and accurately as possible, which is fundamental to any strategy for measuring service quality.

ServicePilot ISM is able to collect and analyze:

  • Results of Ping polling
  • Results of SNMP polling
  • Results of Custom Agent polling
  • Results of ServicePilot ISM application agents
  • SNMP traps received
  • Syslogs received
  • Logs

Event correlation

ServicePilot ISM includes an advanced correlation engine which is able to:

  • Receive disparate events from IT components
  • Filter the disparate events
  • Identify relationships between them
  • Translate them into meaningful information for efficient event management

This engine allows event synchronization, amalgamation of information sources and generation of external or internal actions, such as:

  • Auto-diagnosis (based on parameters) and corrective actions
  • Pre-diagnosis, notification and dispatching

Default correlation rules are provided for the implementation of basic behaviors. They can be customized according to needs expressed by the users.
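To make the idea of correlation concrete, the sketch below filters duplicate events and groups the remainder under a probable root cause by walking a dependency map. It does not describe the ServicePilot ISM engine or its rule syntax: the `DEPENDS_ON` map, the component names and the grouping rule are all assumptions, and the map is assumed to contain no cycles.

```python
from collections import defaultdict

# Hypothetical dependency map: component -> the component it depends on.
DEPENDS_ON = {"web-01": "switch-a", "db-01": "switch-a", "switch-a": "router-1"}

def root_cause(component, failed):
    """Walk up the dependency chain to the highest ancestor that also failed."""
    current = component
    while DEPENDS_ON.get(current) in failed:
        current = DEPENDS_ON[current]
    return current

def correlate(events):
    """events: list of (component, message) tuples. Returns {root: [events]}."""
    unique = list(dict.fromkeys(events))  # filter exact duplicates, keep order
    failed = {component for component, _ in unique}
    groups = defaultdict(list)
    for component, message in unique:
        groups[root_cause(component, failed)].append((component, message))
    return dict(groups)

raw = [
    ("web-01", "HTTP timeout"),
    ("db-01", "unreachable"),
    ("switch-a", "link down"),
    ("web-01", "HTTP timeout"),  # duplicate, filtered out
]
correlated = correlate(raw)
# All three distinct events end up grouped under "switch-a".
```

An operator then sees one incident on the switch rather than three unrelated alerts.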

Alarm notification and automatic action triggering

The alarm notification can take several forms in ServicePilot ISM. The primary notification is made through the graphical and mapping interfaces of the Web server.
A notification module allows the user to take action on a threshold overrun or on receipt of an alarm, a trap, a log, etc.

The properties of a ServicePilot ISM class incorporate five threshold levels. A different action may be triggered for each threshold overrun.
The "automatic actions" feature of ServicePilot ISM permits the configuration of different types of actions or alarms, such as writing to a file, sending an email, sending an HTML page report, etc.
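The relationship between five threshold levels and per-level actions can be sketched as follows. The threshold values, the indicator (CPU load in percent) and the action names are hypothetical and chosen only for illustration; they are not ServicePilot ISM defaults.

```python
from bisect import bisect_right

# Hypothetical thresholds for a CPU-load indicator (percent), in ascending order.
THRESHOLDS = [50, 65, 80, 90, 97]
# Hypothetical action names, one per threshold level.
ACTIONS = ["write_to_file", "send_email", "send_html_report", "notify_on_call", "open_ticket"]

def threshold_level(value):
    """Return how many thresholds the value has reached (0 to 5)."""
    return bisect_right(THRESHOLDS, value)

def action_for(value):
    """Return the action tied to the highest threshold reached, or None."""
    level = threshold_level(value)
    return ACTIONS[level - 1] if level else None

for load in (42, 85, 100):
    print(load, threshold_level(load), action_for(load))
```

Keeping thresholds and actions in parallel lists means a level can be retuned without touching the lookup logic.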

The triggering conditions of these actions (object alarm, received syslog…) can be adjusted with generic filters (time-based depreciation, depreciation based on the number of events, calendar-based rules, etc.).
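One possible reading of a filter based on the number of events is that an action fires only once enough events arrive within a given time window. The sketch below implements that interpretation; the `CountFilter` class and its parameters are assumptions for the example, not a ServicePilot ISM interface.

```python
from collections import deque

class CountFilter:
    """Let an action fire only once `min_events` arrive within `window` seconds."""

    def __init__(self, min_events, window):
        self.min_events = min_events
        self.window = window
        self.times = deque()

    def should_trigger(self, timestamp):
        self.times.append(timestamp)
        # Drop events that have fallen out of the time window.
        while timestamp - self.times[0] > self.window:
            self.times.popleft()
        return len(self.times) >= self.min_events

# Trigger only when 3 events occur within 60 seconds.
event_filter = CountFilter(min_events=3, window=60)
decisions = [event_filter.should_trigger(t) for t in (0, 10, 20, 200)]
```

Here the third event triggers the action, while the isolated event at 200 seconds does not.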

Criteria may be set to manage escalating events by:

  • Gradual degradation (threshold overrun)
  • Object or indicator status
  • Incident duration
  • Calendar

Following an analysis of events and/or the detection of incidents on the infrastructure, ServicePilot ISM can transmit enhanced information to a trouble ticket management solution. By default, this information is sent by e-mail.
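As a rough illustration of an e-mail hand-off to a ticketing tool, the sketch below builds a message from an incident record using Python's standard `email` package. The addresses, the incident fields and the subject format are hypothetical; the actual content sent by ServicePilot ISM is not described here, and the message is only built, not sent.

```python
from email.message import EmailMessage

def build_ticket_email(incident, sender, recipient):
    """Build (but do not send) an e-mail describing an incident."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    # Assumed subject format; a real ticketing tool may expect its own.
    message["Subject"] = f"[{incident['severity']}] {incident['component']}: {incident['summary']}"
    details = "\n".join(f"- {line}" for line in incident["related_events"])
    message.set_content(f"Related events:\n{details}\n")
    return message

# Hypothetical incident record produced by an earlier correlation step.
incident = {
    "severity": "critical",
    "component": "switch-a",
    "summary": "link down",
    "related_events": ["web-01: HTTP timeout", "db-01: unreachable"],
}
ticket = build_ticket_email(incident, "monitoring@example.com", "tickets@example.com")
print(ticket["Subject"])
```

Sending the message would additionally require an SMTP server, for example through the standard `smtplib` module.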

Reduced investment and operating costs

  • Simple, scalable and flexible licensing
    • Up to tens of thousands of managed events daily without any licensing disruption
    • Ability to handle metrics and performance management with the same tool, without any need for license add-ons or additional costs

  • Out-of-the-box value
    • Fast and easy to install, deploy, configure and use
    • Customer focused: fast and efficient responses from our support team

Integration example