Application observability

The pillars of observability

Observability is essential to guarantee the reliability, performance and resilience of applications and infrastructures. It enables teams to proactively diagnose and resolve application problems, by collecting and analyzing data from various sources, such as metrics, traces and logs.

In the context of modern application development, observability refers to the collection and analysis of this data to provide detailed information on the behavior of applications. It is essential to today’s dynamic architectures and multi-cloud computing environments, enabling software engineering, IT, DevOps and SRE teams to collaborate in making quick decisions based on telemetry data.

ServicePilot combines five types of monitoring to offer complete application observability:

Distributed traces

Distributed tracing tracks transactions and calls between microservices or application components, precisely measuring latency and dependencies. This helps identify bottlenecks or failing services.

Network flows

Observation of network flows on servers or hosts provides visibility of inter-application traffic, exchanged volumes, network response times, connectivity errors and suspicious behavior.

Real User Monitoring (RUM)

RUM collects data directly from real users’ browsers or mobile applications. This makes it possible to understand their experience (loading times, JavaScript errors, slowness…) in real conditions of use.

Synthetic monitoring

This type of monitoring simulates user paths using scripts or robots. It enables continuous testing of service availability, latency and functional behavior, independently of real traffic.

Application logs

Centralized analysis of logs (system, application, security…) helps diagnose incidents, enrich alerts or investigate abnormal behavior.

Each pillar covers a specific facet of the digital environment. By combining them, ServicePilot enables:

Intelligent data correlation
Proactive anomaly detection
Fast, relevant RCA (Root Cause Analysis)
User experience-oriented observability

Distributed traces

What is application tracing?

In dynamic architectures based on microservices and distributed components, understanding the complete path of a user request through the application can be a major challenge. Application traces allow you to track each request from the frontend to the backend, recording its path through all the services, APIs, databases, and hosts involved. They provide a detailed, chronological map of the exchanges between components.

In an environment composed of dozens or even hundreds of microservices, each user action (such as loading a page or validating a form) can trigger cascading calls between services. Static pages can be served by CDN, requests can depend on sub-requests in databases…

AppTrace technologies make it possible to monitor:

Client/server relationships (who calls whom)
How long each step of a request takes
Where slowdowns or errors occur in the execution path

You can view the complete chain of execution for a complex transaction and quickly isolate:

Slow or overloaded services
Faulty external dependencies
Timeouts or blocked calls
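
To make the idea of isolating a slow step concrete, here is a minimal sketch that is not tied to ServicePilot's own data model. The `Span` class, its field names and the sample services are all hypothetical. It computes the time each span spends in its own code (its duration minus that of its direct children) and reports the service with the largest share, assuming child calls run one after another rather than in parallel.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    """Hypothetical, simplified span: one timed step of a request."""
    span_id: str
    parent_id: Optional[str]  # None for the root span
    service: str
    duration_ms: float


def self_times(spans):
    """Return {span_id: time spent in the span itself, excluding direct children}."""
    child_total = {}
    for span in spans:
        if span.parent_id is not None:
            child_total[span.parent_id] = child_total.get(span.parent_id, 0.0) + span.duration_ms
    return {
        span.span_id: max(span.duration_ms - child_total.get(span.span_id, 0.0), 0.0)
        for span in spans
    }


def slowest_service(spans):
    """Return (service, self_time_ms) for the span with the largest self time."""
    own = self_times(spans)
    slowest = max(spans, key=lambda span: own[span.span_id])
    return slowest.service, own[slowest.span_id]


# Illustrative trace: frontend -> orders-api -> database
trace = [
    Span("a", None, "frontend", 120.0),
    Span("b", "a", "orders-api", 100.0),
    Span("c", "b", "database", 80.0),
]
print(slowest_service(trace))  # ('database', 80.0)
```

In this example the frontend and the API each account for 20 ms of their own time, so the database call is where most of the request is spent.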

Collecting APM traces and data

Application instrumentation is an essential step in enabling distributed tracing (APM) and collecting accurate traces from your applications. With ServicePilot, this can be fast, flexible and adaptable even in complex environments.

Configure the application to send traces, metrics and logs to a ServicePilot Agent using OTLP (OpenTelemetry Protocol) over http/protobuf, or the Datadog or Zipkin/HTTP APIs. See details by language and framework below.
Configure the ServicePilot Agent to listen for Application traces by setting up an Automatic provisioning rule under SETUP > Parameters > Auto-provisioning. Add or edit an Auto-provisioning rule with AppTrace Port activated and the correct listening ports specified.
Finally, create an APM Rule from Parameters to refine the application definitions and instrumentation details.

If an application is already instrumented with an Open Source tracing standard such as OpenTelemetry, Datadog or Zipkin, ServicePilot can integrate natively to collect APM traces from the existing instrumentation.

If the application is not yet instrumented:

Administrators may instrument their applications automatically with an OpenTelemetry Zero-code setup or by using the Datadog SDK automatic application instrumentation libraries.
Developers can use the OpenTelemetry API and SDK to instrument their applications to send metrics, traces and logs to a ServicePilot Agent. The data can be sent directly to a ServicePilot Agent or aggregated using an OpenTelemetry Collector and then exported to a ServicePilot Agent.

Instrumentation modes and languages

Four instrumentation modes are supported: OpenTelemetry Zero-Code, Datadog libraries, Zipkin, and the OpenTelemetry API and SDK. Each is detailed in turn below.

OpenTelemetry Zero-Code

Language	Documentation
.NET	.NET zero-code instrumentation
Go	Go zero-code instrumentation
Java	Java zero-code instrumentation
JavaScript	JavaScript zero-code instrumentation
PHP	PHP zero-code instrumentation
Python	Python zero-code instrumentation

In addition to any other configuration, the following environment variables are needed to integrate with ServicePilot (shown here for .NET):

OTEL_DOTNET_AUTO_INSTRUMENTATION_ENABLED="true"
OTEL_DOTNET_AUTO_TRACES_INSTRUMENTATION_ENABLED="true"
OTEL_DOTNET_AUTO_METRICS_INSTRUMENTATION_ENABLED="true"
OTEL_DOTNET_AUTO_LOGS_INSTRUMENTATION_ENABLED="false"
OTEL_TRACES_EXPORTER="otlp"
OTEL_METRICS_EXPORTER="otlp"
OTEL_METRIC_EXPORT_INTERVAL="60000"
OTEL_LOGS_EXPORTER="none"
OTEL_EXPORTER_OTLP_ENDPOINT="http://<ServicePilot Agent host IP address>:4318"
OTEL_INSTRUMENTATION_HTTP_SERVER_CAPTURE_REQUEST_HEADERS="Content-Length,Transfer-Encoding,Range,Request-Range,Connection,Host,Accept,x-up-devcap-post-charset,Cache-Control,Referer,X-Filename"
OTEL_EXPORTER_OTLP_PROTOCOL="http/protobuf"
OTEL_EXPORTER_OTLP_TIMEOUT="10000"
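
When an instrumented application is started from a script, these settings can be generated rather than typed by hand. The sketch below is one possible approach, not a ServicePilot tool: the helper names `servicepilot_otel_env` and `launch_env` are hypothetical, and the agent address `192.0.2.10` is a placeholder. Only the language-independent variables from the list above are included.

```python
import os


def servicepilot_otel_env(agent_host, port=4318):
    """Build the language-independent OTLP exporter variables listed above."""
    return {
        "OTEL_TRACES_EXPORTER": "otlp",
        "OTEL_METRICS_EXPORTER": "otlp",
        "OTEL_METRIC_EXPORT_INTERVAL": "60000",
        "OTEL_LOGS_EXPORTER": "none",
        "OTEL_EXPORTER_OTLP_ENDPOINT": f"http://{agent_host}:{port}",
        "OTEL_EXPORTER_OTLP_PROTOCOL": "http/protobuf",
        "OTEL_EXPORTER_OTLP_TIMEOUT": "10000",
    }


def launch_env(agent_host):
    """Return a copy of the current environment with the OTLP settings added."""
    env = dict(os.environ)
    env.update(servicepilot_otel_env(agent_host))
    return env


# Placeholder agent address; replace with the ServicePilot Agent host IP.
settings = servicepilot_otel_env("192.0.2.10")
for name, value in settings.items():
    print(f'{name}="{value}"')
```

The dictionary returned by `launch_env` can be passed as the `env` argument of `subprocess.run` when starting the instrumented application.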

Datadog libraries

Language	Documentation
Java	Tracing Java Applications
Python	Tracing Python Applications
Ruby	Tracing Ruby Applications
Node.js	Tracing Node.js Applications
.NET Core	Tracing .NET Core Applications
.NET Framework	Tracing .NET Framework Applications
PHP	Tracing PHP Applications

The following environment variables are needed to integrate the Datadog libraries with ServicePilot:

DD_AGENT_HOST="<ServicePilot Agent host IP address>"
DD_TRACE_CLIENT_IP_ENABLED="true"
DD_RUNTIME_METRICS_ENABLED="true"
DD_PROFILING_ENABLED="false"
DD_TRACE_HEADER_TAGS="Content-Length,Transfer-Encoding,Range,Request-Range,Connection,Host,Accept,x-up-devcap-post-charset,Cache-Control,Referer,X-Filename"

Zipkin

The Zipkin Tracers and Instrumentation page lists libraries that instrument application code to send traces to a ServicePilot Agent.

Language	Documentation
C#	Zipkin4net
Go	Zipkin Library for Go
Java	Brave
JavaScript	Zipkin JS
Ruby	ZipkinTracer: Zipkin client for Ruby
Scala	zipkin-finagle

Use the Zipkin/HTTP protocol to export data to a ServicePilot Agent on TCP port 9411.
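
For a quick test without a tracer library, a span can be built by hand in the Zipkin v2 JSON format. This is a sketch under assumptions: it assumes the agent accepts the standard Zipkin v2 path `/api/v2/spans`, and the helper names, the service name and the address `192.0.2.10` are illustrative. The request is only built here, not sent.

```python
import json
import secrets
import time
import urllib.request


def make_span(name, service, duration_us, trace_id=None, parent_id=None):
    """Build one span as a dict in Zipkin v2 JSON form (times in microseconds)."""
    span = {
        "traceId": trace_id or secrets.token_hex(16),  # 32 lowercase hex characters
        "id": secrets.token_hex(8),                    # 16 lowercase hex characters
        "name": name,
        "kind": "SERVER",
        "timestamp": int(time.time() * 1_000_000) - duration_us,
        "duration": duration_us,
        "localEndpoint": {"serviceName": service},
    }
    if parent_id:
        span["parentId"] = parent_id
    return span


def build_request(agent_host, spans, port=9411):
    """Prepare the HTTP POST carrying a JSON list of spans."""
    return urllib.request.Request(
        f"http://{agent_host}:{port}/api/v2/spans",  # assumed standard Zipkin v2 path
        data=json.dumps(spans).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


span = make_span("get /checkout", "shop-frontend", 25_000)
request = build_request("192.0.2.10", [span])
# To actually send it: urllib.request.urlopen(request, timeout=10)
```

In practice one of the tracer libraries listed above handles span creation, context propagation and batching; the hand-built span is only useful to check that the agent is listening.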

OpenTelemetry API and SDK

Developers can use OpenTelemetry instrumentation libraries to integrate traces, metrics and logs into their application.

Language	Documentation
C++	C++ code instrumentation
C#/.NET	.NET code instrumentation
Erlang/Elixir	Erlang/Elixir code instrumentation
Go	Go code instrumentation
Java	Java code instrumentation
JavaScript	JavaScript code instrumentation
PHP	PHP code instrumentation
Python	Python code instrumentation
Ruby	Ruby code instrumentation
Swift	Swift code instrumentation

Existing integrations for applications and frameworks are also available.

Instrumentation libraries hosted in OpenTelemetry repositories.
OpenTelemetry maintains a list of OpenTelemetry instrumentation integrations in its registry.

APM Trace data visualization

The collected data is available from several ServicePilot interfaces, offering both global dashboards and dedicated interfaces for contextualized diagnosis.

The standard dashboards for APM traces are available with consolidated or individual views under the DASHBOARDS section in AppTrace > AppService, AppHost or AppSummary, depending on the desired level of granularity.

The collected data can be viewed globally by selecting the desired category.
The data can also be viewed for a specific item in a category.

Further interfaces allow for granular exploration of requests under the DASHBOARDS section in AppTrace > Requests, AppTrace > Applications, AppTrace > L7 Map or AppTrace > Profiler.

When the Traceid column contains a magnifying glass icon, you can drill down to the APM trace to view the transaction details for a query.

The Requests AppTrace page provides a detailed analysis of application transactions, including the number of requests per minute per transaction, user satisfaction, and other application performance and behavior metrics.

The L7 Map AppTrace page provides a relational display of your various systems by section. It allows you to identify the various problems that could be encountered by the monitored applications. You can then use the architecture display to quickly find which server or service is causing the incident and resolve the problem as quickly as possible.

Network flows

What are host network flows?

NetTrace is ServicePilot’s technology that allows you to capture and analyze in depth the incoming and outgoing network exchanges of a machine (Windows / Linux / IBM z/OS). By monitoring the network flows of several servers, you can observe the exchanges between groups of hosts and between the application components of systems.

By monitoring network flows on servers and/or containers, you can analyze:

Who is talking to whom?
On which ports and protocols?
How often?
With what data volumes…

The ServicePilot Agent captures IP flows and produces structured network conversation summaries in addition to detailed real-time interfaces. The web interfaces provide clear and interactive visualizations of network communications within infrastructures.
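
As an illustration of what a conversation summary contains, the sketch below groups individual flow records by source, destination, port and protocol, then counts flows and bytes per conversation. The record layout and the sample addresses are hypothetical and do not reflect the agent's actual output format.

```python
from collections import defaultdict


def summarize(flows):
    """Aggregate (src_ip, dst_ip, dst_port, protocol, byte_count) records per conversation."""
    summary = defaultdict(lambda: {"flows": 0, "bytes": 0})
    for src, dst, port, proto, nbytes in flows:
        entry = summary[(src, dst, port, proto)]
        entry["flows"] += 1
        entry["bytes"] += nbytes
    return dict(summary)


# Illustrative flow records
flows = [
    ("10.0.0.5", "10.0.0.9", 5432, "tcp", 1200),
    ("10.0.0.5", "10.0.0.9", 5432, "tcp", 800),
    ("10.0.0.7", "10.0.0.5", 443, "tcp", 4000),
]
for conversation, stats in summarize(flows).items():
    print(conversation, stats)
```

Each key answers "who is talking to whom, on which port and protocol", and the counters answer "how often" and "with what data volumes".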

What is it used for?

NetTrace is an application-oriented network visibility tool that allows you to:

Map dependencies between applications, services, servers, or microservices.
Identify problems: latency, saturation, TCP retransmissions, errors, etc.
Detect abnormal or suspicious behavior: unexpected exchanges, non-standard ports, outbound traffic, etc.
Validate the compliance of network flows (with regard to security rules, segmentation, firewalls, or trusted zones).
Standardize the monitoring of system flows regardless of the hosting choice (Cloud, Hybrid, On-Premise).

Collecting network flow data

To collect network traces, install a ServicePilot Agent on the hosts to be monitored. Then create an auto-provisioning rule with the NetTrace box checked, in the ServicePilot interface under SETUP > Parameters > Auto-provisioning.

NetTrace data visualization

The collected data is available from several ServicePilot interfaces, offering both global dashboards and dedicated interfaces for contextualized diagnosis.

The standard dashboards for server and application flows are available with consolidated or individual views under the DASHBOARDS section in NetTrace > NetHost or NetProcess, depending on the desired level of granularity.

Further interfaces allow for granular exploration of conversations under the DASHBOARDS section in NetTrace > Conversations, NetTrace > L4 Map, NetTrace > Public or NetTrace > PCAP.

The Conversations NetTrace page allows you to view the various connections established between servers and/or applications monitored by ServicePilot Agents. The filter column allows you to target your searches by filtering by IP, port, protocol, etc. to view specific network traffic data (conversations, blocked connections, rejected connections, bytes per second, etc.).

The L4 Map NetTrace page creates a relational display of your various systems by section. It is possible to identify the various problems that could be encountered by the monitored systems on your network. You can then use the architecture display to quickly find which server or service is causing the incident and resolve the problem as quickly as possible.

The Public NetTrace page displays incoming/outgoing communications from/to public IP addresses.
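
The distinction between internal and public endpoints can be reproduced with Python's standard `ipaddress` module. The sketch below is only an illustration of the idea, not how ServicePilot classifies addresses; the function names and sample conversations are hypothetical.

```python
import ipaddress


def is_public(address):
    """True when the address is globally routable (not private, loopback, etc.)."""
    return ipaddress.ip_address(address).is_global


def public_conversations(conversations):
    """Keep the (source, destination) pairs that involve at least one public address."""
    return [(src, dst) for src, dst in conversations if is_public(src) or is_public(dst)]


conversations = [
    ("10.0.0.5", "10.0.0.9"),  # internal only
    ("10.0.0.5", "8.8.8.8"),   # outgoing to a public address
    ("1.1.1.1", "10.0.0.7"),   # incoming from a public address
]
print(public_conversations(conversations))
# [('10.0.0.5', '8.8.8.8'), ('1.1.1.1', '10.0.0.7')]
```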

The PCAP NetTrace page provides a precise and rapid visualization of all traffic passing through a network in real time. After selecting a network or host, it is possible to view the data and associated links in table form or in graphs, to obtain a real-time overview of the status of the selected network. The PCAP page also lets you capture network traffic on a machine at any time: enter filters (IP, ports, protocol, etc.) to create a PCAP trace, which is then downloaded automatically from the browser.

Real User Monitoring

What is Real User Monitoring?

Real User Monitoring (RUM) enables you to observe the performance and behavior of your web applications’ real users, directly from their browsers. Unlike Synthetic Monitoring, which relies on simulated tests, RUM measures the user experience as it actually is, taking into account network conditions, device type, geography and the client environment.

With RUM, ServicePilot collects valuable data such as:

Page load times
JavaScript errors encountered
Network and application performance
User geolocation
Types of browsers, OS, and screen resolutions

This makes it possible to understand, measure and improve the true user experience on an ongoing basis, from both a technical and a usability point of view.

What is RUM session replay?

Session Replay lets you record and replay user interactions with your web application such as clicks, mouse movements, navigation and errors. It complements Real User Monitoring by providing a visual player for behavioral analysis of the user experience.

This makes it possible to revisit a user’s journey to effectively diagnose usability problems, performance issues or functional bugs.

Collecting RUM data

To collect Real User Monitoring (RUM) metrics on your web pages, you need to integrate the ServicePilot RUM script.

Depending on your environment, several integration methods are possible:

Java application servers (Tomcat, Jetty). Use a dedicated ServicePilot plugin to automatically inject the RUM script into the HTML responses generated by your applications. No manual code modification is required. This enables seamless, centralized integration into classic Java Web environments, based on JSP, servlets or frameworks such as Spring MVC.
Web servers / Proxies (Apache, NGINX, IIS…). Configure your servers or proxies to modify the HTML pages served, by dynamically injecting the RUM script. This method may be preferable when you cannot modify the application code but control the web delivery layer. For example, IIS allows you to use the URL Rewrite extension with an HTML injection module.
Static web pages or SPA applications. Manually add the RUM script to the source code of your web pages, ideally in the <head> section. This is suitable for static HTML sites, Single Page Applications (React, Angular, Vue.js…) or CMS integrations (WordPress, Drupal…). Manual insertion into the code also enables fine instrumentation, page by page or conditional depending on the environment.
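
The injection performed by a plugin or proxy can be pictured as a small HTML rewrite. The sketch below is a simplified illustration only: the script URL is a placeholder, and the real ServicePilot script and its exact form (external file or inline snippet) come from the interface, not from this example. It inserts a script tag just before the closing `</head>` tag and leaves the page untouched if the tag is already present or if there is no `<head>` section.

```python
import re


def inject_rum(html, script_url):
    """Insert a script tag before </head>; return html unchanged when not applicable."""
    tag = f'<script src="{script_url}" async></script>'
    if tag in html:
        return html  # already injected
    # A function is used as the replacement so the tag is inserted literally.
    new_html, count = re.subn(
        r"</head\s*>",
        lambda match: tag + match.group(0),
        html,
        count=1,
        flags=re.IGNORECASE,
    )
    return new_html if count else html


# Placeholder URL, not the real ServicePilot RUM script location.
RUM_URL = "https://rum.example.invalid/rum.js"
page = "<html><head><title>Shop</title></head><body></body></html>"
print(inject_rum(page, RUM_URL))
```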

Where to get the RUM script?

Detailed instructions for RUM instrumentation are available from the ServicePilot interface under SETUP > Parameters > APM rules > RUM instrumentation. Here you will find the ready-to-use script, plus configuration options tailored to your specific use cases.

Activating session replay

Once the corresponding option has been activated in the RUM JavaScript configuration, the updated script deployed on the targeted pages will collect sessions from users of the supervised application.

RUM & session replay data visualization

The data collected is available from several ServicePilot interfaces, offering both global dashboards and dedicated interfaces for contextualized diagnosis.

Standard ServicePilot RUM dashboards with consolidated or individual views are available under the DASHBOARDS section in User Experience > Rum.

When looking at RUM Requests Details, the Replay column contains an icon when a Session Replay is available.

Synthetic monitoring

What is synthetic monitoring?

Synthetic Monitoring is a monitoring technique that simulates one or more user actions on a website, independently of real traffic. Testing critical pages or user paths at regular time intervals enables you to monitor the availability, performance and functional behavior of Web services.

Collecting synthetic monitoring data

ServicePilot offers several packages for implementing synthetic monitoring tailored to your needs:

ServicePilot webcheck - HTTP(S) checks

The user-webcheck package enables you to monitor server responses using an HTTP(S) query issued by a ServicePilot Agent:

Collects HTTP code, response time, SSL certificate information.
Supports GET/POST requests, with customized headers, expected HTTP code…
Extracts numerical data from the page (e.g. number of elements, counter value).

Although seemingly basic, this package can become a powerful monitoring tool when deployed strategically. By increasing the number of test points (via ServicePilot Agents positioned in different geographical areas, behind proxies, or on networks with variable latency), it is possible to obtain a fairly realistic and distributed view of the user experience.
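
The logic of such a check can be sketched with the Python standard library. This is not the user-webcheck package itself; the function names, the result fields and the example pattern are hypothetical. The check fetches a URL, measures the response time, compares the HTTP code with the expected one and optionally extracts a number from the page. Connection failures (DNS errors, refused connections) are left to propagate as exceptions.

```python
import re
import time
import urllib.error
import urllib.request


def evaluate(status, body, elapsed_ms, expected_status=200, pattern=None):
    """Turn a raw response into a check result; pattern must contain one numeric group."""
    value = None
    if pattern:
        match = re.search(pattern, body)
        if match:
            value = float(match.group(1))
    return {
        "status": status,
        "ok": status == expected_status,
        "response_ms": elapsed_ms,
        "value": value,
    }


def web_check(url, expected_status=200, pattern=None, timeout=10.0):
    """Issue one GET request and evaluate the response."""
    start = time.perf_counter()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            status = response.status
            body = response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as error:  # 4xx/5xx still carry a status and body
        status = error.code
        body = error.read().decode("utf-8", errors="replace")
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return evaluate(status, body, elapsed_ms, expected_status, pattern)
```

A call such as `web_check("https://www.example.com/", pattern=r"items: (\d+)")` would return the status, the response time and, if the page contains a matching counter, its numeric value.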

ServicePilot web-scenario - multi-step scenarios

The user-web-scenario package allows you to monitor server response times via a series of HTTP(S) requests issued by the ServicePilot Agent. Each step of the scenario can also be customized as required.

All requests are executed by the ServicePilot Agent at regular intervals, providing continuous monitoring of performance.
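
A multi-step scenario differs from a single check mainly in its sequencing. The sketch below is an assumption about how such a runner could be structured, not the user-web-scenario package: each step is a hypothetical `(name, url, expected_status)` tuple, steps run in order, each is timed, and the run stops at the first step that does not return the expected code.

```python
import time
import urllib.error
import urllib.request


def http_status(url, timeout=10.0):
    """Return the HTTP status code of a GET request, including 4xx/5xx codes."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code


def run_scenario(steps, fetch=http_status):
    """Run (name, url, expected_status) steps in order; stop at the first failure."""
    results = []
    for name, url, expected in steps:
        start = time.perf_counter()
        status = fetch(url)
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        ok = status == expected
        results.append({"step": name, "status": status, "ok": ok, "response_ms": elapsed_ms})
        if not ok:
            break  # later steps depend on this one
    return results


# Illustrative steps; the URLs are placeholders.
steps = [
    ("home", "https://shop.example.invalid/", 200),
    ("cart", "https://shop.example.invalid/cart", 200),
    ("pay", "https://shop.example.invalid/pay", 200),
]
```

The `fetch` parameter makes it possible to replace the real HTTP call with a stub when trying the runner out, for example `run_scenario(steps, fetch=lambda url: 200)`.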

Integration of external functional tests

To complement the native packages, ServicePilot offers various standard packages for integrating functional test results from third-party tools. These results can be fed back into ServicePilot in the form of time-stamped reports, enriching dashboards with automated test data.

3rd Party Software	Description	ServicePilot Packages
Lighthouse	Automated auditing tool developed by Google to evaluate the performance, accessibility, SEO and best practices of web pages.	Lighthouse Integration
Puppeteer	Node.js library for automating a Chrome or Chromium browser via a high-level API, even in SPAs and on dynamic content. It can be used to emulate complex scenarios involving navigation, clicks, input, delays and screenshots.	Puppeteer Integration
NightWatchJS	E2E framework based on Node.js and Selenium. Ideal for validating critical flows with assertions (presence of text, HTTP status, completed fields, etc.).	NightWatchJS Integration
Playwright	Cross-browser solution for testing on Chrome, Firefox, Safari. Supports parallel testing, visual assertions and rich interactions (drag & drop, uploads…).	Playwright Integration
SikuliX	Uses visual recognition to automate GUI-based interactions. Very useful when DOM elements are inaccessible or dynamic (ideal for legacy or non-HTML applications).	SikuliX Integration

Synthetic monitoring data visualization

ServicePilot offers standard dashboards, with consolidated or individual views of data under the DASHBOARDS section in User Experience > Resources > Web-Scenario or WebCheck.

Synthetic Monitoring data from third-party tools are centralized in dedicated dashboards under the DASHBOARDS section in Appmon > Resources > [package name].

Application logs

Application, system and security logs are an important source for in-depth incident diagnosis or for enriching alerts with context. In a modern approach to observability, logs complement application traces, metrics and synthetic tests.

ServicePilot supports ingesting major log formats, allowing it to seamlessly collect and normalize data from diverse applications and infrastructure. Whether logs are produced in W3C IIS web server format, traditional Syslog streams, Windows Events from Microsoft environments or other common standards, ServicePilot unifies them into a consistent analytics pipeline. This flexibility ensures that teams can monitor, correlate and analyze events across heterogeneous environments.

Among all the different log formats (W3C, Syslog, Windows event logs, CEF, LEEF, etc.), the main types of logs can be grouped into a set of functional categories. These categories describe what the logs represent, regardless of their format.

Log type	Description
Access logs	Record who accessed a service, when, from where and how. Common in W3C, CLF, JSON, and proxy logs.
Event logs	Capture system or application events such as service starts, configuration changes or warnings. Typical in Syslog and Windows Event Logs.
Error logs	Report failures, exceptions, crashes and unexpected behavior. Present in nearly all formats.
Security logs	Authentication, authorization, firewall events, intrusion attempts. Often structured in CEF, LEEF, Syslog and Windows Security logs.
Audit logs	Track administrative actions, configuration changes and compliance‑relevant operations. Common in cloud logs, database logs and Windows Event Logs.
Transaction logs	Represent business or database operations (payments, data writes, commits). Found in JSON logs, DB logs and application‑specific formats.
Performance logs	Metrics, latency, throughput, resource usage. Often emitted as JSON or structured Syslog.
Debug logs	Highly detailed internal information for troubleshooting. Common in JSON and application‑specific formats.

Centralizing logs in ServicePilot enables you to:

Quickly search your system or application logs.
Correlate logs with network traces, alerts and events.
Have a detailed history of each technical event for audits, troubleshooting or analysis of abnormal behavior.

Collecting logs from W3C sources

ServicePilot supports the ingestion of W3C format logs, used in particular by IIS web servers or some load balancers and proxies. These logs can include information on HTTP requests, status, processing times and source IP addresses. If the site is already instrumented with RUM, W3C logs can be enriched with customized request headers to enhance user tracking.

The apptrace-appservice-w3c package is designed to automatically collect W3C logs present on the server, according to a path defined during configuration.

In particular, it enables you to:

View incoming HTTP requests.
Analyze response times.
Identify application errors (4xx/5xx codes).
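
To show the kind of information these logs carry, here is a minimal sketch of reading a W3C extended log with the Python standard library. It is not the apptrace-appservice-w3c package. It assumes the log declares its columns in a `#Fields:` directive and includes `sc-status` and `time-taken` (in milliseconds, as IIS writes it); the sample lines are invented.

```python
def parse_w3c(lines):
    """Yield one dict per request line, using the column names from '#Fields:'."""
    fields = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            if line.startswith("#Fields:"):
                fields = line[len("#Fields:"):].split()
            continue  # other directives (#Software, #Date...) are skipped
        values = line.split()
        if fields and len(values) == len(fields):
            yield dict(zip(fields, values))


def summarize_requests(entries):
    """Count requests and 4xx/5xx errors, and average the time-taken column."""
    total = errors = time_sum = 0
    for entry in entries:
        total += 1
        if entry.get("sc-status", "").startswith(("4", "5")):
            errors += 1
        time_sum += int(entry.get("time-taken", "0"))
    average = time_sum / total if total else 0.0
    return {"requests": total, "errors": errors, "avg_time_taken_ms": average}


# Invented sample in W3C extended log format
sample = """#Software: Microsoft Internet Information Services 10.0
#Fields: date time cs-method cs-uri-stem sc-status time-taken
2024-05-01 10:00:00 GET /index.html 200 15
2024-05-01 10:00:01 GET /missing 404 5
2024-05-01 10:00:02 POST /api/order 500 40
""".splitlines()

print(summarize_requests(parse_w3c(sample)))
# {'requests': 3, 'errors': 2, 'avg_time_taken_ms': 20.0}
```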

W3C logs data visualization

The standard dashboards for W3C Logs are available with consolidated or individual views under the DASHBOARDS section in AppTrace > appservice-w3c depending on the desired level of granularity.

Further interfaces allow for granular exploration of requests under the DASHBOARDS section in AppTrace > Applications, AppTrace > Requests or AppTrace > L7 Map.
