Much of an organization's operational and security information can be derived from the log files generated by the company's servers, equipment and applications.
Potential sources of security logs are:
ServicePilot not only collects logs and events (Syslogs, Traps, W3C logs, etc.) but also offers several interfaces to speed up log processing and analysis: dashboards, PDF reports, alarms, map integrations, customizable interfaces, and real-time and historical event loggers.
Each new source or collector deployment, such as the reception of Syslogs or Windows Events from new equipment, is automatically integrated into the standard dashboards specific to each technology. Filters, calendar-based time selection and dashboards built from custom widgets make it easy to narrow down what you are looking for and to quickly identify anomalies in the mass of links that make up a company's IT security perimeter.
The following screenshot shows a standard dashboard for Windows Events analysis with a top N per server, from which you can drill down to a particular Windows Event to investigate.
Locate your data sources, then configure ServicePilot through YAML or the web interface to send or receive logs and security events so that they are managed in a unified way. Otherwise it is difficult to juggle logs stored locally on several machines, or to grep across complex, distributed environments, to understand what is happening.
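As an illustration of the sending side, here is a minimal sketch of an application forwarding its security events to a central Syslog collector. It assumes the collector accepts standard Syslog over UDP. It is not ServicePilot configuration, and the logger name, the "myapp" tag and the function name are hypothetical.

```python
import logging
import logging.handlers


def build_syslog_logger(host: str, port: int = 514) -> logging.Logger:
    """Return a logger that forwards records to a Syslog collector over UDP."""
    logger = logging.getLogger("security-forwarder")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers if called more than once
    handler = logging.handlers.SysLogHandler(
        address=(host, port),
        facility=logging.handlers.SysLogHandler.LOG_AUTH,
    )
    # "myapp" is a placeholder tag identifying the sending application.
    handler.setFormatter(logging.Formatter("myapp: %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


# Usage (the collector address is an assumption):
# log = build_syslog_logger("192.0.2.10")
# log.warning("login failed for user %s", "admin")
```

Sending from the application rather than reading files on each machine is one way to avoid the local juggling described above. A file-based agent is an equally valid choice.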
Here are the typical locations of the logs:
To simplify analysis and subsequent correlation, a good method is to create a view (a kind of container or box containing disparate and/or homogeneous elements that you want to analyze) of the "Analyse-CVE-abc" or "SecurityRoutine-xyz" type.
If these sources are already collected in ServicePilot, we can simply create a shortcut to the source or to an Object Search (an automated log search that can perform operations such as count, sum, etc.) to avoid duplication problems.
It is essential to be able to minimize "noise" by removing repetitive and routine entries from the log file after confirming that they are benign.
Through simple filters and queries, we can narrow down the number of events and gradually minimize the "noise" and minor events that disrupt the analysis. Checkboxes or simple query filters can be used to reduce the volume of events to read through (especially for Windows events).
Several steps and techniques make it possible to find the needle in the log haystack and to easily understand the dependencies and impacts of an incident: at provisioning time, through analyses from dashboards, through event trays, or through machine learning queries in the ServicePilot big data search engine (for example, machine learning for the analysis of significant terms and anomalies in logs and events).
ServicePilot handles timestamps and time zones for you by indexing all data in UTC and displaying times in the correct time zone based on the user settings of the web browser used to access ServicePilot. No headaches: it's done for you!
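The noise-reduction idea can be sketched outside any particular tool as a filter that drops lines matching patterns already confirmed as benign. The patterns and sample lines below are hypothetical and only serve the illustration.

```python
import re
from typing import Iterable, Iterator, List

# Hypothetical patterns, assumed to have been reviewed and confirmed as benign.
BENIGN_PATTERNS: List[re.Pattern] = [
    re.compile(r"CRON\[\d+\]: .*session (opened|closed)"),
    re.compile(r"health[- ]?check", re.IGNORECASE),
]


def drop_noise(lines: Iterable[str], patterns=BENIGN_PATTERNS) -> Iterator[str]:
    """Yield only the lines that match none of the benign patterns."""
    for line in lines:
        if not any(pattern.search(line) for pattern in patterns):
            yield line


sample = [
    "Jan 10 03:00:01 web01 CRON[4242]: pam_unix(cron:session): session opened for user root",
    "Jan 10 03:00:07 web01 sshd[991]: Failed password for invalid user admin from 203.0.113.7",
    "Jan 10 03:00:09 web01 nginx: GET /health-check 200",
]
remaining = list(drop_noise(sample))
print(remaining)  # only the sshd line is left
```

Keeping the benign patterns in one reviewed list makes it easier to audit what is being hidden from the analysis.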
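To show why indexing in UTC simplifies correlation, here is a minimal sketch that normalizes timestamps from sources in different zones and converts them back only for display. It is an illustration of the principle, not ServicePilot's implementation. It uses fixed UTC offsets for simplicity, whereas named zones with daylight saving rules would be needed in practice.

```python
from datetime import datetime, timedelta, timezone


def to_utc(iso_text: str) -> datetime:
    """Parse an ISO 8601 timestamp with an explicit offset and normalize it to UTC."""
    parsed = datetime.fromisoformat(iso_text)
    if parsed.tzinfo is None:
        raise ValueError("timestamp must carry a UTC offset")
    return parsed.astimezone(timezone.utc)


def for_display(utc_dt: datetime, offset_hours: int) -> str:
    """Render a UTC datetime in a viewer's fixed-offset zone."""
    viewer_tz = timezone(timedelta(hours=offset_hours))
    return utc_dt.astimezone(viewer_tz).strftime("%Y-%m-%d %H:%M:%S %z")


# Two events logged at the same instant by servers in different zones.
paris_event = to_utc("2024-01-15T09:30:00+01:00")
tokyo_event = to_utc("2024-01-15T17:30:00+09:00")

print(paris_event == tokyo_event)    # True: both are 08:30 UTC
print(for_display(paris_event, -5))  # 2024-01-15 03:30:00 -0500
```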
Recent changes, failures, errors, status changes, access and administration events and other unusual environmental events must be monitored.
Under Linux, you can search for many interesting keywords among the syslogs in order to detect them automatically:
Pre-built queries containing this type of search are available as standard. They produce not only simple tops but also the same top processed by a machine learning algorithm that analyzes significant terms to surface anomalies (top with proportion ranking).
Under Windows, event IDs are the main mechanism for skimming events quickly. Most of the events below are in the Security log; many of them are only recorded on the domain controller, and some must be enabled because they are not recorded by default. However, it is easy to make tops on:
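The difference between a simple top and a proportion-based ranking can be sketched as follows. This is a deliberately simplified scoring (share of lines in the window under investigation divided by the smoothed share in a reference period), not the algorithm ServicePilot uses. The function names and sample lines are hypothetical.

```python
import re
from collections import Counter


def term_counts(lines):
    """Count, for each term, the number of lines in which it appears."""
    counts = Counter()
    for line in lines:
        counts.update(set(re.findall(r"[a-z]{3,}", line.lower())))
    return counts


def significant_terms(foreground, background, top_n=3):
    """Rank terms by how over-represented they are in the foreground lines."""
    fg, bg = term_counts(foreground), term_counts(background)
    n_fg, n_bg = len(foreground), len(background)
    scored = []
    for term, fg_hits in fg.items():
        fg_share = fg_hits / n_fg
        bg_share = (bg.get(term, 0) + 1) / (n_bg + 1)  # add-one smoothing
        scored.append((term, fg_share / bg_share))
    scored.sort(key=lambda item: (-item[1], item[0]))  # score first, then name
    return scored[:top_n]


background = ["sshd accepted password for deploy"] * 8
foreground = ["sshd accepted password for deploy"] + ["sshd failed password for root"] * 3

print(term_counts(foreground).most_common(3))           # frequent but ordinary terms
print(significant_terms(foreground, background, top_n=2))  # "failed" and "root" lead
```

A plain frequency top is dominated by terms that are always present, while the ratio highlights terms that are unusual for the period.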
The screenshot below shows the tracking of failed connections on multiple Windows hosts over time.
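The same kind of tracking can be sketched in a few lines once events have been parsed. The example assumes records already reduced to a host, an event ID and a timestamp (hypothetical field names), and counts event ID 4625, which Windows records in the Security log when an account fails to log on.

```python
from collections import Counter
from datetime import datetime

FAILED_LOGON_EVENT_ID = 4625  # Windows Security log: an account failed to log on


def failed_logons_per_hour(events):
    """Count failed logons per (host, hour) from already-parsed event records."""
    buckets = Counter()
    for event in events:
        if event["event_id"] != FAILED_LOGON_EVENT_ID:
            continue
        hour = datetime.fromisoformat(event["time"]).strftime("%Y-%m-%d %H:00")
        buckets[(event["host"], hour)] += 1
    return buckets


# Hypothetical sample records; host names are placeholders.
events = [
    {"host": "srv-dc01", "event_id": 4625, "time": "2024-01-15T08:05:00"},
    {"host": "srv-dc01", "event_id": 4625, "time": "2024-01-15T08:40:00"},
    {"host": "srv-dc01", "event_id": 4624, "time": "2024-01-15T08:41:00"},
    {"host": "srv-app02", "event_id": 4625, "time": "2024-01-15T09:10:00"},
]

for (host, hour), count in sorted(failed_logons_per_hour(events).items()):
    print(host, hour, count)
```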
ServicePilot also supports the analysis of Windows Sysinternals Sysmon logs, which record detailed Windows Events on process creations, network connections, registry events, file creations and much more.
On network devices, it is necessary to check incoming and outgoing activities. The examples below show extracts from Cisco ASA logs (%ASA); other devices have similar features.
The example below tracks successful vs. failed authentications of remote workers on a Cisco VPN.
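A rough sketch of how such a success vs. failure count could be derived from raw ASA messages is shown here. It relies only on the `%ASA-<severity>-<message ID>:` prefix. The mapping of message IDs 113004 and 113005 to AAA success and rejection is an assumption to verify against your device's documentation, and the sample lines are illustrative rather than taken from a real device.

```python
import re
from collections import Counter

ASA_PREFIX = re.compile(r"%ASA-(?P<severity>\d)-(?P<msg_id>\d{6}):")

# Assumed mapping of AAA message IDs to outcomes; check your device's documentation.
AUTH_OUTCOMES = {"113004": "success", "113005": "failure"}


def count_auth_outcomes(lines):
    """Count authentication successes and failures among ASA syslog lines."""
    outcomes = Counter()
    for line in lines:
        match = ASA_PREFIX.search(line)
        if match and match["msg_id"] in AUTH_OUTCOMES:
            outcomes[AUTH_OUTCOMES[match["msg_id"]]] += 1
    return outcomes


# Illustrative lines only; users and addresses are placeholders.
sample = [
    "%ASA-6-113004: AAA user authentication Successful : server = 192.0.2.5 : user = alice",
    "%ASA-6-113005: AAA user authentication Rejected : reason = AAA failure : server = 192.0.2.5 : user = bob",
    "%ASA-6-113005: AAA user authentication Rejected : reason = AAA failure : server = 192.0.2.5 : user = bob",
    "%ASA-6-302013: Built outbound TCP connection",
]
print(dict(count_auth_outcomes(sample)))
```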
On web servers, it is necessary to pay attention to many parameters and indicators in order to easily locate unusual activities among thousands or millions of requests:
The objective is to take the pre-integrated queries included in ServicePilot, customize them for our environment, and automatically generate PDF reports or detailed dashboards for in-depth analysis and correlation.
For example, we could build several generic reports, one for each type of data source, plus a very high-level overall report for a quick review, as well as several dashboards for quick access to detailed analysis.
Building a custom global dashboard that monitors three different sources (e.g. syslogs, Windows events, and Suricata alerts in Syslog format) and making it my home page is relatively simple with the drag-and-drop functions of the web interface. Building a report follows the same principles and reuses your saved widgets and dashboard graphs.
Having centralized my sources and logs in a "SecurityRoutine01" view, I only have to apply the filter view: "SecurityRoutine01" to restrict any query, dashboard or report to the necessary sources.
Having already created custom widgets targeting the resources I want, I don't need to filter events on a host, view, application or resource. I can also create new queries to extend my analysis to the application level (SharePoint, Exchange...) and middleware by monitoring key security events in Microsoft SQL Server, for example:
The ServicePilot architecture and its native Machine Learning features also improve the analysis and reduce monitoring noise. ServicePilot ships with pre-built queries that make it faster to find what is needed in each technology, for example to run significant-term and anomaly analyses on logs and events.
After saving all queries in the previous step for automatic export to dashboards and reports, we can use the calendar in the dashboards or generate PDF reports for the selected time period.
In the dashboard creation menu, we can easily divide a dashboard into three parts, each representing a log type, a performance metric, an activity overview, a security query, etc., according to our needs.
The calendar function in the dashboard view and the cross-graph mouse pointer make it easy to see the complete picture, correlate events across IT log sources, and cut through the "noise" to focus on impact analysis and on the correlation of heterogeneous logs and metrics.
This can be done using dashboards, queries through the search engine or PDF reports.
To develop theories about what happened once the logs, events and sources have been confirmed and selected for post-incident forensic analysis, we can build a blank report template in which to describe the theory. This template can be based on well-known models such as SANS incident response reports or pentest incident reports, which unify and standardize the process of analyzing security incidents. We can even create security incident documentation to simplify building the query and to enrich our automated weekly reports with a new automatic security intelligence review.
Once the security incident is closed, we can automate the analysis of this incident with a query and include it in a weekly PDF report (e.g. every Monday at 8 a.m., a "Security Weekly Check" PDF is sent to the teams). Routine checks can easily be automated in the same way, with scheduled PDF reports containing the results of the queries listed earlier to detect abnormal activity across the entire IT stack.
It is also useful to analyze the impact of security incidents on IT operational performance during the incident period, which is easily achieved with dedicated dashboards and calendars. This makes it easy to answer questions such as: what was the impact of security event abc on my application's response time? The following screenshot shows a DevSecOps view for monitoring an application in production.
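The response-time question can be reduced to a simple comparison, sketched here under the assumption that response-time samples are available as (timestamp, milliseconds) pairs. The function name and the sample values are hypothetical.

```python
from datetime import datetime
from statistics import mean


def response_time_impact(samples, incident_start, incident_end):
    """Compare mean response time inside and outside an incident window."""
    inside = [ms for ts, ms in samples if incident_start <= ts <= incident_end]
    outside = [ms for ts, ms in samples if not incident_start <= ts <= incident_end]
    if not inside or not outside:
        raise ValueError("need samples both inside and outside the window")
    baseline, during = mean(outside), mean(inside)
    return {"baseline_ms": baseline, "incident_ms": during, "ratio": during / baseline}


# Hypothetical measurements taken every five minutes.
samples = [
    (datetime(2024, 1, 15, 10, 0), 120),
    (datetime(2024, 1, 15, 10, 5), 130),
    (datetime(2024, 1, 15, 10, 10), 480),
    (datetime(2024, 1, 15, 10, 15), 520),
    (datetime(2024, 1, 15, 10, 20), 125),
]
impact = response_time_impact(
    samples,
    incident_start=datetime(2024, 1, 15, 10, 8),
    incident_end=datetime(2024, 1, 15, 10, 17),
)
print(impact)  # mean of 500 ms during the incident against a 125 ms baseline
```

A ratio well above 1 suggests that the security event coincided with degraded performance. It does not by itself prove causation.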