Components, best practices, and next-gen capabilitiesRead More
Log aggregation is the process of collecting logs from multiple computing systems, parsing them and extracting structured data, and putting them together in a format that is easily searchable and explorable by modern data tools.
There are four common ways to aggregate logs – many log aggregation systems combine multiple methods.
A standard logging protocol. Network administrators can set up a Syslog server that receives logs from multiple systems, storing them in an efficient, condensed format which is easily queryable.
Log aggregators can directly read and process Syslog data.
Protocols like SNMP, Netflow and IPFIX allow network devices to provide standard information about their operations, which can be intercepted by the log aggregator, parsed and added to central log storage.
Software agents that run on network devices, capture log information, parse it and send it to a centralized aggregator component for storage and analysis.
Log aggregators can directly access network devices or computing systems, using an API or network protocol to directly receive logs. This approach requires custom integration for each data source.
Log processing is the art of taking raw system logs from multiple sources, identifying their structure or schema, and turning them into a consistent, standardized data source.
Each log has a repeating data format which includes data fields and values. However, the format varies between systems, even between different logs on the same system.
A log parser is a software component that can take a specific log format and convert it to structured data. Log aggregation software includes dozens or hundreds or parsers written to process logs for common systems.
Normalization merges events containing different data into a reduced format which contains common event attributes. Most logs capture the same basic information – time, network address, operation performed, etc.
Categorization involves adding meaning to events – identifying log data related to system events, authentication, local/remote operations, etc.
Log enrichment involves adding important information that can make the data more useful.
For example, if the original log contained IP addresses, but not actual physical locations of the users accessing a system, a log aggregator can use a geolocation data service to find out locations and add them to the data.
Modern networks generate huge volumes of log data. To effectively search and explore log data, there is need to create an index of common attributes across all log data.
Searches or data queries that use the index keys can be an order of magnitude faster, compared to a full scan of all log data.
Because of the massive volumes of logs, and their exponential growth, log storage is rapidly evolving. Historically, log aggregators would store logs in a centralized repository. Today, logs are increasingly stored on data lake technology, such as Amazon S3 or Hadoop.
Data lakes can support unlimited storage volumes with low incremental storage cost, and can provide access to the data via distributed processing engines like MapReduce, or modern high performance analytics tools.
Almost every computing system generates logs. Below are a few of the most common sources of log data.
An endpoint is a computing device within a network – such as a desktop, laptop, smartphone, server or workstation. Endpoints generate multiple logs, from different levels of their software stack – hardware, operating system, middleware and database, and applications. Endpoint logs are taken from the lower levels of the stack, and used to understand the status, activity and health of the endpoint device.
Network devices like routers, switches and load balancers are the backbone of network infrastructure. Their logs provide critical data about traffic flows, including destinations visited by internal users, sources of external traffic, traffic volumes, protocols used, and more. Routers typically transmit data via the Syslog format, and data can be captured and analyzed via your network’s Syslog servers.
Applications running on servers or end user devices generate and log events. The Windows operating system provides a centralized event log that collects startup, shutdown, heartbeat and run-time error events from running applications. In Linux, application log messages can be found in the /var/log folder. In addition, log aggregators can directly collect and parse logs from enterprise applications, such as email, web or database servers.
A new and growing source of log data is Internet of Things (IoT) connected devices. IoT devices may log their own activity and/or sensor data captured by the device. IoT visibility is a major challenge for most organizations, as many devices have no logging at all, or save log data to local file systems, limiting the ability to access or aggregate it. Advanced IoT deployments save log data to a central cloud service; many are adopting a new log collection protocol, syslog-ng, which focuses on portability and central log collection.
Many networks maintain a transparent proxy, providing visibility over traffic of internal users. Proxy server logs contain requests made by users and applications on a local network, as well as application or service requests made over the Internet, such as application updates. To be useful, proxies must be enforced across all, or at least critical segments, of user traffic, and measures must be in place to decrypt and interpret HTTPS traffic.
Common log formats: CSV, JSON, key value pair , Common Event Format (CEF)
5:39:55 → Time [Fname, Lname, name@company] → User Credentials Sign-in Failed → Authentication Event 18.104.22.168 → IP /app/office365 → App User Signed Into
MachineName → User’s host Message → The event is a Kerberos service ticket (user already authenticated and sending access request for specific service) TimeGenerated → Time of event TargetUserName → Username attempting to login TargetDomainName → Domain user attempted to login to ServiceName → Service user attempted to log into
CEF is an open log management standard that makes it easier to share security-related data from different network devices and applications. It also provides a common event log format, making it easier to collect and aggregate log data. CEF uses the syslog message format.
CEF:Version|Device Vendor|Device Product|Device Version|Signature
Bracket enclosing Trend Micro .. 3.5.4 → Uniquely identifies the sending device. No two products may use the same vendor-product pair. 600 → Unique identifier per event type, for example in IDS systems each signature or rule has a unique Signature ID 4 → Severity of the event from 1-10 Suser=Master.. → a collection of key-value pairs which allow the log entry to contain additional info, from an extensive Extension Dictionary including events like deviceAction, ApplicationProtocol, deviceHostName, destinationAdress and DestinationPort, or custom events.
Jan 18 11:07:53 dsmhost CEF:0|Trend Micro|Deep Security Manager|3.5.4|600|Administrator Signed In|4|suser=Master…
There is a wealth of information in log files that can help identify problems and patterns in production systems. Log monitoring involves scanning log files, searching for patterns, rules or inferred behavior that indicates important events, and triggering an alert sent to operations or security staff.
Log monitoring can help identify problems before they are experienced by users. It can uncover suspicious behavior that might represent an attack on organizational systems. It can also help record baseline behavior of devices, systems or users, in order to identify anomalies that require investigation.
Log aggregation and log monitoring is a central activity for security teams. Collecting log information from critical systems and security tools, and analyzing those logs, is the most common way to identify anomalous or suspicious events, which might represent a security incident.
The two basic concepts of security log management are events and incidents—an event is something that happens on a network on an endpoint device. One or more events can be identified as an incident—an attack, violation of security policies, unauthorized access, or change to data or systems without the owner’s consent.
In the security world, the primary system that aggregates logs, monitors them and generates alerts about possible security systems, is a Security Information and Event Management (SIEM) solution.
SIEM platforms aggregate historical log data and real-time alerts from security solutions and IT systems like email servers, web servers and authentication systems.
They analyze the data and establish relationships that help identify anomalies, vulnerabilities and incidents. The SIEM’s main focus is on security-related events such as suspicious logins, malware or escalation of privileges.
The SIEM’s goal is to identify which events has security significance and should be reviewed by a human analyst, and send notifications for those events. Modern SIEMs also provide extensive dashboards and data visualization tools, allowing analysts o actively seek data points that might indicate a security incident—known as threat hunting.
Traditionally, the SIEM used two techniques to generate alerts from log data: correlation rules, specifying a sequence of events that indicates an anomaly, which could represent a security threat, vulnerability or active security incident; and vulnerabilities and risk assessment, which involves scanning networks for known attack patterns and vulnerabilities.
The drawback of these older techniques is that they generate a lot of false positives, and are not successful at detecting new and unexpected event types
Advanced SIEMs use technology called User Event Behavioral Analytics (UEBA). UEBA leverages machine learning to look at patterns of human behavior, automatically establish baselines, and intelligently identify suspicious or anomalous behavior.
This can help detect risks that are unknown or difficult to define with correlation rules, such as insider threats, targeted attacks, fraud, and anomalies across long periods of time or across multiple organizational systems.
Traditionally, monitoring and security efforts focused on network traffic to identify threats. Today, there is a growing focus on endpoints, such as desktop computers, servers and mobile devices. Endpoints are frequently targeted by threat actors who can bypass traditional security measures—for example, a laptop forgotten on a train can be stolen by an attacker and used to penetrate organizational systems. Without careful monitoring of the laptop’s activity, this and similar attacks could go undetected.
The Windows operating system provides an event logging protocol that allows applications, and the operating system itself, to log important hardware and software events. The events can be viewed directly by an administrator using the Windows Event Viewer.
Which events are logged?
Events logged in Windows event logs include application installations, security management (see Windows Security Logs below), initial startup operations, and problems or errors. All these event types can have security significance, and should be monitored by log aggregation and monitoring tools.
Example of Windows Event Log
Warning 5/11/2018 10:29:47 AM Kernel-Event Tracing 1 Logging
The Windows Security Log is a part of the Windows Event Log framework. It contains security-related events specified by administrators using the system’s audit policy. Microsoft describes the Security Log as “Your Best and Last Defense” when investigating security breaches on Windows systems.
Which events are logged?
The following types of Windows log events can be defined as security events: account logon, account management, directory service access, logon, object access (for example, file access), policy change, privilege use, tracking of system processes, system events.
Unlike Windows and Linux, the iOS operating system does not log system and application events by default, with the exception of application crash reports. iOS 10.0 onwards offers a logging API that allows specific applications to log application events and store them to a centralized location on disk. Log messages can be viewed using the Console app of the log command-line tool.
Because iOS does not provide convenient remote access to logs, several third-party solutions have emerged that allow for remote collection and aggregation of iOS logs.
Linux logs record a timeline of events that occur in the Linux operating system and applications. Central system logs are stored in the /var/log directory, and logs for specific applications may be stored in the application folder, for example ‘~/.chrome/Crash Reports’ for Google Chrome.
Which events are logged?
There are Linux log files for system events, kernel, package managers, boot processes, Xorg, Apache, MySQL, and other common services. As in Windows, all these events could possibly have security significance.
Which are the most critical Linux logs to monitor?
Endpoint Detection and Response (EDR) technology helps to detect, investigate and mitigate security incidents on organizational endpoints. EDR is complementary to traditional endpoint tools such as antivirus, Data Loss Prevention (DLP) and SIEM. EDR technology provides visibility into events taking place on endpoints, including application access and activity, operating system operations, creation, modification, copying and movement of data, memory usage, and user access to predefined sensitive data.
EDR systems provide aggregated logs that allow security teams to analyze and explore events from across the enterprise endpoint portfolio.
Symantec Endpoint Protection is a security suite that includes intrusion prevention, firewall, and anti-malware. Endpoint Protection logs contains information about configuration changes, security-related activities such as virus detections, errors on specific endpoints, and traffic that enters and exits the endpoint.
Which events are logged?
Symantec Endpoint Protection log types include:
McAfee Endpoint Security provides centralized management for endpoint devices, anti-malware protection, application containment, web security, threat forensics and machine learning analysis for detection of unknown threats.
The solution allows you to set each endpoint device to one of three log levels: no logging, event logging, and debug logging. Logs are saved on the endpoints in the McAfee folder.
Which events are logged?
McAfee Endpoint Security saves several log files on each endpoint device:
Firewall logs are extremely valuable for security analysis, because they contain trails of almost all traffic flowing into and out of your network. If malicious activity is occurring, even if it cannot be detected by known malware or attack signatures, it will be captured by the firewall and can probably be seen by analyzing firewall logs for unusual behavior.
For example, when a zero-day virus infects computers on your network, even if it cannot be detected yet by antivirus software, firewall logs may show unusually high numbers of denied connections, or allowed connections, with suspicious remote hosts. A routine review of firewall logs can discover trojans or rootkits trying to connect to their command and control systems via IRC, over the firewall.
Cisco routers save logs in syslog format, and also allow logs to be viewed by the admin interface. Messages are tagged with message codes—for example, most denied connections have a message code in the 106001 to 106023 range. Most firewall devices do not have local storage space, so logs must be configured to be sent elsewhere—Cisco allows saving logs to a syslog server on the network, via SMTP, via console port, telnet, or several other options.
What log entries are important to analyze?
Check Point routers can save logs in syslog format, and also allow logs to be viewed over an admin interface. Check Point routers maintain a security log which saves events that are deemed to have security significance.
Categories of events saved to security log:
Severity levels in the Check Point security log:
Log management has always been complex, and is becoming more so with the proliferation of network devices, endpoints, microservices and cloud services, and exponentially increasing traffic and data volumes.
In a security environment, next-generation Security Information and Event Management (SIEM) solutions can help manage and extract value from security-relevant log events:
Exabeam is an example of a next-generation SIEM platform that provides these capabilities. It can pull together logs from enterprise systems and security tools and perform the complete log management process, including log collection and aggregation, log processing, log analysis using advanced analytics and UEBA technology, and alerting about security incidents.
If you’d like to see more content like this, visit the Exabeam Information Security Blog
Components, best practices, and next-gen capabilitiesRead More
How SIEMs are built, how they generate insights, and how they are changingRead More
SIEM under the hood - the anatomy of security events and system logsRead More
User and Entity Behavioral Analytics detects threats other tools can’t seeRead More
Beyond alerting and compliance - SIEMs for insider threats, threat hunting and IoTRead More
From correlation rules and attack signatures to automated detection via machine learningRead More
Security Automation and Orchestration (SOAR) - the future of incident responseRead More
A comprehensive guide to the modern SOC - SecOps and next-gen techRead More
Evaluation criteria, build vs. buy, cost considerations and complianceRead More
SIEM Essentials QuizRead More