This three-part blog series demonstrates how the data analytics of a User and Entity Behavior Analytics (UEBA) product works to address cyber threats. In concept, a UEBA system such as Exabeam’s monitors the behaviors of network entities in an enterprise and flags behaviors that deviate from the norm. While the benefits are easy to understand, building such a system poses many challenges. In this series, I’ll focus only on the data analytics part of the system, which has proven to work well in the field for a large number of customers with different environments. Part I covers the statistical analysis in the system. Parts II and III will discuss some machine learning applications.
At Exabeam, data analytics begins with Stateful User Tracking™, which allows us to organize user events into sessions. A session defines where a logical collection of events starts and ends; for example, all events from the time a user logs on to a machine to the time she logs out. The various statistics and counts tracked in the system are organized around the notion of sessions. As such, sessions are the core informational units for learning and scoring, and Exabeam data analytics is built upon them.
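To make the idea of a session concrete, here is a minimal sketch in Python. It is not Exabeam’s implementation: the event tuple layout, the action names, and the function name `build_sessions` are all assumptions made for illustration, and a real system would also have to handle missing logoffs, timeouts, and activity across multiple machines.

```python
from collections import defaultdict

def build_sessions(events):
    """Group (timestamp, user, action) events into per-user sessions.

    Hypothetical rule: a session opens at a 'logon' event and closes at
    the next 'logoff' for the same user. Events seen while the user has
    no open session are ignored in this sketch.
    """
    open_sessions = {}          # user -> events in the currently open session
    closed = defaultdict(list)  # user -> list of completed sessions
    for ts, user, action in sorted(events):
        if action == "logon":
            # A repeated logon simply restarts the session in this sketch.
            open_sessions[user] = [(ts, action)]
        elif user in open_sessions:
            open_sessions[user].append((ts, action))
            if action == "logoff":
                closed[user].append(open_sessions.pop(user))
    return dict(closed)

events = [
    (1, "alice", "logon"),
    (2, "alice", "file_read"),
    (3, "bob", "logon"),
    (4, "alice", "logoff"),
    (5, "bob", "usb_copy"),
    (6, "bob", "logoff"),
    (7, "alice", "logon"),
    (8, "alice", "logoff"),
]
sessions = build_sessions(events)
for user, user_sessions in sessions.items():
    print(user, len(user_sessions), "session(s)")
```

Once events are grouped this way, per-session counts and statistics can be computed and scored one session at a time.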
A well-tuned statistical analysis system is at the heart of the current anomaly detection product. Our security researchers have defined a collection of more than a hundred statistical indicators for users, assets, peer groups, applications, network locations, and more. An anomaly is triggered based on a statistical model and is given an expert-assigned risk score, which encodes critically important security knowledge. Without this encoded knowledge, any purely anomaly-based detection system built on unsupervised learning will suffer from a high false positive rate, rendering it impractical for field deployment. Combining expert knowledge with data analytics is particularly advantageous because the result is intuitive and easy to use for analysts of all levels. Neither a purely expert-driven system nor a purely data-driven one, this hybrid method has proven to work well in production.
The statistical modeling starts by profiling network entities’ historical activities. An example of a profile could be a user’s login counts to a set of devices, or the volume of bytes a user has copied to a USB device. One of Exabeam’s outlier analysis tools uses the p-value from statistical hypothesis testing to flag whether the current activity is an anomaly. If so, the alert from this particular anomaly is weighted by an expert-assigned score. Sessions with the highest scores are presented to analysts.
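The following Python sketch shows one plausible way to turn a categorical profile into a p-value and a session score. It is an illustration under stated assumptions, not the product’s actual model: here the p-value of an observed value is taken to be the share of historical observations that are at least as rare as that value, and the names `p_value`, `score_session`, `risk_score`, and `alpha`, along with the sample profile, are hypothetical.

```python
from collections import Counter

def p_value(profile, value):
    """Share of historical observations at least as rare as `value`.

    `profile` maps each category (e.g. an asset name) to its historical
    count. A value never seen before gets a p-value of 0.
    """
    total = sum(profile.values())
    if total == 0:
        return 1.0  # assumption: with no history, nothing is flagged
    observed = profile.get(value, 0)
    tail = sum(count for count in profile.values() if count <= observed)
    return tail / total

def score_session(profile, accessed_assets, risk_score, alpha=0.05):
    """Add the expert-assigned `risk_score` once per anomalous asset."""
    return sum(
        risk_score
        for asset in set(accessed_assets)
        if p_value(profile, asset) < alpha
    )

# Hypothetical login-count profile for one user.
profile = Counter({"laptop-01": 80, "fileserver": 15, "build-host": 4, "db-admin": 1})

session_assets = ["laptop-01", "db-admin", "new-host"]
score = score_session(profile, session_assets, risk_score=10)
print(score)
```

In this sketch the familiar laptop contributes nothing, while the rarely used and never-before-seen assets each add the expert-assigned score to the session total.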
The data profiled can be either categorical or numerical in nature. An example of categorical data is the login count for each asset to which a user has connected. An example of continuous numerical data is the number of bytes transferred from a device. Profiling continuous numerical data is non-trivial to implement. One of the implementations we use groups the numerical data points into a dynamic histogram with distinct clusters, or bins.
To dynamically construct a histogram of numerical data, we use an unsupervised clustering algorithm. It starts by placing each point in its own group, then iteratively merges the two closest groups until convergence. The criterion for evaluating clustering quality is the silhouette coefficient, which measures consistency within clusters of data. This clustering step must be repeated periodically to adapt to new data and ensure the fidelity of the clusters.
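The clustering step can be sketched in a few lines of Python. This is a simplified illustration rather than the production algorithm: it assumes one-dimensional data, measures the distance between groups by their centroids, and, instead of a formal convergence test, keeps whichever merge level gave the highest mean silhouette coefficient. The function names and the sample values are hypothetical.

```python
from statistics import mean

def silhouette(clusters):
    """Mean silhouette coefficient over all points (1-D, absolute distance)."""
    if len(clusters) < 2:
        return -1.0
    scores = []
    for i, own in enumerate(clusters):
        for x in own:
            if len(own) == 1:
                scores.append(0.0)  # usual convention for singleton clusters
                continue
            # a: mean distance to the other points in the same cluster
            a = sum(abs(x - y) for y in own) / (len(own) - 1)
            # b: mean distance to the points of the nearest other cluster
            b = min(
                mean([abs(x - y) for y in other])
                for j, other in enumerate(clusters)
                if j != i
            )
            scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return mean(scores)

def build_bins(values):
    """Agglomerative clustering of 1-D values; returns (bins, silhouette)."""
    clusters = [[v] for v in sorted(values)]
    best, best_score = [list(c) for c in clusters], silhouette(clusters)
    while len(clusters) > 2:
        # In one dimension the two closest centroids are always adjacent.
        i = min(
            range(len(clusters) - 1),
            key=lambda k: mean(clusters[k + 1]) - mean(clusters[k]),
        )
        clusters[i:i + 2] = [clusters[i] + clusters[i + 1]]
        score = silhouette(clusters)
        if score > best_score:
            best, best_score = [list(c) for c in clusters], score
    return best, best_score

# Hypothetical transfer volumes (in MB) observed for one device.
volumes = [10, 12, 11, 500, 510, 505, 9000, 9100, 9050]
bins, quality = build_bins(volumes)
print(bins, round(quality, 3))
```

Each resulting bin can then be treated like a category in the histogram, so a new transfer volume is scored by how rarely its bin has been seen. Rerunning `build_bins` on fresh data is the periodic re-clustering step described above.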
Selecting the right analytics tool for activity profiling is only part of the equation. A challenge that comes even before the use of analytics is knowing, among the endless possibilities, which features to engineer and which statistical entities to track and compute. The design and choice of the right statistical indicators build upon the many years of experience our in-house security experts have gathered. The next challenge is implementing analytics that scale, particularly when algorithms need to learn and score in real time. With hundreds of thousands of events streaming into the system every second, moving and shuffling the events to compute the statistics and counts while staying within memory constraints is a real platform engineering challenge, one that requires a space-time tradeoff. A description of the software engineering methods involved in implementing this and other similar algorithms in the Exabeam product is beyond the scope of this article.
In Part II of this blog series, I’ll talk about how Exabeam uses machine learning to derive contextual information to aid user behavior analytics.
You can also learn more here: https://www.exabeam.com/product/applications/