Next-Gen UEBA for Reduced Complexity and Better Accuracy - Exabeam

April 13, 2022



Before today’s user and entity behavior analytics (UEBA), security information and event management (SIEM) analytics was about designing rules: fact-based rules and correlation rules. These rigid, static, deterministic rules evaluate events at a single point in time and offer little protection against insider threats because they do not consider the dynamic behaviors of users and entities. Since 2014, when Gartner first coined the term “UEBA”, rules have evolved to incorporate the concept of normal and abnormal behavior. It works like this: the system builds behavior profiles, or states, from the historical activities of users and entities. A rule can then trigger when the current behavior does not match the profile. For example, a behavior profile may be a histogram that tracks login counts for each country the user has logged in from. A login event from a country that is never or rarely seen in the profile triggers an alert.

Histograms with different data types.
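
The country-histogram example can be sketched in a few lines of Python. This is a minimal illustration, not any product's implementation: the class name `CountryProfile`, the maturity threshold `min_events`, and the rarity cutoff `rare_fraction` are all assumptions made for the example.

```python
from collections import Counter


class CountryProfile:
    """Histogram of past login counts per country for one user (hypothetical)."""

    def __init__(self, min_events=20, rare_fraction=0.02):
        self.counts = Counter()
        self.min_events = min_events        # assumed: events needed before the profile is "mature"
        self.rare_fraction = rare_fraction  # assumed: share below which a country counts as rare

    def update(self, country):
        """Record one historical login from the given country."""
        self.counts[country] += 1

    def is_anomalous(self, country):
        """Return True when the country is never or rarely seen in a mature profile."""
        total = sum(self.counts.values())
        if total < self.min_events:
            return False  # not enough history yet, so do not alert
        return self.counts[country] / total < self.rare_fraction


profile = CountryProfile()
for country in ["US"] * 95 + ["CA"] * 4 + ["DE"] * 1:
    profile.update(country)

print(profile.is_anomalous("US"))  # False: the usual country
print(profile.is_anomalous("DE"))  # True: rarely seen (1 of 100 logins)
print(profile.is_anomalous("BR"))  # True: never seen
```

Note that even this tiny rule already carries two tunable parameters, which is the kind of knob discussed next.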

State-of-the-art UEBA relies on a large collection of behavior-based rules to detect malicious activities. This is all good. However, these UEBA systems have many knobs and switches to control. The onus is on the user to design, tune, and maintain the rule content. Detection accuracy and usability suffer when the rules are not configured and tuned optimally.

Where do we go from here? If history is a guide, UEBA will follow other industries, such as retail and banking, that have similarly evolved beyond rules. It will rely more on machine learning, which minimizes the human touch and makes the system simpler to use and maintain while achieving better accuracy.

Challenges with today’s UEBA

Some of the usability and accuracy challenges of today’s UEBA:

  • Whether fact, correlation, or behavior-based rules, each is individually designed and configured. Each rule has its own set of carefully crafted conditions. This means any nuanced variation from a base rule would require the creation of a new rule. I’ll show examples shortly. This results in complex rule configuration, uncontrolled rule growth, and an ever-expanding, difficult-to-maintain rule library.
  • Rules — particularly those relating to behavior modeling — have many parameters to tune. Examples are the all-important risk score an analyst must manually assign when a rule triggers, as well as the convergence criteria controlling when the behavior profiles are deemed mature. Tuning these rules requires expertise. When not tuned correctly, false positives inevitably ensue.
  • Assigned rule risk scores are typically summed linearly to make up a session-level score for prioritization and presentation. Over a fixed set of log sources, session scores tend to inflate as more rules are added over time, because each additional rule that triggers adds its score to the sum. The result is an ever-growing number of high-scoring sessions presented to the user.
  • Rule-based systems cannot learn from the case management data. Valuable user-labeled false negative and true positive events are not leveraged to further improve the system.
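
The score inflation described above is simple arithmetic. The sketch below uses made-up rule names and scores purely for illustration: the same session is scored against a rule library before and after a more nuanced variant of an existing rule is added.

```python
def session_score(triggered_rules, rule_scores):
    """Linear sum of the scores of every rule that triggered in the session."""
    return sum(rule_scores[rule] for rule in triggered_rules)


# Hypothetical rule library, version 1.
rule_scores_v1 = {"first_access": 5, "remote_first_peer_access": 7}

# Version 2 adds a nuanced variant of "first_access" for service accounts.
rule_scores_v2 = {**rule_scores_v1, "first_access_service_account": 10}

# The underlying events are identical; in v2 the new variant also fires.
triggered_v1 = ["first_access", "remote_first_peer_access"]
triggered_v2 = triggered_v1 + ["first_access_service_account"]

print(session_score(triggered_v1, rule_scores_v1))  # 12
print(session_score(triggered_v2, rule_scores_v2))  # 22
```

Nothing about the session changed, yet its score nearly doubled because the library grew.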

Where do we go from here? How do we continue to leverage behavior-based anomalies for threat detection without the complexity and false positives that come with rule-based systems?

Next-gen UEBA 

Enter the next-gen UEBA. The aim now is to eliminate, as much as possible, the requisite human effort in rule design and parameter tuning, while making the system more accurate, the risk content configuration simpler, and the overall experience more friendly. Let’s sketch out how the next-gen UEBA streamlines the process and therefore addresses the above challenges.

Reducing the content design complexity

When an analyst designs what anomalous behaviors to capture in events, they first think of the indicators. For example, these are possible indicators to consider for malicious or anomalous user-to-asset access:

  A. Is this the first time the user accesses the asset?
  B. Is this the first time the user’s peer group accesses the asset?
  C. Is the user a service account?
  D. Is the asset privileged?
  E. Is this a remote access event?

They then decide what behavior scenarios to define out of these indicators, and turn each scenario into a rule. For example:

  • When A is true, trigger with a score of 5
  • When B is true and E is true, trigger with a score of 7
  • When A is true and C is true, trigger with a score of 10
  • When A is true, B is true, and D is true, trigger with a score of 40

Many behavior scenarios can be formed by combining the indicators. Even these five yes/no indicators allow up to 2^5 = 32 distinct combinations, so there is potentially a large number of rules, one for each scenario.

The configuration complexity of manually enumerating all the combinations of indicators and scores is difficult to manage. This raises the question: why not let the machine manage the combinations? The next-gen UEBA allows the user to focus on designing a small set of core indicators and lets the machine automatically construct their combinations.
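
As a rough sketch of what "letting the machine construct the combinations" could look like, the snippet below enumerates every true/false assignment of five indicators. The labels A through E and the function names are assumptions for illustration; the article does not describe how a real system builds or stores the combinations.

```python
from itertools import product

# Assumed labels for the five example indicators.
INDICATORS = ["A", "B", "C", "D", "E"]


def all_scenarios(indicators):
    """Yield every true/false assignment of the indicators as a dict."""
    for values in product([False, True], repeat=len(indicators)):
        yield dict(zip(indicators, values))


def scenario_key(event_flags, indicators=INDICATORS):
    """Map an event's indicator flags to a hashable scenario key."""
    return tuple(bool(event_flags.get(name, False)) for name in indicators)


scenarios = list(all_scenarios(INDICATORS))
print(len(scenarios))  # 32 scenarios from 5 indicators

# An event where A, B, and D are true maps to exactly one scenario.
print(scenario_key({"A": True, "B": True, "D": True}))
# (True, True, False, True, False)
```

The user maintains five indicator definitions; the 32 scenarios are derived rather than written by hand.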

The first benefit is an immediate reduction of boilerplate in the configuration, as there is no more copying and pasting of one rule to create another, more nuanced rule. Simpler configuration means fewer potential configuration errors.

The second benefit is better risk coverage. The machine does a more thorough job than humans can in assembling all scenarios from these indicators. Just like building with Lego® pieces, complex event risk scenarios are formed from elementary indicators, covering at least the range of rules that a human user could define.

Eliminating manual score assignment 

In a rule-based system, the burden is on the user to set the rule scores. When assigning a rule score, the user must carefully weigh the criticality of the rule, guess the false positive rate the rule is likely to generate in production, and gauge how the score should differ from those of other rules. This mental process requires experience and expertise. Setting the scores too high or too low for individual rules undermines the system’s accuracy.

Can we let machines automatically assign a score to an event when indicator A is true, B is true, but D is false? Yes. This is where the machine learning of the next-gen UEBA shines. The score is learned from the historical data. If the combination of indicators occurred rarely in the past, it gets a higher risk score; if it occurred often, it gets a lower one. This is machine learning at work in scoring the outlying event. It takes the guesswork out of manual score assignment. The learning is continuous, so the same combination of indicators may have a different risk score later. Compared to manual score assignment, self-learned risk scores are adaptive and more accurate in reflecting the degree of anomaly.
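
One simple way to realize "rarer combinations score higher" is to score each combination by the negative log of its smoothed historical frequency. This is only a sketch under that assumption; the article does not specify the scoring model, and the class name `RarityScorer` and the add-one smoothing are choices made for the example.

```python
import math
from collections import Counter


class RarityScorer:
    """Scores an indicator combination by how rarely it occurred in history."""

    def __init__(self):
        self.counts = Counter()
        self.total = 0

    def observe(self, combo):
        """Update the history with one more occurrence of the combination."""
        self.counts[combo] += 1
        self.total += 1

    def score(self, combo):
        """Higher score for rarer combinations (assumed: -log of smoothed frequency)."""
        # Add-one smoothing keeps the score finite for never-seen combinations.
        frequency = (self.counts[combo] + 1) / (self.total + 1)
        return -math.log(frequency)


# Keys are (A, B, D) flags; the history here is synthetic.
common = (False, False, False)
rare = (True, True, False)

scorer = RarityScorer()
for _ in range(990):
    scorer.observe(common)
for _ in range(10):
    scorer.observe(rare)

print(scorer.score(rare) > scorer.score(common))  # True

# Learning is continuous: once the combination becomes common, its score drops.
before = scorer.score(rare)
for _ in range(100):
    scorer.observe(rare)
print(scorer.score(rare) < before)  # True
```

No analyst assigned a number to either combination; both scores follow from the observed data and shift as the data shifts.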

Better accuracy

Next-gen UEBA strives for better detection. There is no silver bullet; it takes a multi-pronged approach. First, event-level risk must have a high signal-to-noise ratio. This means taking care in designing anomaly indicators and behavior profiles. For example, what constitutes an “abnormal” behavior? Is it better to measure the abnormality of a behavior by the number of times it was seen, or by how long ago it was last seen? Different choices give different signal-to-noise ratios. Quality scoring at the event level provides a good foundation for downstream analytics.
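
To make the count-versus-recency question concrete, here is a sketch of two candidate abnormality measures. Both formulas, and the 30-day half-life, are illustrative assumptions rather than anything prescribed by the article.

```python
def count_based(times_seen, total_events):
    """Abnormality from frequency: behaviors seen less often score closer to 1."""
    if total_events == 0:
        return 0.0
    return 1.0 - times_seen / total_events


def recency_based(days_since_last_seen, half_life_days=30.0):
    """Abnormality from recency: behaviors not seen for a long time score closer to 1."""
    return 1.0 - 0.5 ** (days_since_last_seen / half_life_days)


# Behavior X: seen only twice in 1,000 events, but the last time was yesterday.
x_count, x_recency = count_based(2, 1000), recency_based(1)

# Behavior Y: seen 50 times in 1,000 events, but not for 300 days.
y_count, y_recency = count_based(50, 1000), recency_based(300)

print(x_count > y_count)      # True: by count, X looks more abnormal
print(y_recency > x_recency)  # True: by recency, Y looks more abnormal
```

The two measures rank the same pair of behaviors in opposite order, which is why the choice affects the signal-to-noise ratio.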

Second, the next-gen UEBA is flexible in aggregating events for threat presentation. Malicious activity is rarely confined to a single anomalous event or a fixed time boundary; it usually spans multiple events. A graph-based approach provides a natural means to stitch risky events together across time, users, and devices.
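
A minimal sketch of graph-based stitching: treat events as nodes, link two events when they share a user or a device, and present each connected component as one group. The event fields, the sample data, and the linking rule are assumptions for illustration; a real system would presumably also weigh time gaps and risk when deciding which events to connect.

```python
from collections import defaultdict

# Synthetic risky events (field names are hypothetical).
events = [
    {"id": 1, "user": "alice", "device": "laptop-1", "risk": 3.0},
    {"id": 2, "user": "alice", "device": "srv-db", "risk": 4.0},
    {"id": 3, "user": "svc-backup", "device": "srv-db", "risk": 5.0},
    {"id": 4, "user": "bob", "device": "laptop-9", "risk": 2.0},
]


def stitch(events):
    """Group event ids into connected components linked by shared user or device."""
    by_entity = defaultdict(list)  # entity -> ids of events that involve it
    for event in events:
        by_entity[("user", event["user"])].append(event["id"])
        by_entity[("device", event["device"])].append(event["id"])

    by_id = {event["id"]: event for event in events}
    seen, groups = set(), []
    for event in events:
        if event["id"] in seen:
            continue
        # Depth-first traversal from this event through shared entities.
        stack, group = [event["id"]], []
        seen.add(event["id"])
        while stack:
            current = by_id[stack.pop()]
            group.append(current["id"])
            for entity in (("user", current["user"]), ("device", current["device"])):
                for neighbor in by_entity[entity]:
                    if neighbor not in seen:
                        seen.add(neighbor)
                        stack.append(neighbor)
        groups.append(sorted(group))
    return groups


print(stitch(events))  # [[1, 2, 3], [4]]
```

Events 1 and 3 share neither a user nor a device, yet they land in the same group because event 2 bridges them through `alice` and `srv-db`.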

The future evolution of UEBA

The current state of UEBA faces many challenges. Conventional rules are complex to maintain and difficult to tune. Less complexity, reduced tuning, and better accuracy are desired. Machine learning can take UEBA to the next level. But the road doesn’t end here. Risk systems always evolve, and more exciting UEBA technologies lie ahead.
