data science

Sharpening First-Time Access Alert for Insider Threat Detection

Residents participating in a neighborhood crime watch look out for signs of suspicious activity. A new car parked on the street is probably the first thing to register in a resident’s mind. Other hints, such as the time of day, what the driver carries, or how the driver loiters, all add up before one decides to call the police. A User Behavior Analytics (UBA) system works much the same way, with various statistical indicators jointly working[…]

Topics: data science

Anomalous User Activity Detection in Enterprise Multi-Source Logs

Network users’ activities generate events every day. Logged events collected from multiple sources are valuable for user activity profiling and anomaly detection. A good analytics use case for insider threat detection is to see whether a user’s collection of events today is anomalous relative to her historical daily collections of events. In an earlier blog, I highlighted a method to address this use case that leverages distributed computing built on HDFS and Apache Spark. In this[…]

Topics: data science

Account Resolution via Market Basket Analysis

Machine learning and statistical analysis have many practical applications in the detection of malicious users and entities as part of User & Entity Behavior Analytics (UEBA) solutions. Threat detection typically garners the attention; this is as true on the show floor of security conferences as it is in the text of marketing material. Equally important, although less often mentioned, is the application of machine learning to context estimation. Contextual information such as whether the machine is a[…]

Topics: data science

User Behavior Anomaly Detection Meets Distributed Computing

User and Entity Behavior Analytics (UEBA) analyzes log data from different sources in order to find anomalies in users’ or entities’ behaviors. Depending on the size of the enterprise and the log sources available, data feeds can range from tens of gigabytes to terabytes a day. Typically, we need 30 days of data, if not more, to build proper behavior profiles. This calls for an analytics platform that is capable of ingesting and processing this volume of data. In this blog, I[…]

Topics: data science

Too Many Alerts… Just Give Me the Interesting Ones!

Security analysts often wrestle with the high volume of alerts generated by security systems and, much like the cries of the protagonist in The Boy Who Cried Wolf, many alerts end up being ignored. Human analysts quickly learn to screen out repeated alerts as false positives, which lets them focus their finite time on the interesting ones, where it matters most. A natural question, then, is whether we can[…]

Topics: data science, SECURITY

Ransomworm: Don’t Cry – Act.

In July last year, we released our research report on the Anatomy of a Ransomware attack, in which we looked into both the financial model of ransomware and how to detect an attack as it unfolds. Given the recent WannaCry ransomware craze, we think it’s time to revisit the topic. When we addressed ransomware last year, we made a significant comment about the ever-evolving nature of malicious software. We predicted that in the near future (evidently now) ransomware will move[…]

Topics: data science, ransomware, SECURITY, SIEM, Uncategorized

A Machine Learning Study on Phishing URL Detection

Many network attacks start with a link to a phishing URL. A carefully crafted email containing the malicious link is sent to an unsuspecting employee. Once he or she clicks on or responds to the phishing URL, the cycle of information loss and damage begins. It would then seem highly desirable to nip the problem early by identifying and alerting on these malicious links. In this blog, I’ll share some research notes on[…]

Topics: data science, SECURITY

First-time Access to an Asset - Is it Risky or Not?: A Machine Learning Question

Looking for outliers, or anything that differs from the baseline, is a typical detection strategy in user and entity behavior analytics (UEBA). One example is a user’s first-time access to an asset such as a server, a device, or an application. The logic is sound and is often used in the press as an example of behavior-based analytics. However, it is an open secret among analytics practitioners that alerts of this type have a high[…]

Topics: data science, SECURITY

The World Has Changed; Shouldn’t Your Security Change, Too?

From day one, Exabeam had a vision for something better than today’s SIEM solutions. We felt these products were fundamentally broken: SIEM log management was built on old, proprietary technology and was (over)priced by the byte; SIEM correlation rules were messy and ineffective, and they caused more work for analysts than they eliminated. SIEM was broken, and the opportunity to make something massively better was clear. Our first step was to win the UEBA[…]

Topics: CUSTOMERS, data science, SECURITY

A User and Entity Behavior Analytics Scoring System Explained

Risk assessment in UEBA (user and entity behavior analytics) works much like the way humans assess risk in their surroundings. In an unfamiliar setting, our brain constantly takes in data about objects, sound, temperature, etc. and weighs the different sensory evidence against previously learned patterns to determine whether a risk is present and, if so, what it is. A UEBA system works in a similar manner. Data from different log sources, such as Windows AD, VPN, database,[…]

Topics: data science, SECURITY
2017