InfoSec Trends › MITRE Publishes Domain Generation Algorithm T1483 in the ATT&CK Framework

MITRE Publishes Domain Generation Algorithm T1483 in the ATT&CK Framework

Published
May 02, 2019

Reading time
4 mins

MITRE released a significant update to the MITRE ATT&CK framework which included several new and updated techniques to mitigate cybersecurity threats. One of those techniques, domain generation algorithms (DGA) was submitted by our research team. We are excited to be contributors and will explain this technique in more detail.

Domain Generation Algorithm (T1483)

Introduction

The domain generation algorithm has remained a main source of communication for malware in the past 10 years. DGAs are designed to generate quick random seeds such as dictionary words, DWORD values, random digits, gibberish strings (hcbhjbdjbjhsb.ru) as domains which can be used to provide instructions for malware to exfiltrate data, provide updates and execute commands on a system remotely. Earlier families of malware used a static list of IP/domains which were eventually blocked by defenders. Attackers now write sophisticated DGA codes to circumvent defenders and draft thousands of DGAs, of which only a few have true instructions for command and control, to make their connection persist over the long term and stay resilient against enforcement actions. Malware like Kraken, Conficker, Murofet and Chopstick showcase DGAs whose attributes vary from date dependent, static and dynamic seeds.

Figure 1. An example of how DGA exfiltrates data from a target
Detection

Detecting dynamically generated domains can be challenging due to the rapid rotation of DGA seeds, constantly evolving malware families, sinkhole awareness, and the complexity of the DGA algorithm.

This makes signature-based detection to these signals irrelevant and requires a machine learning approach to detect those efficiently.

Using machine learning, DGA becomes a solvable problem. Some may be familiar with n-grams from natural language processing, where analysts count the frequency of how often words follow each other in normal speech or writing. Similarly, n-grams can be used to analyze the words in a domain name. If the words in a suspected domain name never follow each other in common use on the internet, they then have a high probability of being random.

One approach is to segment the domains into substrings with the size of “n”. Each substring of length n is called a gram. The larger the value of n, the smaller the number of substrings and vice versa as can be seen in the figure below. According to research 3, 4, 5 have the best accuracy when predicting randomness. For example, a 3-gram approach for word “youarepwned” would be you, oua, uar, epw, pwn, wne, ned. We, therefore, test the substrings (you, uar, epw) for randomness by excluding the top 1 million ranked domains such as the Alexa top million websites for example. We then check domains with high randomness against CDN whitelists and if the domain is not present, we check for a threshold value. If the level of randomness is higher than the threshold, it is deemed as a DGA-generated domain. This helps us increase the chance of detection and predict the range of random domains that may be generated.

Figure 2. An example of 2-gram and 3-gram breakdowns for “youarepwned” showing predictable randomness in domain generation

We apply the same method for second- and third-level domains to detect the occurrence of the word, for example, xuxu(dot)youarepwned(dot)net). Our behavior-based analytics approach mixed with deep learning helps detect DGA before it causes any damage. In a previous post, our Chief Data Scientist Derek Lin discusses behavioral modeling for DGAs using a random forest model. For example, in the case of ransomware, the attacker will encrypt the files and request encryption keys and send sensitive data. Any malicious request in addition to a random DGA will provide substantial evidence for abnormal activity that can be further investigated. Inputs include logs that contain domain access information, such as DNS NX domain logs, DNS request/response logs, WHOIS information, passive DNS traffic, proxy traffic and EDR activity.

Figure 3: Exabeam Advanced Analytics detects a likely DGA domain krbsectyxfpxsofe.ru.Conclusion

Signature-based detection is not considered an effective measure against DGAs because of the rapid changes in the algorithm. In addition to techniques like entropy change, frequency analysis and Markov chains Exabeam provides extensive detection techniques for behavior analytics using n-gram and machine learning. With the evolution of DGA techniques, it will be challenging to predict an adversary’s action and underscores the need to better prepare ourselves by sharing intelligence and working collaboratively.

Cookie	Duration	Description
cookielawinfo-checbox-analytics	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics".
cookielawinfo-checbox-functional	11 months	The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".
cookielawinfo-checbox-others	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other.
cookielawinfo-checkbox-necessary	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary".
cookielawinfo-checkbox-performance	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance".
viewed_cookie_policy	11 months	The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data.

r-tec DE

Featured Data Sheet

Exabeam Fusion

Featured Solution Brief

Exabeam Fusion on Google Cloud

Featured Resource

Exabeam Fusion on Google Cloud

r-tec DE

r-tec DE

Featured Data Sheet

Exabeam Fusion

Featured Solution Brief

Exabeam Fusion on Google Cloud

Featured Resource

Exabeam Fusion on Google Cloud

r-tec DE

MITRE Publishes Domain Generation Algorithm T1483 in the ATT&CK Framework

Domain Generation Algorithm (T1483)

Further reading

Similar Posts

Recent Posts

Stay Informed

See a world-class SIEM solution in action