Security Event Management Basics (What We’ve Seen Companies Are Not Doing)
If you are reading this blog, I am probably preaching to the choir when I say that monitoring logs is an extremely valuable and cost-effective way to identify security issues in the network. What is not so obvious, however, is that purchasing a SIEM/Log Management solution is the first step, not the last, in extracting the value hidden in these logs. There are several steps that must be taken, and constantly reviewed, to ensure that you will not be looking the wrong way when a breach occurs.
What to Log
First and foremost is the question of what to log. Since it is not possible to log everything on all servers, you should try to log the minimal set of events that will provide the most comprehensive visibility of security-related activities on the network.
Easier said than done, but fortunately, since the release of Windows 2008, Microsoft has enabled granular control of what to log, allowing you to focus only on the events of interest and to ignore much of the noise. In addition, the Audit Policy can be controlled via Group Policy Objects, so in theory you can apply these policies to all servers in the domain from a central location.
I recommend setting the following audit policies on Windows machines. ‘Critical Systems’ refers to a small number of systems that store or process very sensitive information, such as point-of-sale (POS) systems or the databases that store sensitive customer information.
|Audit Category||Subcategory*||Domain Controllers||Critical Systems||Member Servers and Workstations|
|Account Logon||Credential Validation||Success and Failure||Success and Failure||–|
|Account Logon||Kerberos Authentication Service||Success and Failure||Success and Failure||–|
|Account Logon||Kerberos Service Ticket Operations||Success and Failure||Success and Failure||–|
|Account Management||Security Group Management||Success and Failure||Success and Failure||Success and Failure|
|Account Management||User Account Management||Success and Failure||Success and Failure||Success and Failure|
|Policy Change||Audit Policy Change||Success and Failure||Success and Failure||Success and Failure|
|Optional (based on capacity)|
|Process Tracking||Process Creation||Success||Success||–|
|Logon/Logoff||Other Logon/Logoff Events||Success||Success||Success|
|Object Access||Filtering Platform Connection||Success||Success||–|
|Object Access||Detailed File Share||–||Success||–|
|System||Security System Extension||Success||Success||Success|
* In Windows 2003 systems, the entire category should be enabled.
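As a rough illustration of how the table above could be turned into something repeatable, here is a minimal Python sketch that generates `auditpol /set` command lines for a given class of machine. The tier names (`domain_controller`, `critical_system`, `member`), the `POLICY_ROWS` structure, and the `auditpol_commands` helper are hypothetical names invented for this example, and only a subset of the table is included. In a real domain you would more likely push these settings through Group Policy, as mentioned above.

```python
# Subset of the audit policy table, one row per subcategory.
# "SF" = Success and Failure, "S" = Success only, None = not audited.
# Columns: subcategory, domain controllers, critical systems, member servers.
POLICY_ROWS = [
    ("Credential Validation", "SF", "SF", None),
    ("Kerberos Authentication Service", "SF", "SF", None),
    ("Kerberos Service Ticket Operations", "SF", "SF", None),
    ("Security Group Management", "SF", "SF", "SF"),
    ("User Account Management", "SF", "SF", "SF"),
    ("Audit Policy Change", "SF", "SF", "SF"),
    ("Process Creation", "S", "S", None),  # optional, based on capacity
]

# Hypothetical tier names mapped to their column in POLICY_ROWS.
TIERS = {"domain_controller": 1, "critical_system": 2, "member": 3}


def auditpol_commands(tier: str) -> list[str]:
    """Build auditpol command lines for one machine tier.

    The commands are only returned as strings; nothing is executed here.
    """
    column = TIERS[tier]
    commands = []
    for row in POLICY_ROWS:
        setting = row[column]
        if setting is None:
            continue  # leave subcategories we do not audit untouched
        flags = ["/success:enable"]
        if "F" in setting:
            flags.append("/failure:enable")
        commands.append(
            f'auditpol /set /subcategory:"{row[0]}" ' + " ".join(flags)
        )
    return commands


if __name__ == "__main__":
    for line in auditpol_commands("member"):
        print(line)
```

Keeping the policy as data rather than as a hand-maintained list of commands makes it easier to review the settings periodically and to spot drift between tiers.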
Now that the logging policy is established, it is important to ensure you are getting these events in a timely manner. It does not do much good if the event that reveals a breach is analyzed only after the damage has already been done.
I have seen security operations centers (SOCs) in which events from critical domain controllers arrived at the SIEM up to 5 days after they were logged by the machine. Not only does this eliminate the possibility of any timely incident response, it also prevents analyzing the events that do arrive on time in context with the late-coming events, and vice versa. This context is sometimes the only indication of whether the events you are seeing represent a true or false alarm.
This is one of the things a CISO should care about most, since it is a direct indication of the SOC’s capability to alert and respond. The delay for events from Domain Controllers and critical systems should be measured in seconds, while the delay for all other systems should be no more than a few minutes.
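One way to keep an eye on this is to compare the time an event was logged with the time the SIEM received it. The sketch below is a minimal illustration, not a feature of any particular product: the tier names and the exact thresholds (30 seconds and 5 minutes) are assumptions chosen to match "seconds" and "a few minutes", and `ingestion_delay` and `is_late` are hypothetical helpers.

```python
from datetime import datetime, timedelta, timezone

# Assumed thresholds; tune them to your own environment.
MAX_DELAY = {
    "domain_controller": timedelta(seconds=30),
    "critical_system": timedelta(seconds=30),
    "other": timedelta(minutes=5),
}


def ingestion_delay(logged_at: datetime, received_at: datetime) -> timedelta:
    """Time between the source logging an event and the SIEM receiving it."""
    if logged_at.tzinfo is None or received_at.tzinfo is None:
        # Naive timestamps cannot be compared safely across time zones.
        raise ValueError("timestamps must be timezone-aware")
    return received_at - logged_at


def is_late(tier: str, logged_at: datetime, received_at: datetime) -> bool:
    """True if the event arrived later than the threshold for its tier."""
    return ingestion_delay(logged_at, received_at) > MAX_DELAY[tier]


if __name__ == "__main__":
    logged = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
    received = logged + timedelta(minutes=2)
    print(is_late("domain_controller", logged, received))  # two minutes is too slow here
    print(is_late("other", logged, received))
```

Tracking this delay per source over time gives a simple measure of whether the SOC can actually respond while an incident is still in progress.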
Another time-related issue is time zone synchronization. This is probably the most difficult issue to wrap one’s mind around because there are so many factors to consider. It is not uncommon to see events that happen in the “future” or to be unable to determine which of two events happened first.
When one event is logged at 3pm in Hong Kong and another at 3pm in California, the two events actually happened 15 hours apart (16 hours when California is not observing daylight saving time). This difference has to be taken into account when the events are analyzed; otherwise it may lead to wrong conclusions. Event delays, as described above, make this task even more difficult. The SIEM has to be aware of the time zone in which each event was logged and adjust the time accordingly.
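To make the arithmetic concrete, here is a small Python sketch that converts both local timestamps to UTC before comparing them. The fixed offsets and the `to_utc` helper are assumptions for illustration only; a real pipeline would normally use full time zone names so that daylight saving changes are handled automatically. The 15-hour figure corresponds to California on daylight time (UTC-7); on standard time (UTC-8) the gap is 16 hours.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets, assumed for illustration.
HKT = timezone(timedelta(hours=8), "HKT")    # Hong Kong, no daylight saving
PDT = timezone(timedelta(hours=-7), "PDT")   # California, daylight time
PST = timezone(timedelta(hours=-8), "PST")   # California, standard time


def to_utc(local_time: datetime, source_tz: timezone) -> datetime:
    """Attach the source time zone to a naive local timestamp, then convert to UTC."""
    return local_time.replace(tzinfo=source_tz).astimezone(timezone.utc)


if __name__ == "__main__":
    three_pm = datetime(2024, 7, 1, 15, 0)
    hong_kong = to_utc(three_pm, HKT)
    california = to_utc(three_pm, PDT)
    # Both events say "3pm", but in UTC the Hong Kong event comes first.
    print(hong_kong.isoformat())
    print(california.isoformat())
    print(california - hong_kong)
```

Once every event carries a UTC timestamp, ordering two events is a plain comparison, and events from the "future" usually point to a source whose time zone or clock is misconfigured.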
Aggregation and Parsing
Many SIEM and log management solutions collect raw events from devices using intermediary software (usually known as collectors, connectors, or agents), which parses the important information from the raw logs and/or combines multiple events that contain the same information into a single aggregated event.
While these agents can be powerful tools that improve performance, they can also obfuscate important information, and should be used carefully. For example, aggregation may discard important values in the events and obscure the order in which they were logged. Similarly, parsing, if not done properly, can discard important information in events.
It is important to ensure that agents do not discard important information and that you have a good relationship with your vendor in case the agents require amendments.
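The following Python sketch illustrates the aggregation risk described above. It is a toy model, not how any particular vendor's agent works: the `Event` fields and both aggregation functions are hypothetical. The first function keeps only a count per user and action, while the second also keeps the first and last timestamps and the distinct source addresses, so the timing and origin of the events can still be reconstructed.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    timestamp: int   # seconds since the epoch
    user: str
    action: str
    source_ip: str


def aggregate_lossy(events):
    """Count events per (user, action); timestamps and source IPs are discarded."""
    counts = defaultdict(int)
    for event in events:
        counts[(event.user, event.action)] += 1
    return dict(counts)


def aggregate_preserving(events):
    """Count events per (user, action) but keep the time range and source IPs."""
    groups = {}
    for event in sorted(events, key=lambda e: e.timestamp):
        group = groups.setdefault(
            (event.user, event.action),
            {
                "count": 0,
                "first_seen": event.timestamp,
                "last_seen": event.timestamp,
                "source_ips": [],
            },
        )
        group["count"] += 1
        group["last_seen"] = event.timestamp
        if event.source_ip not in group["source_ips"]:
            group["source_ips"].append(event.source_ip)
    return groups


if __name__ == "__main__":
    events = [
        Event(1000, "alice", "logon_failure", "192.0.2.10"),
        Event(1005, "alice", "logon_failure", "192.0.2.99"),
        Event(1010, "alice", "logon_failure", "192.0.2.10"),
    ]
    print(aggregate_lossy(events))
    print(aggregate_preserving(events))
```

In the lossy version, three failures from two different addresses collapse into a bare count of three, which is exactly the kind of detail that can matter during an investigation.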
Choosing to monitor security through log events is a natural and great first step toward taking control of what is happening in your network. However, putting such a system in place is only the first step in this process. You have to continuously ensure that all the parts of the system are functioning as expected because, when the day comes, this will be the difference between getting to the crux of an incident and explaining why it couldn’t be done.