The Three Elements of Incident Response: Plan, Team, and Tools

March 12, 2019

Pramod Borkar

Learn how an incident response plan is used to detect and respond to incidents before they become a major setback.

The primary objective of an incident response plan is to respond to incidents before they become a major setback. As the frequency and types of data breaches increase, the lack of an incident response plan can lead to longer recovery times, increased cost, and further damage to your information security effectiveness.

What is Incident Response?

Incident response is an approach to handling security breaches. The aim of incident response is to identify an attack, contain the damage, and eradicate the root cause of the incident. An incident can be defined as any breach of law or policy, or any unacceptable act, that concerns information assets such as networks, computers, or smartphones.

Why is incident response important?

When your organization responds to an incident quickly, it can reduce losses, restore processes and services, and mitigate exploited vulnerabilities. An incident that is not effectively contained can lead to a data breach with catastrophic consequences. Incident response provides a first line of defense against security incidents and, in the long term, helps establish a set of best practices to prevent breaches before they happen.

The Three Elements of Incident Response Management

Optimal management of incident response should include:

1. A comprehensive plan
An incident response plan should prepare your team to deal with threats, indicate how to isolate incidents and identify their severity, how to stop the attack and eradicate the underlying cause, how to recover production systems, and how to conduct a post-mortem analysis to prevent future attacks. Learn more about incident response plans below.

2. The right people in place
Recruit the following roles for your incident response team: incident response manager, security analyst, IT engineer, threat researcher, legal representative, corporate communications, human resources, risk management, C-level executives, and external security forensic experts. Let all employees know what their responsibilities will be in the event of an attack. Learn more about the incident response team below.

3. Tools
Many vendors offer tools which handle security incidents on a large scale, instead of investigating one issue at a time. These tools analyze, alert about, and can even help remediate security events which could be missed due to insufficient internal resources.

Incident response tools work alongside current security measures. They obtain information for response via Netflow, system logs, endpoint alerts, and identity systems to assess security-related anomalies in the network. These tools can investigate threats including:

  • Malware infections
  • Password attacks
  • Phishing
  • Data leakage
  • Abuse of privileges
  • Other insider threats

Element #1: Six-Step Incident Response Plan

The Computer Security Incident Response Team (CSIRT) carries out the incident response plan. The incident response team includes IT staff with some security training or full-time security staff. These individuals analyze information about an incident and respond. They respond to two types of incidents: public and organizational.

Public incidents affect an entire community: for example terrorism, natural disasters, large-scale chemical spills, and epidemics. Organizational incidents are confined to a single organization. They may be physical, such as a bomb threat, or computer incidents, such as accidental exposure, theft of sensitive data, or exposure of trade secrets.

The incident response team also communicates with stakeholders within the organization, and external groups such as press, legal counsel, affected customers, and law enforcement. The SANS Institute’s Incident Handlers Handbook defines a six-step process for handling security incidents.

1. Preparation

Here are steps your incident response team should take to prepare for cybersecurity incidents:

  • Develop policies to implement in the event of a cyber attack
  • Review security policy and conduct a risk assessment
  • Prioritize security issues, know your most valuable assets and concentrate on critical security incidents
  • Develop a communication plan
  • Outline the roles, responsibilities, and procedures of your team
  • Establish a corporate security policy
  • Recruit and train team members, ensure they have access to relevant systems
  • Ensure team members have access to relevant technologies and tools

2. Identification

Decide which criteria call the team into action. Examples of security incidents include detection of malware on corporate systems, a phishing attack, or a denial of service attack. A cumulative set of events can also call a plan into action: for example, an unusual upload to a cloud storage site and an abnormal access alert within the same few hours.
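
The cumulative trigger described above can be sketched as a small correlation check. This is a minimal illustration, not a production detection rule: the `Event` structure, the event kind labels, and the four-hour window are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    user: str
    kind: str            # hypothetical event type label
    timestamp: datetime


def should_trigger_plan(events, required_kinds, window=timedelta(hours=4)):
    """Return the users for whom every required event kind occurred within `window`."""
    required = set(required_kinds)
    by_user = {}
    for event in events:
        if event.kind in required:
            by_user.setdefault(event.user, []).append(event)

    triggered = set()
    for user, user_events in by_user.items():
        user_events.sort(key=lambda e: e.timestamp)
        # For each starting event, collect the kinds seen inside the window.
        for i, first in enumerate(user_events):
            seen = set()
            for later in user_events[i:]:
                if later.timestamp - first.timestamp > window:
                    break
                seen.add(later.kind)
            if seen == required:
                triggered.add(user)
                break
    return triggered
```

Calling `should_trigger_plan(events, {"unusual_cloud_upload", "abnormal_access"})` would return only the users who produced both hypothetical event kinds inside the same window, which is the point at which the team might be called into action.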

IT systems gather events from monitoring tools, log files, error messages, firewalls, and intrusion detection systems. This data should be analyzed by automated tools and security analysts to decide if anomalous events represent security incidents.

When an incident is identified, the incident response team should be alerted. Team members then coordinate the appropriate response to the incident:

  • Identify and assess the incident and gather evidence.
  • Decide on the severity and type of the incident and escalate if necessary.
  • Document actions taken, addressing “who, what, where, why, and how.” This information may be used later as evidence if the incident reaches a court of law.
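
One way to keep the "who, what, where, why, and how" documentation consistent is to record each action in a fixed structure. The sketch below is only an illustration: the class names, field names, and severity labels are hypothetical and not part of any standard or product.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class ActionRecord:
    who: str
    what: str
    where: str
    why: str
    how: str
    # Timestamp in UTC so records from different responders line up.
    recorded_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


@dataclass
class IncidentLog:
    incident_id: str     # hypothetical identifier scheme
    severity: str        # hypothetical labels such as "low", "medium", "high"
    actions: list = field(default_factory=list)

    def record(self, **details):
        """Append one action; a missing who/what/where/why/how raises TypeError."""
        entry = ActionRecord(**details)
        self.actions.append(entry)
        return entry

    def to_json(self):
        """Serialize the whole log, for example to attach to a case file."""
        return json.dumps(asdict(self), indent=2)
```

Because every field is required, an entry that omits one of the five questions fails immediately instead of leaving a gap that is only noticed when the record is needed as evidence.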

3. Containment

Once your team isolates a security incident, the aim is to stop further damage. This includes:

  • Short term containment—an instant response, so the threat doesn’t cause further damage. This can include taking down production servers which have been hacked or isolating a network segment that is under attack.
  • System backup—before you wipe and reimage affected systems, back them up by taking a forensic image. A forensic image is a bit-for-bit copy of a hard disk or a specific disk partition. Creating disk images after an incident preserves the state of a disk at a specific point in time, providing a static ‘snapshot’ that you can use as evidence of the security incident and to investigate how the system was compromised.
  • Long term containment—temporarily fix affected systems so they can be used in production. While this takes place, rebuild clean systems so you can bring them online in the recovery stage. Take measures to prevent the incident from recurring or escalating: install security patches on affected and associated systems, remove accounts and backdoors created by attackers, alter firewall rules and change the routes to null route the attacker address, etc.
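
A forensic image is only useful as evidence if you can show it has not changed since it was taken. A common practice, which this article does not itself prescribe, is to record a cryptographic hash of the image file and recompute it later. The sketch below assumes the image already exists as a file on disk; the function names are hypothetical.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large disk images do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def image_unchanged(path, recorded_digest):
    """Compare the current hash with the one recorded when the image was taken."""
    return sha256_of_file(path) == recorded_digest
```

The digest would typically be written into the incident documentation at the time the image is created, so that later comparisons have a trusted reference value.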

4. Eradication

Remove the threat and restore affected systems to their initial state, or close to it. The team should isolate the root cause of the attack, remove threats and malware, and identify and mitigate the vulnerabilities that were exploited, to prevent future attacks. These steps may change the configuration of the organization. The aim is to make changes while minimizing the effect on the operations of the organization. You can achieve this by stopping the bleeding and limiting the amount of data that is exposed.

This is done as follows:

  • Identify and fix all affected hosts, including hosts inside and outside your organization
  • Isolate the root of the attack to remove all instances of the malicious software
  • Conduct malware analysis to determine the extent of the damage
  • See if the attacker has reacted to your actions
  • Anticipate a different type of attack and create a response
  • Allow time to make sure the network is secure and that there is no further activity from the attacker

Ensure your team has removed malicious content and checked that the affected systems are clean. For example, if the attacker used a vulnerability, it should be patched, or if an attacker exploited a weak authentication mechanism it should be replaced with strong authentication.

5. Recovery

Ensure that affected systems are not in danger and can be restored to working condition. The purpose of this phase is to bring affected systems back into the production environment carefully, to ensure they will not lead to another incident. Ensure another incident doesn’t occur by restoring systems from clean backups, replacing compromised files with clean versions, rebuilding systems from scratch, installing patches, changing passwords, and reinforcing network perimeter security (boundary router access control lists, firewall rulesets, etc.).

Consider how long you need to monitor the network system, and how to verify that the affected systems are functioning normally. Calculate the cost of the breach and associated damages.

6. Lessons Learned

The incident response team and partners should communicate to improve future processes. Complete documentation that couldn’t be prepared during the response process. The team should identify how the incident was managed and eradicated.

See what actions were taken to recover the attacked system, the areas where the response team needs improvement, and the areas where they were effective. Reports on lessons learned provide a clear review of the entire incident and can be used in meetings, as benchmarks for comparison or as training information for new incident response team members.

Element #2: The Incident Response Team

To prepare for and attend to incidents, you should form a centralized incident response team, responsible for identifying security breaches and taking responsive actions. The team should include:

  • Incident response manager (team leader)—coordinates all team actions and ensures the team focuses on minimizing damages and recovering quickly. Prioritizes actions during the isolation, analysis, and containment of an incident. Oversees all actions and guides the team during high severity incidents.
  • Security analysts—the manager is assisted by a team of security analysts who work across departments to isolate and rectify flaws in the organization’s security systems, solutions, and applications. They recommend specific measures to improve the overall security posture.
  • Lead investigator—isolates root cause, analyzes all evidence, manages other security analysts and conducts rapid system and service recovery.
  • Threat researchers—provide the context of an incident and threat intelligence. They use this information and records of previous incidents to create a database of internal intelligence. On many security teams, threat researchers are gradually replaced by automated threat intelligence tools.
  • Communications lead—communicates with all audiences inside and outside the company, including management, internal stakeholders, legal, press, and customers.
  • Documentation and timeline lead—documents the team’s investigation, discovery, and recovery efforts, and creates a timeline for each stage of the incident. Next-generation Security Information and Event Management (SIEM) systems are able to generate documentation and incident timelines automatically. For example, see the Exabeam Advanced Analytics module offered by the Exabeam Security Management Platform.
  • HR/legal representation—an incident could develop into criminal charges. Thus you should have HR and legal guidance.

Goals of the Incident Response Team

The goal of the incident response team is to coordinate team members and resources during a cyber incident to minimize impact and quickly restore operations. This includes:

  • Analysis—document the extent, priority, and impact of a breach to see which assets were affected and if the incident requires attention.
  • Reporting—tell team members of reporting procedures. Gather relevant trending data to show the importance of the incident response team.
  • Response—explore root causes, record findings and carry out recovery strategies and communicate the status of your organization to team members.

In modern Security Operations Centers (SOCs), advanced analytics plays an important role in identifying and investigating incidents. User and Entity Behavioral Analytics (UEBA) technology is used by many security teams to establish behavioral baselines of users or IT systems, and to automatically identify anomalous behavior. This makes it much easier for security staff to identify events that might constitute a security incident.
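
The idea of a behavioral baseline can be illustrated with a toy example. This is not how any UEBA product works internally; it is a minimal sketch that assumes a baseline is simply a count of the resources each user has accessed before, with a hypothetical `min_count` threshold.

```python
from collections import Counter, defaultdict


class AccessBaseline:
    """Toy per-user baseline: which resources has each user accessed, and how often?"""

    def __init__(self, min_count=3):
        self.min_count = min_count          # hypothetical threshold for "normal"
        self._counts = defaultdict(Counter)

    def learn(self, user, resource):
        """Record one observed access during the baseline period."""
        self._counts[user][resource] += 1

    def is_anomalous(self, user, resource):
        """Flag an access the user has made fewer than `min_count` times before."""
        seen = self._counts.get(user, Counter())
        return seen[resource] < self.min_count
```

A real system would baseline many more signals (login times, locations, data volumes) and weigh them together, but the principle is the same: learn what is normal for each user or entity, then flag deviations for an analyst to review.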

Read our in-depth blog post: Why UEBA Should Be an Essential Part of Incident Response.

Which Qualities Should You Look for When Selecting Incident Response Team Members?

When assembling an incident response team consider:

  • Availability—an incident response team should be available around the clock all days of the year. You may also need staff members to be physically on site during an incident, so choosing staff that live close to the office is an advantage.
  • Virtual or volunteer team members—you may lack the resources to assign full-time responsibilities to all team members. Consider having some members form a ‘virtual’ incident response team. These team members can be called upon when an emergency occurs.
  • Effective advocate or executive sponsor—a person at the level of a CISO who can communicate the effect of an incident to other executive members. This individual should also ensure the incident response team receives an appropriate budget and maintains the authority to respond quickly during a crisis.
  • Monitor and bolster employee and team morale—your team may find it difficult to be on call all the time. They may lose focus and motivation. You can prevent staff burnout by granting opportunities for growth, learning and team building. You can also outsource some activities, which can reduce the workload and stress of the in-house team.
  • Diversity—recruit technically diverse teams. You cannot expect team members to be experts in all areas of incident response. It is important to determine which skill gaps exist and to hire individuals who fill those gaps.

Five Tips For Incident Response Team Members

1. Isolate exceptions
Technology alone cannot successfully detect security breaches. You should also rely on human insight. Following are a few conditions to watch for daily:

  • Traffic anomalies—sensitive connections and servers used internally will typically have a stable traffic volume. If you notice a sudden increase in traffic, take notice.
  • Accessing accounts without permission—privileged or administrator accounts have access to more information and systems than normal employees. However, employees tend to be the easiest entry point for cybercrime. Closely monitor privileged accounts and watch for privilege escalation on normal user accounts.
  • Excessive consumption and suspicious files—if you see an unexplained increase in memory or hard drive usage on company systems, it could be that someone is illegally accessing them or leaking data.
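
A check for the "sudden increase in traffic" condition can be sketched with a simple statistical rule. The example below assumes you already have a list of historical traffic measurements for a connection or server; the z-score rule and the threshold of 3 are illustrative choices, not recommendations from this article.

```python
from statistics import mean, stdev


def is_traffic_spike(history, current, threshold=3.0):
    """Return True if `current` is more than `threshold` standard deviations above the mean."""
    if len(history) < 2:
        raise ValueError("need at least two historical samples")
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # Perfectly flat history: any increase at all is unusual.
        return current > baseline
    return (current - baseline) / spread > threshold
```

A rule like this only narrows down what a person should look at. Whether a flagged increase is a backup job or data leaving the network still requires human judgment.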

Modern security tools such as User and Entity Behavioral Analytics (UEBA) automate these processes and can identify anomalies in user behavior or file access automatically. This provides much better coverage of possible security incidents and saves time for security teams. For example, see the Entity Analytics module, a part of Exabeam’s next-generation SIEM platform.

2. Use a centralized approach
Gather information from security tools and IT systems, and keep it in a central location, such as a SIEM system. Use this information to create an incident timeline, and conduct an investigation of the incident with all relevant data points in one place.
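
Building an incident timeline from centralized data amounts to merging events from several sources in time order. The sketch below assumes each event is a `(timestamp, source, message)` tuple; this layout is a hypothetical simplification of what a SIEM would store.

```python
import heapq


def build_timeline(*sources):
    """Merge events from several sources into one list ordered by timestamp."""
    by_time = lambda event: event[0]
    # Sort each source first so the merge precondition (sorted inputs) always holds.
    ordered = [sorted(source, key=by_time) for source in sources]
    return list(heapq.merge(*ordered, key=by_time))
```

With firewall, endpoint, and identity events interleaved in one ordered list, an investigator can read the incident as a single sequence instead of switching between tools.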

You can also use a centralized approach to allow for a quick automated response. Use data from security tools, apply advanced analytics and orchestrate automated responses on systems like firewalls and email servers, using technology like Security Orchestration, Automation, and Response (SOAR).

3. Assert, don’t assume
Don’t conduct an investigation based on the assumption that an event or incident exists. Instead of making assumptions, make assertions based on a question that you can evaluate and verify. For example, “If I’ve noted alert X on system Y, I should also see event Z occur in close proximity.”

Create your assertions based on your experience administering systems, writing software, configuring networks, building systems, etc., imagining systems and processes from the attacker’s eyes.
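
An assertion of the form "alert X on system Y implies event Z nearby" can be written as a check against collected events. The sketch below assumes events are `(timestamp, system, name)` tuples and that "close proximity" means within ten minutes; both are assumptions for illustration.

```python
from datetime import timedelta


def assertion_holds(events, alert, system, expected_event, within=timedelta(minutes=10)):
    """Evaluate: every `alert` on `system` has an `expected_event` on it within `within`.

    Returns None when the alert was never seen, because the assertion cannot be tested.
    """
    alert_times = [t for t, s, name in events if s == system and name == alert]
    expected_times = [t for t, s, name in events if s == system and name == expected_event]
    if not alert_times:
        return None
    return all(
        any(abs(expected - seen) <= within for expected in expected_times)
        for seen in alert_times
    )
```

A result of `False` is as informative as `True`: it tells you the hypothesis about how the incident unfolded does not match the evidence, and that a different question is needed.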

4. Eliminate impossible events
You may not know exactly what you are looking for. On these occasions eliminate occurrences that can be logically explained. You will then be left with the events that have no clear explanation.

For example:

  • Unexplained inconsistencies or redundancies in your code
  • Issues with accessing management functions or administrative logins
  • Unexplained changes in volume of traffic (e.g., drastic drop)
  • Unexplained changes in the content, layout, or design of your site
  • Performance problems affecting the accessibility and availability of your website
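
The elimination approach can be sketched as filtering out every event that has a known explanation and reviewing what remains. In the example below, events are plain dictionaries and the two explanation rules are hypothetical; real explanations would come from your change management and operations records.

```python
def unexplained_events(events, explanations):
    """Return the events for which no explanation rule applies."""
    return [
        event for event in events
        if not any(explains(event) for explains in explanations)
    ]


def is_scheduled_backup(event):
    # Hypothetical rule: traffic spikes between 01:00 and 03:59 are the nightly backup.
    return event["type"] == "traffic_spike" and event["hour"] in range(1, 4)


def is_planned_change(event):
    # Hypothetical rule: site content changes with a change ticket are expected.
    return event["type"] == "site_content_change" and event.get("change_ticket") is not None
```

Calling `unexplained_events(events, [is_scheduled_backup, is_planned_change])` leaves only the events that still need an investigator's attention.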

5. Take post-incident measures
Continue monitoring your systems for any unusual behavior to ensure the intruder has not returned. Watch for new incidents and conduct a post-incident review to isolate any problems experienced during the execution of the incident response plan.

Incident Response Tools

Incident response tool types | Why you need them | Tool examples
SIEM | Gathers and aggregates log data created in the technology infrastructure of the organization, including applications, host systems, network and security devices (e.g., antivirus filters and firewalls). Provides reports on security-related incidents, including malware activity and logins. It also sends alerts if the activity conflicts with existing rule sets, indicating a security issue. | Exabeam Security Management Platform (SMP) (including Data Lake, UEBA, Incident Responder), QRadar, USM, ESM
Intrusion Detection Systems (IDS), network and host-based | Uses baselines or attack signatures to issue an alert when suspicious behavior or known attacks take place on a server (host-based intrusion detection system, HIDS) or on a network (network-based intrusion detection system, NIDS). | Snort, Suricata, BroIDS, OSSEC
Netflow Analyzers | Looks at actual traffic across border gateways and within a network. Netflow is used to track a specific thread of activity, to see what protocols are in use on your network, or to see which assets are communicating between themselves. | ntop, NfSen, Nfdump
Vulnerability Scanners | Isolates potential areas of risk, assesses the attack surface area of your organization for known weaknesses, and provides instructions for remediation. Vulnerabilities may be caused by misconfiguration, bugs in your own applications, or usage of third party components that can be exploited by attackers. | OpenVAS
Availability Monitoring | The aim of incident response is to limit downtime. A service or application outage can be the initial sign of an incident in progress. Availability monitoring helps head off adverse situations by tracking the uptime of infrastructure components, including apps and servers. It alerts the webmaster to issues before they impact the organization. | Nagios
Web Proxies | Controls access to websites and logs what is being connected to. Many threats operate over HTTP, so being able to log the remote IP address and the nature of the HTTP connection can be essential for forensics and threat tracking. | Squid Proxy, IPFire

Improving Incident Response Via Orchestration and Automation

One of the key steps in incident response is automatically eliminating false positives (events that are not really security incidents), and stitching together the event timeline to quickly understand what is happening and how to respond.

Exabeam offers a next-generation Security Information and Event Management (SIEM) that provides Smart Timelines, automatically stitching together both normal and abnormal behaviors. This helps investigators accurately pinpoint a series of anomalous events, along with its associated assets, users, and risk reasons, all attached to a single timeline.

This automatic packaging of events into an incident timeline saves a lot of time for investigators, and helps them mitigate security incidents faster, significantly lowering the mean time to respond (MTTR).

What metrics are needed by SOC Analysts for effective incident response?

  • Mean Time to Detect (MTTD)—the effectiveness of your detection solution: Is it detecting most alerts or are the majority reported by users and system administrators? If your security operations team and their tools are not the greatest source of security alerts, you have an issue.
  • Detection accuracy/false positive rates—the percentage of alerts that, upon investigation, are revealed to not be valid threats. False positives reduce a security team’s confidence in its tools and draw attention away from serious underlying problems. False positive feedback loops should be included in any incident management process, but enterprises must guard against becoming too lenient; the only thing worse than a false positive is a false negative in which a serious threat is overlooked.
  • Mean Time to Respond/Repair (MTTR)—the time it takes to see a security concern, identify the impact, determine the course of action and implement it. These numbers can vary widely but over time trends will appear, providing useful insight about where you need to invest for additional protection, remediation and automation capabilities.
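
These metrics can be computed from basic incident records. The sketch below assumes each incident is a dictionary with `occurred`, `detected`, and `resolved` timestamps, and it uses common working definitions (MTTD as the mean time from occurrence to detection, MTTR as the mean time from detection to resolution). Your organization may define the intervals differently, and the field names are hypothetical.

```python
from datetime import timedelta


def mean_duration(durations):
    """Average a collection of timedelta values."""
    durations = list(durations)
    if not durations:
        raise ValueError("no durations to average")
    return sum(durations, timedelta()) / len(durations)


def response_metrics(incidents, alerts_investigated, false_positives):
    """Return MTTD, MTTR, and the false positive rate for a reporting period."""
    if alerts_investigated <= 0:
        raise ValueError("alerts_investigated must be positive")
    return {
        "mttd": mean_duration(i["detected"] - i["occurred"] for i in incidents),
        "mttr": mean_duration(i["resolved"] - i["detected"] for i in incidents),
        "false_positive_rate": false_positives / alerts_investigated,
    }
```

Tracking these values per month or per quarter is what makes the trends mentioned above visible. A single period's numbers say little on their own.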



