The Three Elements of Incident Response: Plan, Team, and Tools

Incident Response: 6 Steps, Technologies, and Tips

May 02, 2022


Learn how an incident response plan is used to detect and respond to incidents before they cause major damage.

What is incident response?

Incident response is an approach to handling security breaches. The aim of incident response is to identify an attack, contain the damage, and eradicate the root cause of the incident. An incident can be defined as any breach of law or policy, or any unacceptable act, that concerns information assets such as networks, computers, or smartphones.

As the frequency and types of data breaches increase, the lack of an incident response plan can lead to longer recovery times, increased cost, and further damage to your information security effectiveness. This makes incident response a critical activity for any security organization.

Why is incident response important?

When your organization responds to an incident quickly, it can reduce losses, restore processes and services, and mitigate exploited vulnerabilities. An incident that is not effectively contained can lead to a data breach with catastrophic consequences. Incident response provides this first line of defense against security incidents, and in the long term, helps establish a set of best practices to prevent breaches before they happen.

If you fail to address an incident in time, it can escalate into a more serious issue, causing significant damage such as data loss, system crashes, and expensive remediation. Effective incident response stops an attack in its tracks and can help reduce the risk posed by future incidents.

A solid incident response plan helps prepare your organization for both known and unknown risks. Reliable incident response procedures will allow you to identify security incidents immediately when they occur and implement best practices to block further intrusion. Incident response is essential for maintaining business continuity and protecting your sensitive data.

Your response strategy should anticipate a broad range of incidents. Even simpler incidents can impact your organization’s business operations and reputation long-term. In addition to the technical burden and data recovery cost, another risk is the possibility of legal and financial penalties, which could cost your organization millions of dollars. 

The six steps of incident response

1. Preparation

Here are steps your incident response team should take to prepare for cybersecurity incidents:

  • Form an internal incident response team, and develop policies to implement in the event of a cyber attack
  • Review security policies and conduct risk assessments modeled against external attacks, internal misuse/insider attacks, and situations where external parties report potential vulnerabilities and exploits. (NIST provides a good framework.)
  • Prioritize known security issues or vulnerabilities that cannot be immediately remediated – know your most valuable assets to be able to concentrate on critical security incidents against critical infrastructure and data
  • Develop a communication plan for internal, external, and (if necessary) breach reporting
  • Outline the roles, responsibilities, and procedures of the immediate incident response team, and the extended organizational awareness or training needs
  • Recruit and train team members, and ensure they have access to relevant systems, technologies and tools
  • Plan education for the extended organization members for how to report potential security incidents or information
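The prioritization step above can be sketched in a few lines. This is a minimal illustration, not a standard scoring method: the `OpenIssue` type, the 1 to 5 scales, and the "asset value times severity" score are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class OpenIssue:
    name: str
    asset: str
    asset_value: int  # hypothetical 1-5 scale, 5 = most valuable asset
    severity: int     # hypothetical 1-5 scale, 5 = most severe issue


def prioritize(issues):
    """Order issues so severe problems on valuable assets come first."""
    return sorted(issues, key=lambda i: i.asset_value * i.severity, reverse=True)


issues = [
    OpenIssue("outdated TLS library", "intranet wiki", asset_value=2, severity=3),
    OpenIssue("unpatched database server", "customer database", asset_value=5, severity=4),
    OpenIssue("weak admin password policy", "domain controller", asset_value=5, severity=3),
]

ranked = prioritize(issues)
for issue in ranked:
    print(issue.asset_value * issue.severity, issue.asset, "-", issue.name)
```

A real program would likely weigh more factors (exposure, exploit availability, compensating controls), but even a simple ranking makes it clear where the team should concentrate first.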

2. Identification

Decide which criteria call the incident response team into action. IT systems gather events from monitoring tools, log files, error messages, firewalls, and intrusion detection systems. This data should be analyzed by automated tools and security analysts to decide if anomalous events represent security incidents. For example, just seeing someone hammering against a web server isn’t a guarantee of compromise – security analysts should look for multiple factors, changes in behavior, and new event types being generated.
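As a rough illustration of "looking for multiple factors", the sketch below escalates a host only when it shows at least two different kinds of suspicious signal. The signal names, the threshold of two, and the `hosts_to_escalate` helper are hypothetical and chosen for the example only.

```python
from collections import defaultdict

# Hypothetical signal names; a real deployment would use its own event types.
SIGNALS = {"failed_login_burst", "new_admin_account", "unusual_outbound_traffic"}


def hosts_to_escalate(events, min_distinct_signals=2):
    """Return hosts showing at least `min_distinct_signals` different signal types."""
    seen = defaultdict(set)
    for host, signal in events:
        if signal in SIGNALS:
            seen[host].add(signal)
    return sorted(h for h, s in seen.items() if len(s) >= min_distinct_signals)


events = [
    ("web-01", "failed_login_burst"),   # repeated hammering alone is not enough
    ("web-01", "failed_login_burst"),
    ("web-02", "failed_login_burst"),
    ("web-02", "new_admin_account"),    # a second, different signal on the same host
    ("db-01", "disk_full"),             # operational noise, not a tracked signal
]

print(hosts_to_escalate(events))
```

Here `web-01` is hammered repeatedly but shows only one kind of signal, so it is not escalated, while `web-02` is.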

When an incident is identified, the incident response team should be alerted. Team members coordinate the appropriate response to the incident:

  • Identify and assess the incident and gather evidence.
  • Decide on the severity and type of the incident and escalate, if necessary.
  • Document actions taken, addressing “who, what, where, why, and how.” This information may be used later as evidence if the incident reaches a court of law.
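One way to keep the “who, what, where, why, and how” consistent is to record each action in a fixed structure. The sketch below is an assumption about how such a record might look; the `ActionRecord` type and the extra `when` timestamp field are not part of the steps above.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class ActionRecord:
    who: str
    what: str
    where: str
    why: str
    how: str
    # Added for the example: a UTC timestamp so records can be ordered later.
    when: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())


record = ActionRecord(
    who="analyst on call",
    what="disabled a compromised service account",
    where="directory server",
    why="account was used for unexpected logins",
    how="directory admin console",
)

# One JSON object per line is easy to append to a log and to parse later.
line = json.dumps(asdict(record), sort_keys=True)
print(line)
```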

3. Containment

Once your team isolates a security incident, the aim is to stop further damage. This includes:

  • Short-term containment — an instant response, so the threat doesn’t cause further damage. This can include taking down production servers that have been hacked or isolating a network segment that is under attack.
  • System backup — you should back up all affected systems before you wipe and reimage them to acquire a “current state” or forensic image. A forensic image is a bit-for-bit copy of a hard disk, or a specific disk partition. Disk images are created after an incident to maintain the state of a disk at a specific point in time and thus provide a static ‘snapshot,’ which you can use as evidence of the security incident, and to investigate how the system was compromised.
  • Long-term containment — while making temporary fixes to replace systems that have been taken down to image and restore, rebuild clean systems so you can bring them online in the recovery stage. Take measures to prevent the incident from recurring or escalating: install any security patches on affected and associated systems, remove accounts and backdoors created by attackers, alter firewall rules, and change the routes to null route the attacker address, etc.
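Because a forensic image is meant to stay a static snapshot, a common practice (an addition here, not something stated above) is to record a cryptographic hash when the image is taken and re-check it before analysis. The sketch below uses Python's standard `hashlib`; the temporary file stands in for a real disk image, and `sha256_of_file` is a helper name chosen for the example.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large disk images do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Stand-in for a real image file, created only for this demonstration.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example disk image bytes")
    image_path = tmp.name

hash_at_acquisition = sha256_of_file(image_path)   # recorded when the image is taken
hash_before_analysis = sha256_of_file(image_path)  # recomputed before it is examined
image_unchanged = hash_at_acquisition == hash_before_analysis
print(image_unchanged)

os.remove(image_path)
```

If the two hashes ever differ, the image has been altered since acquisition and its value as evidence is in doubt.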

4. Eradication

Remove the threat and restore affected systems to their initial state, or close to it. The team should isolate the root cause of the attack, remove threats and malware, and identify and mitigate vulnerabilities that were exploited to stop future attacks. These steps may change the configuration of the organization. The aim is to make changes while minimizing the effect on the operations of the organization. You can achieve this by stopping the bleeding and limiting the amount of data that is exposed.

This is done as follows:

  • Identify and fix all affected hosts, including hosts inside and outside your organization
  • Isolate the root of the attack to remove all instances of the malicious software
  • Conduct malware analysis to determine the extent of the damage
  • See if the attacker has reacted to your actions – check for any new credentials created or permission escalations going back to the publication of any public exploits or POCs.
  • Make sure no secondary infections have occurred; if they have, remove them.
  • Allow time to make sure the network is secure and that there is no further activity from the attacker
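The credential check in the list above can be approximated by filtering accounts on their creation date. This is a sketch under assumptions: the account names and dates are invented, and in practice the data would come from your directory or identity provider.

```python
from datetime import date


def accounts_created_since(accounts, exploit_published):
    """Return account names created on or after the exploit publication date."""
    return sorted(name for name, created in accounts.items() if created >= exploit_published)


# Hypothetical account inventory: name -> creation date.
accounts = {
    "alice": date(2021, 3, 14),
    "svc-backup": date(2022, 4, 2),
    "tmp-admin": date(2022, 4, 20),
}
exploit_published = date(2022, 4, 1)

suspects = accounts_created_since(accounts, exploit_published)
print(suspects)
```

The result is a review list, not a verdict: each account on it still needs a human to confirm whether it was created legitimately.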

Ensure your team has removed malicious content and checked that the affected systems are clean. For example, if the attacker used a vulnerability, it should be patched, or if an attacker exploited a weak authentication mechanism, it should be replaced with strong authentication.

5. Recovery

The purpose of this phase is to bring affected systems back into the production environment carefully, to ensure they will not lead to another incident. Always restore systems from clean backups. Replace compromised files or containers with clean versions, rebuild systems from scratch, install patches, change passwords, and reinforce network perimeter security (boundary router access control lists, firewall rulesets, etc.).

Decide how long you need to monitor the affected network and endpoint systems, and how to verify that the affected systems are functioning normally. Calculate the cost of the breach and associated damages, including lost productivity and the human hours spent troubleshooting, restoring systems, and recovering fully.
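A back-of-the-envelope version of the cost calculation might look like the following. The formula and all figures are illustrative assumptions; real estimates usually include further items such as legal fees and penalties, which are lumped into `other_costs` here.

```python
def breach_cost(downtime_hours, productivity_loss_per_hour,
                response_hours, hourly_rate, other_costs=0.0):
    """Estimate breach cost as lost productivity plus response labor plus other costs."""
    lost_productivity = downtime_hours * productivity_loss_per_hour
    labor = response_hours * hourly_rate
    return lost_productivity + labor + other_costs


# Hypothetical figures: 6 hours of downtime and 120 staff hours of response work.
total = breach_cost(
    downtime_hours=6,
    productivity_loss_per_hour=2000,
    response_hours=120,
    hourly_rate=85,
)
print(total)  # 6 * 2000 + 120 * 85 = 22200
```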

6. Lessons Learned

After any incident, it’s a worthwhile process to hold a debriefing or lessons learned meeting to capture what happened, what went well, and evaluate the potential for improvement. The incident response team and stakeholders should communicate to improve future processes. Complete documentation that couldn’t be prepared during the response process. The team should identify how the incident was managed and eradicated.

See what actions were taken to recover the attacked system, the areas where the response team needs improvement, and the areas where they were effective. Reports on lessons learned provide a clear review of the entire incident and can be used in meetings, as benchmarks for comparison or as training information for new incident response team members.

Who handles incident response? The Computer Security Incident Response Team (CSIRT)

To prepare for and attend to incidents, you should form a centralized incident response team, responsible for identifying security breaches and taking responsive actions. In a large organization, this is a dedicated team known as a CSIRT. The CSIRT includes full-time security staff. These individuals analyze information about an incident and respond. 

In a smaller organization, the incident response team can consist of IT staff with some security training, augmented by in-house or outsourced security experts.

The incident response team also communicates with stakeholders within the organization, and external groups such as press, legal counsel, affected customers, and law enforcement.

The team should include:

  • Incident response manager (team leader) — coordinates all team actions and ensures the team focuses on minimizing damages and recovering quickly. Prioritizes actions during the isolation, analysis, and containment of an incident. Oversees all actions and guides the team during high severity incidents.
  • Security analysts — the manager is assisted by a team of security analysts who work across departments to isolate and rectify flaws in the organization’s security systems, solutions, and applications. They recommend specific measures to improve the overall security posture.
  • Lead investigator — isolates root cause, analyzes all evidence, manages other security analysts, and conducts rapid system and service recovery.
  • Threat researchers — provide the context of an incident and threat intelligence. They use this information and records of previous incidents to create a database of internal intelligence. On many security teams, threat researchers are gradually replaced by automated threat intelligence tools.
  • Communications lead — communicates with all audiences inside and outside the company, including management, internal stakeholders, legal, press, and customers.
  • Documentation and timeline lead — documents team investigation, discovery, and recovery efforts, and creates a timeline for each stage of the incident. Next-generation Security Information and Event Management (SIEM) systems are able to generate documentation and incident timelines automatically. For example, see the Exabeam Advanced Analytics module offered by the Exabeam Security Management Platform.
  • HR/legal representation — an incident could develop into criminal charges. Thus, you should have HR and legal guidance.

Incident response tools

Incident response tool types | Why you need them | Tool examples
SIEM | Gathers and aggregates log data created in the technology infrastructure of the organization, including applications, host systems, network, and security devices (e.g., antivirus filters and firewalls). Provides reports on security-related incidents, including malware activity and logins. It also sends alerts if the activity conflicts with existing rule sets, indicating a security issue. | Exabeam Security Operations Platform (including Data Lake, Advanced Analytics, Incident Responder), QRadar, USM, ESM
Intrusion Detection Systems (IDS) — Network & Host-based | Uses baselines or attack signatures to issue an alert when suspicious behavior or known attacks take place on a server, a host-based intrusion detection system (HIDS), or a network-based intrusion detection system (NIDS). | Snort, Suricata, BroIDS, OSSEC, SolarWinds
Netflow Analyzers | Looks at actual traffic across border gateways and within a network. Netflow is used to track a specific thread of activity, to see what protocols are in use on your network, or to see which assets are communicating between themselves. | ntop, NfSen, Nfdump
Vulnerability Scanners | Isolates potential areas of risk, assesses the attack surface area of your organization for known weaknesses, and provides instructions for remediation. Vulnerabilities may be caused by misconfiguration, bugs in your own applications, or usage of third-party components that can be exploited by attackers. | OpenVAS
Availability Monitoring | The aim of incident response is to limit downtime. A service or application outage can be the initial sign of an incident in progress. Availability monitoring stops adverse situations by studying the uptime of infrastructure components, including apps and servers. It tells the webmaster of issues before they impact the organization. | Nagios
Web Proxies | Controls access to websites and logs what is being connected. Many threats operate over HTTP, including being able to log into the remote IP address. The HTTP connection can also be essential for forensics and threat tracking. | Squid Proxy

Incident response orchestration and automation

One of the key steps in incident response is automatically eliminating false positives (events that are not really security incidents), and stitching together the event timeline to quickly understand what is happening and how to respond.

Exabeam offers a next-generation Security Information and Event Management (SIEM) that provides Smart Timelines, automatically stitching together both normal and abnormal behaviors. This helps investigators accurately pinpoint a series of anomalous events, along with its associated assets, users, and risk reasons, all attached to a single timeline.

This automatic packaging of events into an incident timeline saves a lot of time for investigators, and helps them mitigate security incidents faster, significantly lowering the mean time to respond (MTTR).
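To make the idea of stitching events into a timeline concrete, here is a minimal sketch that groups events from different sources by user and orders them by time. It is a generic illustration, not how any particular SIEM product builds its timelines; the event fields and sample data are assumptions.

```python
from collections import defaultdict
from datetime import datetime


def build_timelines(events):
    """Group events by user and order each group by timestamp."""
    timelines = defaultdict(list)
    for event in events:
        timelines[event["user"]].append(event)
    for user_events in timelines.values():
        user_events.sort(key=lambda e: e["time"])
    return dict(timelines)


# Hypothetical events arriving out of order from different log sources.
events = [
    {"time": datetime(2022, 5, 2, 9, 15), "user": "jdoe", "source": "proxy", "action": "large upload"},
    {"time": datetime(2022, 5, 2, 9, 1), "user": "jdoe", "source": "vpn", "action": "login from new country"},
    {"time": datetime(2022, 5, 2, 9, 5), "user": "asmith", "source": "mail", "action": "normal login"},
    {"time": datetime(2022, 5, 2, 9, 7), "user": "jdoe", "source": "file server", "action": "bulk file read"},
]

timelines = build_timelines(events)
for e in timelines["jdoe"]:
    print(e["time"].strftime("%H:%M"), e["source"], e["action"])
```

Read in order, the three `jdoe` events tell a story (unusual login, bulk read, large upload) that is hard to see when each log source is reviewed separately.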

What metrics are needed by SOC Analysts for effective incident response?

  • Mean Time to Detect (MTTD) — the effectiveness of your detection solution: Is it detecting most alerts or are the majority reported by users and system administrators? If your security operations team and their tools are not the greatest source of security alerts, you have an issue.
  • Detection accuracy/false positive rates — the percentage of alerts that, upon investigation, are revealed to not be valid threats. False positives reduce a security team’s confidence in its tools and draw attention away from serious underlying problems. False positive feedback loops should be included in any incident management process, but enterprises must guard against becoming too lenient; the only thing worse than a false positive is a false negative in which a serious threat is overlooked.
  • Mean Time to Respond/Repair (MTTR) — the time it takes to see a security concern, identify the impact, determine the course of action, and implement it. These numbers can vary widely but over time, trends will appear, providing useful insight about where you need to invest for additional protection, remediation, and automation capabilities.
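The metrics above can be computed from basic incident records. The sketch below assumes a commonly used reading of the terms: MTTD as the average time from occurrence to detection, MTTR as the average time from detection to resolution, and false positive rate as invalid alerts divided by investigated alerts. The sample incidents and counts are invented.

```python
from datetime import datetime
from statistics import mean

# Hypothetical incident records with three timestamps each.
incidents = [
    {"occurred": datetime(2022, 5, 1, 8, 0),
     "detected": datetime(2022, 5, 1, 10, 0),
     "resolved": datetime(2022, 5, 1, 16, 0)},
    {"occurred": datetime(2022, 5, 3, 12, 0),
     "detected": datetime(2022, 5, 3, 16, 0),
     "resolved": datetime(2022, 5, 4, 2, 0)},
]


def hours(delta):
    """Convert a timedelta to hours."""
    return delta.total_seconds() / 3600


mttd = mean(hours(i["detected"] - i["occurred"]) for i in incidents)
mttr = mean(hours(i["resolved"] - i["detected"]) for i in incidents)

# Hypothetical alert counts for the same period.
alerts_investigated = 200
alerts_not_valid = 150
false_positive_rate = alerts_not_valid / alerts_investigated

print(f"MTTD: {mttd:.1f} h, MTTR: {mttr:.1f} h, false positive rate: {false_positive_rate:.0%}")
```

Tracking these values per month or per quarter is what reveals the trends mentioned above.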

Goals of incident response

The main goal of incident response is to coordinate team members and resources during a cyber incident to minimize impact and quickly restore operations. This includes:

  • Analysis — document the extent, priority, and impact of a breach to see which assets were affected and if the incident requires attention.
  • Reporting — tell team members of reporting procedures. Gather relevant trending data to show the importance of the incident response team.
  • Response — explore root causes, record findings, and carry out recovery strategies and communicate the status of your organization to team members.

In modern Security Operations Centers (SOCs), advanced analytics plays an important role in identifying and investigating incidents. User and Entity Behavior Analytics (UEBA) technology is used by many security teams to establish behavioral baselines of users or IT systems, and automatically identify anomalous behavior. This makes it much easier for security staff to identify events that might constitute a security incident.
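A behavioral baseline can be illustrated with a very simple statistical check. Real UEBA products use far richer models; the sketch below only assumes a z-score test against a user's recent history, with a threshold of three standard deviations chosen for the example.

```python
from statistics import mean, stdev


def is_anomalous(history, today, threshold=3.0):
    """Flag a value that deviates from the historical baseline by more than
    `threshold` standard deviations."""
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # A perfectly flat history: any change at all is a deviation.
        return today != baseline
    return abs(today - baseline) / spread > threshold


# Hypothetical daily login counts for one user over the past week.
logins_per_day = [20, 22, 19, 21, 20, 23, 18]

print(is_anomalous(logins_per_day, 21))  # within the usual range
print(is_anomalous(logins_per_day, 90))  # far outside the usual range
```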

5 tips for successful incident response

1. Isolate exceptions

Technology alone cannot successfully detect security breaches. You should also rely on human insight. Following are a few conditions to watch for daily:

  • Traffic anomalies — sensitive connections and servers used internally will typically have a stable traffic volume. If you notice a sudden increase in traffic, take notice.
  • Accessing accounts without permission — privileged or administrator accounts have access to more information and systems than normal employees. However, employees tend to be the easiest entry point for cybercrime. Closely monitor privileged accounts and watch for privilege escalation on normal user accounts.
  • Excessive consumption and suspicious files — if you see an unexplained increase in memory or hard drive usage on your company’s systems, it could be that someone is illegally accessing them or leaking data.

Modern security tools such as User and Entity Behavior Analytics (UEBA) automate these processes and can identify anomalies in user behavior or file access automatically. This provides much better coverage of possible security incidents and saves time for security teams. For example, see the Entity Analytics module, a part of Exabeam’s next-generation SIEM platform.

2. Use a centralized approach

Gather information from security tools and IT systems, and keep it in a central location, such as a SIEM system. Use this information to create an incident timeline, and conduct an investigation of the incident with all relevant data points in one place.

You can also use a centralized approach to allow for a quick automated response. Use data from security tools, apply advanced analytics, and orchestrate automated responses on systems like firewalls and email servers, using technology like Security Orchestration, Automation, and Response (SOAR).
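The orchestration idea can be sketched as a lookup from alert type to an automated action. Everything here is hypothetical: the alert fields, the handler functions, and the `PLAYBOOKS` table are placeholders, and the handlers only return a description of what they would do rather than calling a real firewall or mail server.

```python
def block_ip(alert):
    # Placeholder: a real handler would call the firewall's management interface.
    return f"firewall: block {alert['source_ip']}"


def quarantine_email(alert):
    # Placeholder: a real handler would call the mail server's management interface.
    return f"mail server: quarantine message {alert['message_id']}"


# Hypothetical mapping of alert types to automated responses.
PLAYBOOKS = {
    "malicious_ip": block_ip,
    "phishing_email": quarantine_email,
}


def respond(alert):
    """Run the matching playbook, or hand the alert to a human if none exists."""
    handler = PLAYBOOKS.get(alert["type"])
    if handler is None:
        return "no playbook: route to analyst"
    return handler(alert)


print(respond({"type": "malicious_ip", "source_ip": "203.0.113.7"}))
print(respond({"type": "phishing_email", "message_id": "msg-1042"}))
print(respond({"type": "unknown_behavior"}))
```

Falling back to an analyst for unrecognized alert types keeps automation from acting on cases nobody has thought through.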

3. Assert, don’t assume

Don’t conduct an investigation based on the assumption that an event or incident exists. Instead of making assumptions, make assertions based on a question that you can evaluate and verify. For example: “If I’ve noted alert X on system Y, I should also see event Z occur in close proximity.”

Create your assertions based on your experience administering systems, writing software, configuring networks, building systems, etc., imagining systems and processes from the attacker’s eyes.
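An assertion of the form “alert X on system Y should be accompanied by event Z” can be turned into a check you can run against your logs. The sketch below is one possible form; the event fields, the ten-minute window, and the sample data are assumptions.

```python
from datetime import datetime, timedelta


def assertion_holds(events, alert_time, system, expected_event,
                    window=timedelta(minutes=10)):
    """Return True if `expected_event` occurred on `system` within `window` of the alert."""
    return any(
        e["system"] == system
        and e["event"] == expected_event
        and abs(e["time"] - alert_time) <= window
        for e in events
    )


# Hypothetical log entries.
events = [
    {"time": datetime(2022, 5, 2, 9, 4), "system": "Y", "event": "Z"},
    {"time": datetime(2022, 5, 2, 13, 30), "system": "W", "event": "Z"},
]
alert_time = datetime(2022, 5, 2, 9, 0)  # when alert X fired on system Y

print(assertion_holds(events, alert_time, "Y", "Z"))
print(assertion_holds(events, alert_time, "W", "Z"))
```

If the check fails where you expected it to pass, either the assertion was wrong or something about the incident differs from what you believed, and both outcomes are useful.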

4. Eliminate impossible events

You may not know exactly what you are looking for. On these occasions, eliminate occurrences that can be logically explained. You will then be left with the events that have no clear explanation.

For example:

  • Unexplained inconsistencies or redundancies in your code
  • Issues with accessing management functions or administrative logins
  • Unexplained changes in volume of traffic (e.g., drastic drop)
  • Unexplained changes in the content, layout, or design of your site
  • Performance problems affecting the accessibility and availability of your website

5. Take post-incident measures

Continue monitoring your systems for any unusual behavior to ensure the intruder has not returned. Watch for new incidents and conduct a post-incident review to isolate any problems experienced during the execution of the incident response plan.
