One of the more persistent mistakes I see in cyber security is the belief that incident response begins when the alarm goes off. It does not. By the time an organisation is scrambling to decide who is in charge, which system matters most, whether legal needs to be called, or how to speak to customers, the incident is already winning.
I typically divide incident response into three components: governance, which covers regulatory compliance and adherence to published guidance; planning and preparation, which covers incident response plans, playbooks, standard operating procedures (SOPs), and after-action reporting; and training and rehearsals, which cover tabletop exercises, no-notice alert and response drills, and professional development activities.

When people hear the word “governance”, they often seek shelter, roll their eyes, or reluctantly embrace the associated duties and responsibilities. I understand that the word is not especially glamorous, and it rarely gets the same attention as threat feeds, tooling, or detection engineering. But governance is what gives incident response legitimacy, authority, and structure. It is where an organisation decides, in advance, how cyber incidents will be managed, who has decision rights, what regulatory and notification obligations apply, what risk thresholds trigger escalation, and how incident response fits with business continuity, crisis management, legal review, and executive reporting. That is not theoretical paperwork. It is command and control. Public guidance from the Australian Cyber Security Centre (ACSC) makes exactly this point by treating response planning as part of an organisation’s broader policy, strategy, recovery, review, and improvement posture. NIST’s updated incident response guidance now frames preparation through the wider functions of govern, identify, and protect, rather than treating response as an isolated technical activity. (Cyber Security Australia)
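To make "risk thresholds trigger escalation" concrete, here is a minimal sketch of an escalation matrix written down before the incident rather than argued over during it. The severity levels, role names, and thresholds are all my own illustrative assumptions; they are not taken from ACSC or NIST, and a real organisation would set them through its own governance process.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


@dataclass(frozen=True)
class EscalationRule:
    min_severity: Severity   # rule applies at this severity and above
    notify: tuple[str, ...]  # roles to bring in (hypothetical role names)


# Hypothetical thresholds, agreed in advance by governance.
ESCALATION_RULES = (
    EscalationRule(Severity.LOW, ("soc_analyst",)),
    EscalationRule(Severity.MEDIUM, ("incident_manager",)),
    EscalationRule(Severity.HIGH, ("ciso", "legal_counsel")),
    EscalationRule(Severity.CRITICAL, ("executive_crisis_team", "communications_lead")),
)


def roles_to_notify(severity: Severity) -> list[str]:
    """Return every role whose threshold is at or below the given severity."""
    roles: list[str] = []
    for rule in ESCALATION_RULES:
        if severity >= rule.min_severity:
            roles.extend(rule.notify)
    return roles


if __name__ == "__main__":
    # A HIGH incident pulls in everyone up to and including the CISO and legal.
    print(roles_to_notify(Severity.HIGH))
```

The point of the sketch is not the code itself but the fact that the answer to "who gets called at this severity?" is a lookup, not a debate.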
This is also where too many organisations fall short. They may have security tooling. They may even have a nominal incident response plan. But under pressure, they discover that the plan does not align with the operating environment, has not been exercised, does not reflect outsourced providers, and does not account for internal and external communications, regulator engagement, or evidence handling. Carnegie Mellon’s public incident response plan is instructive here because it explicitly ties response to authority, policy, data classification, stakeholders, legal coordination, evidence preservation, and reporting requirements. In other words, it treats incident response as an enterprise function, not a help desk escalation path. (Carnegie Mellon University)
The second component is planning and preparation. This is the part many executives think they have covered because there is a PDF somewhere on the intranet. As the military saying goes, “no plan survives first contact with the enemy”; in cyber security, corporate confidence rarely survives first contact with a real intrusion.
A credible incident response capability needs more than a high-level plan. It needs an incident response plan, incident-specific playbooks, and supporting standard operating procedures. The plan should define the framework, escalation logic, communications architecture, and decision-making model, while the playbooks should provide stepwise guidance for common scenarios such as phishing, ransomware, malware, denial of service, and data breaches. SOPs then translate those expectations into repeatable operational actions for analysts, infrastructure teams, legal, communications, and leadership.
ACSC’s practitioner guidance is explicit on this point: organisations should maintain a CIRP, develop specific playbooks for common incidents, document relevant SOPs, assign roles and responsibilities, and ensure even hard-copy versions are accessible if systems fail. Microsoft and AWS make the same point from an enterprise operations perspective, stressing RACI-style ownership, communication plans, incident definitions, escalation paths, and out-of-band communications as core planning artefacts rather than optional extras. (Cyber Security Australia)
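As a rough illustration of how a playbook, its role assignments, and a hard-copy fallback fit together, the sketch below models a playbook as plain data. The class names, the ransomware steps, and the role names are hypothetical examples of mine, and the responsible/accountable split is only a simplified slice of a RACI matrix, not a format prescribed by ACSC, Microsoft, or AWS.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    action: str
    responsible: str   # role that performs the action
    accountable: str   # role that owns the decision


@dataclass
class Playbook:
    scenario: str
    steps: list[Step] = field(default_factory=list)

    def steps_for(self, role: str) -> list[str]:
        """Actions a given role performs or is accountable for."""
        return [s.action for s in self.steps if role in (s.responsible, s.accountable)]

    def to_text(self) -> str:
        """Plain-text rendering that can be printed and kept offline."""
        lines = [f"PLAYBOOK: {self.scenario}"]
        for n, s in enumerate(self.steps, start=1):
            lines.append(f"{n}. {s.action} (R: {s.responsible}, A: {s.accountable})")
        return "\n".join(lines)


# Hypothetical example content, for illustration only.
ransomware = Playbook(
    scenario="ransomware",
    steps=[
        Step("Isolate affected hosts", "soc_analyst", "incident_manager"),
        Step("Preserve disk and memory evidence", "forensics_lead", "incident_manager"),
        Step("Approve external notification", "legal_counsel", "ciso"),
    ],
)

if __name__ == "__main__":
    print(ransomware.to_text())
```

Keeping the playbook as structured data means the same source can answer "what is my job in this scenario?" for any role and can be rendered to paper for the day the intranet is the thing that is down.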
Why does this matter? Because a mature plan minimises damage. It reduces delay in triage, containment, and decision-making. It helps the organisation distinguish noise from real compromise. It clarifies who can isolate hosts, who approves external notifications, who preserves evidence, and who decides whether business disruption is acceptable in order to stop adversary movement. Done properly, it also accelerates recovery because the organisation is not improvising under stress. The sequence of detection, response, recovery, and improvement is already understood, and teams can act with precision instead of panic. That is consistent with both NIST’s current framing and long-standing industry practice: incident response is most effective when preparation reduces uncertainty before the event.
This is where after-action reporting becomes indispensable. Too many organisations treat the end of containment as the end of the incident. It is not. Post-incident review is where maturity is built. It is where teams document what happened, what decisions were made, what worked, what failed, what dependencies were exposed, and what needs to change in governance, architecture, logging, vendor management, communications, and executive decision-making. AWS explicitly recommends after-action reports that identify strengths, improvement areas, and capability gaps so progress can be tracked over time.
NIST goes further and argues that lessons learned should be fed back into improvement continuously, not held in reserve until long after recovery is complete. That is exactly right. If you only learn after the dust settles, you are learning too slowly. (AWS Documentation)
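One way to keep after-action findings from quietly expiring is to track them like any other work item. The sketch below is a minimal, assumed structure: the three finding kinds mirror the strengths, improvement areas, and capability gaps mentioned above, but the field names, the sample findings, and the dates are invented for illustration and are not drawn from AWS or NIST.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Finding:
    description: str
    kind: str            # "strength", "improvement", or "gap"
    owner: str           # hypothetical role name
    due: date
    closed: bool = False


def overdue(findings: list[Finding], today: date) -> list[Finding]:
    """Open improvement items and gaps whose due date has passed."""
    return [
        f for f in findings
        if f.kind != "strength" and not f.closed and f.due < today
    ]


# Invented sample data for illustration.
findings = [
    Finding("Out-of-band comms channel worked", "strength", "incident_manager", date(2025, 1, 31)),
    Finding("Vendor contact list was out of date", "improvement", "vendor_manager", date(2025, 2, 28)),
    Finding("No memory capture tooling on servers", "gap", "forensics_lead", date(2025, 6, 30)),
]

if __name__ == "__main__":
    for f in overdue(findings, today=date(2025, 3, 15)):
        print(f"OVERDUE: {f.description} (owner: {f.owner})")
```

Reviewing the overdue list at a regular cadence is what turns a post-incident report into continuous improvement rather than a filed document.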
My third component is training and rehearsal. This is the point at which incident response stops being a compliance artefact and starts becoming a capability.
A plan that has never been rehearsed is largely aspirational. Tabletop exercises, no-notice alerting drills, technical simulations, and professional development briefings are how organisations find the gaps they would otherwise discover during a breach. AWS’s incident response guidance is useful here because it describes tabletop exercises as discussion-based sessions for stakeholders to practise roles, responsibilities, and communications using established playbooks, and it recommends regular simulations that progress in complexity over time. NIST similarly emphasises that exercises and tests prepare staff and third parties for future response activities, while improvements should be identified from operational execution and lessons learned. ACSC guidance likewise recommends that plans and SOPs be reviewed or tested in exercises so personnel understand their responsibilities. (AWS Documentation)
As with live incidents, after-action reviews of performance, successes, and failures are critical and should be included in any “hotwash” an organisation conducts.
There is also a human reality here that many technical discussions miss. Incident response is not simply a workflow. It is a sociotechnical event. People are stressed, tired, overloaded, and often operating with imperfect information. Communication degrades. Decision-making narrows. Assumptions harden. That is why clear roles and responsibilities matter so much. It is why rehearsals matter. It is why communications leads, legal counsel, executives, infrastructure owners, and technical responders all need to know their place before the crisis starts. Microsoft’s public incident response material makes this point clearly by assigning dedicated governance, regulatory, investigation, infrastructure, and communications leads, and by stressing controlled messaging during service disruption. Real incidents are messy. Mature organisations prepare for that mess rather than pretending it will not happen. (Microsoft)
The practical lesson is straightforward. If your incident response capability is built only around technology, it is incomplete. Effective response rests on three pillars: governance that establishes authority and compliance alignment; planning that converts intent into plans, playbooks, and SOPs; and training that proves the organisation can actually execute under pressure.
That is what minimises damage. That is what enables faster recovery. And that is what gives people clarity in a crisis. Anything less is not incident response. It is hopeful improvisation.
Too many organisations mistake document possession for operational readiness. In incident response, that illusion tends to collapse at exactly the wrong moment.
References: ACSC, Cyber security incident response planning: Practitioner guidance; NIST SP 800-61 Rev. 3; Carnegie Mellon University, Computer Security Incident Response Plan; AWS Security Incident Response Guide; Microsoft incident response planning and playbook materials.