The Postmortem Meeting
Purpose
After you have completed the written postmortem, follow up with a meeting to discuss the incident. The purpose of this meeting is to deepen the postmortem analysis through direct communication and to get buy-in for action items. The asynchronous production of the written postmortem helps the team start learning from the incident, but having a conversation leads to deeper learning. Furthermore, having a meeting scheduled to discuss the written postmortem creates accountability for the postmortem to be completed in a timely manner. Using this time to discuss action items also helps ensure that those tasks will be completed.
An anti-pattern for the postmortem meeting is to be overly focused on the immediate concerns documented in the written postmortem. Avoid filling the meeting time by simply reading through each section of the document. The best use of this time is to take a step back from the detailed analysis to better understand the systemic factors that led to the incident.
Some teams make use of the Retrospective Prime Directive to set the tone for the meeting and serve as a regular reminder of the goals. It can be a helpful tool to anchor the discussion and provide a clean slate to start a retrospective, postmortem, or post-incident review.
"Regardless of what we discover, we understand and truly believe that everyone did the best job they could, given what they knew at the time, their skills and abilities, the resources available, and the situation at hand." --Norm Kerth, Project Retrospectives: A Handbook for Team Reviews
The most important outcome of the postmortem meeting is buy-in for the action plan. This is an opportunity to discuss proposed action items, brainstorm other options, and gain consensus among team leadership. Sometimes the ROI of a proposed action item is not great enough to justify the work, or action items must be delayed in favor of other priorities. The postmortem meeting is the time to discuss these difficult decisions and make clear what work will and will not be done, as well as the expected implications of those choices.
Whereas the written postmortem is intended to be shared widely in the organization, the primary audience for the postmortem meeting is the teams directly involved with the incident. This meeting gives the team a chance to align on what happened, what to do about it, and how they will communicate about the incident to internal and external stakeholders.
Tip
Send a link to the postmortem document to meeting attendees 24 hours before the meeting. The document does not need to be complete when you send it, as long as it is finished before the meeting starts. Even an incomplete draft is worth sending in advance so attendees can start reading.
This will help you avoid wasting time in the meeting simply reading through the document. Remember the purpose of the meeting is to have an in-depth conversation about what caused the incident and how to prevent it in the future, not to review the document. The postmortem meeting is also an opportunity to clarify any questions about what happened and what the team plans to do to prevent it from happening again. Encourage attendees to ask any and all questions to help everyone get on the same page and help the team consider new perspectives for their analysis.
Agenda
Here is a sample agenda for the meeting:
- Postmortem owner summarizes incident causes and timeline, and leads discussion:
  - What were the larger cultural and structural factors that led to the incident? How did we get here?
- Postmortem owner summarizes proposed follow-up action items, and leads discussion:
  - Is the team confident this plan will reduce the likelihood of this incident recurring?
  - What more or different work might be needed?
  - Will team leadership (Engineering Manager, Product Manager, Tech Lead, etc.) commit to prioritizing these action items?
- Postmortem owner (or company Incident Manager, if present) summarizes customer impact.
  - Provide any new context about customer reaction to the incident.
Who Participates
The postmortem owner invites the following people to the postmortem meeting. Below is more detail about the role each plays in the discussion.
- Always
  - Service owners and other key engineers involved in the incident.
    - On-call service owners and other engineers who responded to the incident are the experts on the affected services. During the postmortem meeting they can provide historical context about how the systems were built, cultural context about what was happening with the team leading up to the incident, and proposals for what work would reduce the likelihood of this incident recurring.
    - Productive postmortem discussions include engineers with in-depth knowledge of the part of the system that their team owns. If the engineers who responded to the incident are newer to the team, it helps to include more experienced engineers from their team in the postmortem meeting.
  - Engineering manager for impacted systems.
    - The manager responsible for the teams that responded to the incident attends the postmortem meeting to inform their staffing and technical investment decisions.
  - Product manager for impacted systems.
    - Product managers attend postmortem meetings to understand the effect incidents have on their customers' experience. For postmortem action items to be prioritized and completed, it is critical to engage product managers in this discussion of the importance and scope of proposed follow-up tasks.
- Optional (only for major incidents)
  - Company Incident Manager.
    - The incident manager can speak to customers' reactions to the incident. They need to understand the team's decision on action items so they can finalize and send external messaging, if needed.