Conducting an Effective Postmortem

Wouldn’t life be great if nothing ever went wrong and technology never broke?

Unfortunately, the reality is that things break all the time.

Your product probably breaks a little bit all the time and sometimes breaks big time (hopefully, much less often).

Perhaps your website goes down for 4 hours during your busiest season and you miss $10M in potential revenue. Or you discover that no payments were processed for 2 days before anyone noticed.

In these serious to catastrophic cases, you really want to get to the bottom of why the problem occurred, with a view to making sure it never happens again.

That’s where conducting an effective Postmortem becomes vital.

Avoid Blame

The ultimate purpose of the Postmortem is to make sure the problem experienced never occurs again.

In order to do this, a Postmortem involves ferreting out root causes (more on this below). Bringing blame into the Postmortem process itself risks causing defensiveness and “ass-covering”.  This defensiveness tends to obscure the true root causes.

This is not to say that blame isn’t important or can be avoided. Perhaps someone needs to be fired.  But, keep the “blame” part separate from the Postmortem itself in order to get to the true root causes.

The Postmortem Process

I like to conduct the Postmortem process as a group, with all the stakeholders in a room in front of a whiteboard.  It’s important that everyone impacted by and/or responsible for the problem is involved and has a voice.

At a high level, my process for conducting a Postmortem is as follows:

  1. agree and define the impact of the problem to the business – e.g. “we lost $10M in potential revenue”
  2. flush out all the causes of the problem down to their root, as far as possible
  3. agree a set of recommendations aimed at ensuring the problem never occurs again

Let’s use a contrived example for illustration.  Imagine that I fell off my bike and broke my wrist.  We start on the whiteboard with the impact – i.e. I broke my wrist.

Why-Because

My preferred method to analyze causes (step #2 above) is Why-because Analysis.

Why-because is a formalized process but don’t be put off – it can be used more casually with great success and you can add rigor as you become more familiar.

Why-because essentially involves repeatedly asking “Why?” and repeatedly answering with the “because” part.  My 5-year-old son is great at this.

e.g. “Why did I fall off my bike?” “…because I hit a pothole.”
“Why did you hit a pothole?” “…because I wasn’t looking where I was going.”
“Why weren’t you looking where you were going?” “…because I was distracted.”

…you get the idea.

Why-because is similar to other processes you may be familiar with like “5 Whys”.  (I have found 5 whys to be insufficient because big problems typically have complex causes and the causal chains are often more than 5 levels deep.)

What you end up with at the end of the Why-because Analysis is a graph that shows you all the contributory causes that led to the impact on your business. More formally, when complete, the Why-because graph should include all the necessary and sufficient causes.

Continuing our example, here’s our Why-because analysis of why I broke my wrist:

Of course, you need to decide when you’ve gone deep enough and can stop asking Why? There is no hard-and-fast rule here – just use your judgement – but you don’t want to end up drilling down to “because the big bang happened” in every case.

One great thing about Why-because graphs like the one above is that you can test them to make sure they’re complete:

  • for each box on the chart, you can ask, had this not occurred, would the problem still have occurred? If the answer is no, it’s a necessary condition.
  • looking at all the boxes on the chart, you can ask, if all of these happened again, would the problem occur again? If the answer is no,  your conditions are not sufficient and you’re not done yet.
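Once the graph is written down, the two tests above can be turned into a checklist for the group to work through. Here is a minimal Python sketch, assuming the graph is stored as a plain dictionary that maps each effect to its recorded causes. The names `GRAPH`, `boxes`, `root_causes`, `chain_depth` and `review_questions` are hypothetical, and the factor "put my hand out to break the fall" is invented for illustration. The code only generates the questions; answering them still takes human judgement.

```python
from typing import Dict, List, Set

# Hypothetical Why-because graph for the bike example: each effect maps to
# the causes recorded for it. Assumed to contain no cycles.
GRAPH: Dict[str, List[str]] = {
    "broke my wrist": ["fell off my bike", "put my hand out to break the fall"],
    "fell off my bike": ["hit a pothole"],
    "hit a pothole": ["wasn't looking where I was going"],
    "wasn't looking where I was going": ["was distracted"],
}
IMPACT = "broke my wrist"


def boxes(graph: Dict[str, List[str]]) -> Set[str]:
    """Every box on the chart: all effects plus all causes."""
    found = set(graph)
    for causes in graph.values():
        found.update(causes)
    return found


def root_causes(graph: Dict[str, List[str]]) -> Set[str]:
    """Boxes with no recorded cause of their own (where we stopped asking why)."""
    return boxes(graph) - set(graph)


def chain_depth(graph: Dict[str, List[str]], box: str) -> int:
    """Number of 'why?' steps on the longest causal chain below `box`."""
    causes = graph.get(box, [])
    if not causes:
        return 0
    return 1 + max(chain_depth(graph, cause) for cause in causes)


def review_questions(graph: Dict[str, List[str]], impact: str) -> List[str]:
    """One necessity question per box, then a single sufficiency question."""
    questions = [
        f"Had '{box}' not occurred, would '{impact}' still have occurred?"
        for box in sorted(boxes(graph) - {impact})
    ]
    questions.append(
        f"If all of these happened again, would '{impact}' occur again?"
    )
    return questions


if __name__ == "__main__":
    print("Root causes:", sorted(root_causes(GRAPH)))
    print("Longest chain:", chain_depth(GRAPH, IMPACT), "whys")
    for question in review_questions(GRAPH, IMPACT):
        print("-", question)
```

For the necessity questions, the answer you want for every box is "no". For the final sufficiency question, the answer you want is "yes".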

Generally, big problems tend to have complex causes. This is because any reasonably mature organization will have checks and balances in place to avoid obvious and predictable failures.

Therefore, you will likely end up with a complex graph that includes a mixture of technical, operational and human contributory factors. It’s particularly important not to overlook or underplay the human factors since fixing the technical and operational issues alone will not avoid the problem recurring.

You can read more about Why-Because on Wikipedia.

Recommendations

The most important part of the process is to create a list of recommendations to act on, informed by the detailed understanding of the causes from the Why-because analysis.

Don’t forget the human factors – these are often the most important to address, e.g. additional training, more staff or better process.

Again, you can test your recommendations by asking: if we do all these things, are we highly likely to prevent this problem from recurring?  If the answer is no, you’ve not got the right recommendations.

Lastly, give each recommendation an owner who is responsible for taking action and be sure to follow up.
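If you track recommendations in a script or spreadsheet, two mechanical checks are easy to automate: every cause has at least one recommendation aimed at it, and every recommendation has an owner. The sketch below is one way to do that in Python. The `Recommendation` class, its fields and the sample data are all hypothetical. Coverage of causes is only a rough proxy for the real test, which is whether the group believes the actions will prevent a recurrence.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set, Tuple


@dataclass
class Recommendation:
    action: str
    addresses: Set[str]          # causes from the Why-because graph this targets
    owner: Optional[str] = None  # person responsible for taking action
    done: bool = False


def unaddressed_causes(causes: Iterable[str],
                       recs: List[Recommendation]) -> Set[str]:
    """Causes that no recommendation claims to address."""
    covered: Set[str] = set()
    for rec in recs:
        covered |= rec.addresses
    return set(causes) - covered


def unowned(recs: List[Recommendation]) -> List[str]:
    """Actions that nobody has been made responsible for."""
    return [rec.action for rec in recs if not rec.owner]


def open_items(recs: List[Recommendation]) -> List[Tuple[str, str]]:
    """(owner, action) pairs to chase at the follow-up."""
    return [(rec.owner, rec.action) for rec in recs if rec.owner and not rec.done]


# Hypothetical data for the bike example.
causes = {"was distracted", "pothole was not repaired", "wore no wrist guards"}
recs = [
    Recommendation("Put phone away while riding", {"was distracted"}, owner="me"),
    Recommendation("Report pothole to the city", {"pothole was not repaired"}),
]

if __name__ == "__main__":
    print("No recommendation yet for:", sorted(unaddressed_causes(causes, recs)))
    print("Needs an owner:", unowned(recs))
    print("Follow up on:", open_items(recs))
```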

How to Fire Someone Humanely

I’m surprised by how often I encounter someone in a leadership role who has never had to fire anyone. I suspect it’s a combination of technology companies generally having pretty flat organizations and also the tendency to have dedicated HR functions in larger companies that insulate people from any “unpleasant business”.

Whenever I’ve had to fire someone, I’ve not slept well the night before. Regardless of the reason, you are changing someone’s life and everyone deserves to be treated humanely. It definitely gets easier after you’ve done it a few times but I hope that it never becomes routine.

Here are my 10 recommendations for how to fire someone humanely:

1.  Set the tone of the conversation immediately.
It gets super-weird if you have a friendly “how are you?” conversation and then fire someone.  I normally get straight down to business and open with “I’m sorry we have to have a difficult conversation today” before the person even sits down.

2.  Do not use the word “fire”.
The word “fire” creates an emotional reaction – to “fire” someone implies an active act of aggression.  I generally say “this will be your last day”.  That way, it’s not a value judgment or a process – it’s just a fact.

3.  Keep it short.
There is no point in dragging it out.  But, you should at least give the person a short and true reason as to why it’s happening, unless there is a good legal reason not to.

4. Do not get drawn into arguments.  
If someone wants to argue, be clear that this is the decision, it’s already made, that you understand they are angry but that it’s not beneficial for either side to drag it out.

5. Have your paperwork in order and be aware of the local employment law.
For example, here in CA, it’s the law that you have to give an employee their final pay in the form of a check when they leave.

6. Say you’re sorry it didn’t work out.
It helps humanize you in the process. It might seem trite or hollow to say “sorry” but, on balance, I believe it helps soften the blow.

7. Don’t fire someone on a Friday.
You’ll read conflicting recommendations in this regard but I am strongly in the camp that says firing someone on a Friday is a Bad Idea(tm).  It means they are likely to just seethe about it all weekend.  If you fire them during the week, they are more likely to focus on finding another job.

8. Have others ready to cut the cord.
You will need to confide in a small group of people ahead of time so they are cued up to terminate account access, etc. as soon as the person leaves the room. Create a list of accounts ahead of time so it is not a rush and nothing gets missed.
You may be tempted to think something like “I’ll just leave Dave’s email on until the end of the day”. Resist this temptation at all costs. Although this may seem counter to the “being humane” approach, the potential benefits in terms of seeming more humane and feeling better are massively outweighed by the downside risk of the person doing something stupid.

9. Treat the person with respect and try to make it as comfortable as possible.
If possible, try to make sure that their teammates are not around to gawp as the person clears their desk, etc. This will again require the confidence of someone you trust.

10. It should not be a surprise.
Firing someone should be the culmination of a process of clear and honest communication over weeks if not months.  If you are a good manager, the person on the receiving end should be clear on the gap between what is expected of them and their performance.
With the exception of gross misconduct, if there hasn’t been such a dialogue, you probably shouldn’t be firing the person.