A story about blameless post-mortems
A told B and B told C, “I’ll meet you at the top of the coconut tree.” “Whee!” said D to E F G, “I’ll beat you to the top of the coconut tree.” Chicka chicka boom boom! Will there be enough room?
You just launched a Hot new product or feature. People are Interested, and you Just got a call from Kyle in Legal. A digital Media News Outlet wants to Publish a story, and Kyle wants to make sure you’re Qualified to take the interview. Your marketing team just Released a stellar Strategy and it’s Taking off like a rocket. Usage is up and the team is feeling Victorious.
Still more — W! and X Y Z! The whole alphabet up the — Oh, no! Chicka Chicka… BOOM! BOOM!
Your servers can’t handle the load. Pagers are going off everywhere. Customer success is rattled, the executives want to know when the system will be back online, and finance is counting every penny of lost revenue while you get the incident under control. How will you and your team react?
Here are a few tips from Chicka Chicka Boom Boom, a children’s book by Bill Martin Jr. and John Archambault, illustrated by Lois Ehlert.
Skit skat skoodle doot. Flip flop flee.
The mama and papa letters in Chicka Chicka Boom Boom don’t panic. They move urgently but confidently to help their little dears. Having plans and procedures in place before a big launch lets you respond quickly while still maintaining a semblance of business as usual.
As part of your launch planning process, invite a cross-functional team to imagine all the reasons your project could turn into a miserable failure. Then figure out how you can prevent those problems now, while there’s still time. This practice is called a pre-mortem exercise, and there are lots of templates for how to run one effectively.
Identify what monitoring you’ll need in place to give you early warning before things go BOOM! Set up PagerDuty to promptly notify key individuals on the team when they do, and maintain playbooks with step-by-step instructions so anyone can spin up more servers, reboot a database, shut down a security threat, or perform other common troubleshooting actions.
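If it helps to picture what an early warning system looks like, here is a minimal sketch in Python. It is only an illustration: the metric names, the threshold values, and the notify callback are all made up for this example, and it does not call PagerDuty or any real monitoring service. The idea is simply that each metric gets a "warn" level that gives you time to act and a "page" level that wakes someone up.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Threshold:
    """Warning and paging levels for one metric (hypothetical values)."""
    warn: float
    page: float


# Hypothetical metric names and limits; tune these to your own system.
THRESHOLDS: Dict[str, Threshold] = {
    "cpu_percent": Threshold(warn=70.0, page=90.0),
    "error_rate_percent": Threshold(warn=1.0, page=5.0),
    "p95_latency_ms": Threshold(warn=500.0, page=1500.0),
}


def evaluate(metrics: Dict[str, float]) -> List[Tuple[str, str, float]]:
    """Return (severity, metric, value) for every metric at or over a threshold."""
    alerts = []
    for name, value in metrics.items():
        limit = THRESHOLDS.get(name)
        if limit is None:
            continue  # No threshold defined, so nothing to compare against.
        if value >= limit.page:
            alerts.append(("page", name, value))
        elif value >= limit.warn:
            alerts.append(("warn", name, value))
    return alerts


def dispatch(alerts: List[Tuple[str, str, float]],
             notify: Callable[[str], None]) -> None:
    """Send each alert through a caller-supplied notifier (assumed, not a real API)."""
    for severity, name, value in alerts:
        notify(f"[{severity.upper()}] {name} = {value}")


if __name__ == "__main__":
    # Sample readings invented for the demo; print stands in for a real pager.
    sample = {"cpu_percent": 93.5, "error_rate_percent": 0.4, "p95_latency_ms": 620.0}
    dispatch(evaluate(sample), notify=print)
```

In a real setup, the notify function would hand the message to whatever alerting tool your team uses, and each "page" alert would ideally point to the playbook that covers it.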