Ep 18 - Handling Incidents and Outages


2022-02-07
What do we tell people when things go wrong in our organisations? This week, there have been a couple of write-ups of recent high-profile outages at Roblox and Mozilla, which - when paired with the well-documented outage at Facebook that we discussed last season - give us a fascinating glimpse into other companies' incident processes, on-call rotas and war rooms. Sanj, Gwen and Neil share their surprising love of being knee-deep in an incident, bringing some of their own recent experiences to the podcast.

TIMESTAMPS:
00:00 Start
01:28 The Stand-Up
06:48 Social Engineering
08:25 This Week's Epic
25:44 The Wash-Up

In our workplace updates, there's lots of hiring, lots of shipping new features, everybody tries to coax Sanj into management, and Neil totally isn't doing any money laundering.

LINKS DISCUSSED THIS WEEK:

Elucidat careers page

YouTube: Ozark Season 1 Trailer

LeadDev

Glean careers page

Greg McKeown - Effortless

Facebook Engineering: Update about the October 4th outage

Roblox Return to Service 10/28-10/31 2021

Mozilla Hacks: Retrospective and Technical Details on the recent Firefox Outage

Vox: Pokémon Go launched in 26 countries, and then its servers crashed

Down Detector

Down For Everyone Or Just Me

Copyright © techteamweekly.com 2022.