By John Egan
We need to flip this thinking around and view incidents as everyday events to be embraced: events we can learn from, and that ultimately contribute to the long-term resilience of the company.
In a report from the analyst firm Intellyx titled "Modern Incident Management: Evolving Towards a More Positive Incident Culture," principal analyst Jason English writes:
For the longest time, it seemed like the primary metric of incident management was measuring the MTTR (the mean time to resolution) it took to find and fix something. Now the market is evolving, and perhaps it’s time the ‘mean-time’ metrics are replaced by a kinder, more positive incident culture. We must lean more than ever on digital collaboration with remote co-workers and flexible team structures to support fast-changing software and cloud infrastructures. Success in this new environment will be measured by improvement in our overall organizational resilience — the ability to learn from mistakes, and to bounce back faster and better over time.
So who is actually good at this? In tech, we like to think we are always ahead of the curve, when in reality we are playing catch-up to other industries. The aviation industry is the crucible from which modern incident management grew. After a series of crashes in the late 1980s and early 1990s, the industry made deliberate decisions to document as many incidents as possible and to distribute the learnings from those incidents as widely as possible.