Cause of Error

1. Summary

The summary is the first section of a CoE document, and should be no longer than a few paragraphs.

It should define, in as simple terms as possible, the broad scope of the error and failure that occurred. It should not define solutions, causes or actions - and it must be in a form that all stakeholders, including those not directly involved in the root-cause analysis, can understand and appreciate.

Imagine a CoE that starts with one of the following first sentences; the sentence can also be used as the Subject of an email:

Customers were unable to visit our site for 2 hours

(1 paragraph)

or

A customer saw another person's details and complained to a regulator

(2 paragraphs)

or

Postcode requests failed for 7 hours, blocking direct debit payments

(1 paragraph)

or

Law enforcement enquiry identified pages and patterns used by terrorists

(2.5 paragraphs)

or

Customers saw the wrong price on 21 high-value items for 4 hours

(1 paragraph)

or

A robot stopped production for 9 hours

(1 paragraph)

Add more.

Everyone should be able to understand the summary. Be brief and to the point: describe, do not diagnose.

The CoE Summary should be short. Three paragraphs is the maximum, and if the event can be summarised in a sentence or two, that is better.
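The length guidance can be checked mechanically. The following is a minimal sketch, assuming the summary is held as plain text with blank lines between paragraphs; the names `MAX_PARAGRAPHS`, `count_paragraphs` and `summary_is_short_enough` are illustrative and not part of the CoE method itself.

```python
import re

# Assumption: "a paragraph or three" is read as a hard limit of three paragraphs.
MAX_PARAGRAPHS = 3


def count_paragraphs(summary: str) -> int:
    """Count blocks of text separated by one or more blank lines."""
    blocks = [b for b in re.split(r"\n\s*\n", summary.strip()) if b.strip()]
    return len(blocks)


def summary_is_short_enough(summary: str, limit: int = MAX_PARAGRAPHS) -> bool:
    """Return True when the summary has no more than `limit` paragraphs."""
    return count_paragraphs(summary) <= limit
```

A check like this only covers length. Whether the summary is free of jargon and understandable to an outsider still needs a human reader.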

Take care that the Summary is easy to read and accessible to all. It should contain no jargon or business-specific language and, as far as possible, should avoid references to internal systems or processes that an outsider would not easily understand.

An external entity, or a customer, should be able to read the summary and understand the context of what is being described.

Explain it like I am five. Everyone should be able to understand this section, and if need be, go no further.

Produce a single document with seven sections:

1. Summary

A simple description of what happened.

2. Customer Impact

Describe the issue from the point of view of our customers. What did they see?

3. Security Impact

Was any system, data or privacy breached?

4. Timeline

Who did what and when, and when the problem was resolved.

5. Five Whys

Keep asking Why until you have a root cause. Dissect or deconstruct at every stage.

6. Lessons Learned

What did we learn from this problem?

7. Next Actions

Given the things we learned, what will we do next about this?
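As one possible starting point, the seven sections listed above can be generated as an empty skeleton for authors to fill in. This is a sketch only: the function name `coe_skeleton` and the plain-text layout are assumptions, and the prompts are taken from the section descriptions above.

```python
# Section names and prompts, in the order given in this document.
SECTIONS = [
    ("Summary", "A simple description of what happened."),
    ("Customer Impact", "Describe the issue from the point of view of our customers. What did they see?"),
    ("Security Impact", "Was any system, data or privacy breached?"),
    ("Timeline", "Who did what and when, and when the problem was resolved."),
    ("Five Whys", "Keep asking Why until you have a root cause."),
    ("Lessons Learned", "What did we learn from this problem?"),
    ("Next Actions", "Given the things we learned, what will we do next about this?"),
]


def coe_skeleton(title: str) -> str:
    """Build a plain-text CoE skeleton: a title line, then numbered sections."""
    lines = [title, ""]
    for number, (name, prompt) in enumerate(SECTIONS, start=1):
        # Each section is a numbered heading followed by its guiding prompt.
        lines += [f"{number}. {name}", "", prompt, ""]
    return "\n".join(lines)


if __name__ == "__main__":
    # The title here is a placeholder; use the one-sentence summary of the event.
    print(coe_skeleton("Customers were unable to visit our site for 2 hours"))
```

Using the one-sentence summary as the title keeps the document consistent with the email Subject suggested earlier.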

Implementation Notes

How to implement this method in practice.

v0.1 22/01/22