Wednesday, June 17, 2009

FACTORING LOW PROBABILITY, HIGH CONSEQUENCE ADVERSE EVENTS

One in a million. It will never happen.
When I hear this assurance in professional circles, the hair on the back of my neck rises. An 'assessment' like this is usually an indicator that someone, somewhere has not done their homework, and that we are leaving the world of facts and figures and entering the wild west world of chaos, packing a six-shooter loaded for Russian roulette and kicking in the door of the chance saloon.

Risk analysis can be as complex or as simple as you want. Fundamentally, the question you are answering is 'Can I sustain the loss due to an adverse event?'
Where it gets interesting is when cold reason is replaced by the naive response 'it will never happen'.
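The question can be put as a small sketch. All names and figures below are hypothetical, chosen only to illustrate the point: two events can carry the same expected annual loss while only one of them is survivable, which is why a low likelihood alone does not answer the question.

```python
from dataclasses import dataclass


@dataclass
class AdverseEvent:
    name: str
    annual_probability: float  # assumed likelihood per year (illustrative)
    loss: float                # assumed cost if the event occurs (illustrative)


def expected_annual_loss(event: AdverseEvent) -> float:
    """Likelihood multiplied by consequence."""
    return event.annual_probability * event.loss


def is_survivable(event: AdverseEvent, sustainable_loss: float) -> bool:
    """Can the business absorb the full loss if the event actually happens?"""
    return event.loss <= sustainable_loss


# Hypothetical figure: the largest single loss the business could absorb.
sustainable_loss = 2_000_000

events = [
    AdverseEvent("server failure", 0.2, 50_000),
    AdverseEvent("loss of primary site", 0.001, 10_000_000),
]

for event in events:
    print(
        f"{event.name}: expected annual loss "
        f"{expected_annual_loss(event):,.0f}, "
        f"survivable: {is_survivable(event, sustainable_loss)}"
    )
```

Both events average out to the same yearly figure, yet only the first can be sustained if it occurs.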

What is the benefit of designing your business system for an asteroid strike?
An event whose target-area effects are described as 'uneven ground' hardly bears consideration. Leaving aside the psychological paralysis of even considering such a 'high consequence' event, the cost of designing such a system exceeds its utility. What is the point of having a system survive an event when its users are dead or distracted by such basics as personal survival?
This is our first 'hard' limit on adverse event preparedness. The means of mitigation or avoidance should be such as can reasonably be expected to match 'community' expectations (be it the user community or the wider one) for a similar system.
What does this mean in practice?
For example, the 000 (911) first responder communication system is expected, by design, to survive any misadventure up to and including moderate civil disturbance.
Beyond that point the utility of the system begins to diminish along with the ability of first responders to act, since they rely on auxiliaries such as roads, hospitals and fire stations still being operational.
Translated to more everyday business life, a payroll system is expected to remain operational for as long as there are employees in the corporate body.

It is perhaps in this light that the first cracks appear in the corporate approach to high consequence, low likelihood events.
The first flaw is 'acceptance'. I have encountered entities that, though obligated by law to maintain a certain level of resilience built into the system, gamble on the low likelihood of the event, accepting the adverse consequences of being unprepared, unlikely as they are. That is at best a conscious decision.

The more common flaw is different: the 'blind spot', or ignorance by choice.
One of the assignments that I have completed entailed preparing a business continuity plan for a moderately sized enterprise, with a focus on IT disaster recovery.
IT systems are typically singled out for business continuity above all else.
This specific enterprise showed an almost belligerent desire not to include pandemic workforce planning. In its view, this was simply an event that would never occur.
We may have a warm site, but if there are no warm bodies to run it, what is the use?

There are formulas in risk management that allow you to calculate a risk to the fourth decimal place. So how do we determine, pragmatically, the most appropriate risk posture for a low likelihood, high consequence event?

First: An assertion of 'one in a million' typically implies some sort of assessment methodology in which the incidence of the failure is far smaller than the error rate of the methodology itself. For example, one in a million means a likelihood of 0.0001% (1.0 × 10^-6). Even a precise methodology will rarely claim a measurement error better than 1%. So such a claim means that our measurement error is significantly larger than the likelihood of the event itself. It is akin to driving a truck through the eye of a needle at 150 km/h. This realisation is one of those AHA! moments.
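The arithmetic behind that comparison can be checked in a few lines. This is only a sketch: the 1% error figure is the illustrative one used above, and the variable and function names are made up for the example.

```python
# Illustrative figures only, taken from the paragraph above.
claimed_probability = 1 / 1_000_000  # "one in a million"
methodology_error = 0.01             # 1% measurement error


def error_to_claim_ratio(claimed: float, error: float) -> float:
    """How many times larger the measurement error is than the claimed likelihood."""
    return error / claimed


ratio = error_to_claim_ratio(claimed_probability, methodology_error)

print(f"Claimed likelihood as a percentage: {claimed_probability:.4%}")
print(f"Measurement error is roughly {ratio:,.0f} times the claimed likelihood")
```

With these figures the error margin is about ten thousand times the size of the number being asserted, so the 'one in a million' claim cannot be distinguished from a far more likely event.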

Second: Remember that any assessment that rates an event as unlikely is probably an opinion rather than something based on solid numbers. If dinosaurs had been sentient, they would probably have found the idea of rocks falling from the sky laughable.

Third: Any solution to mitigate or accept a rare catastrophic event must be in accordance with the reasonable standards of the 'community' of users. Unfortunately, some executives read this to mean that if no other CEOs are planning for disaster, then they do not need to either.

It is often asserted that 80% of businesses fail within a year of a disaster if no disaster plan is in place. So what is a good operator to do?
Do we ride without a helmet, knowing that if we crash we die?

The answer is, we mitigate the risks we can.
Call Mr. Wulf at NOMEONESTIQ to talk about how we can help you.
