Thursday, June 25, 2009

EMAIL EXCHANGE PATTERNS CAN PREDICT IMPENDING CORPORATE DISASTER

Would it not be nice to see the cracks in the dam wall of a corporation about to undergo a catastrophic collapse?
A cybernetic crystal ball, if you will.

As an organisation nears a major corporate catastrophe, there are certainly some indications of it: indicators available to some, but not all, of the personnel. An astute Director or CEO, given experience, an eagle eye and high-level oversight, might well be in a position to read them.
Financial reports, revenues and exposures, the external climate (both atmospheric and political), coffee room conversation... There are, however, many ifs in the above telltales.

Would it not be nice if there was a dashboard indicator;
an 'All is well -----+----- Run for the hills' telltale. Well, it appears that some innovative research may actually give us the capability to do just that, using an unlikely tool: email (e.g. Exchange) messaging logs.

Ronaldo Menezes, an associate professor whose research interests include complex networks, recently presented at the International Workshop on Complex Networks. In summary, Dr Menezes analysed the MS-Exchange logs of Enron's top ~150 staffers (the deciders) over the 18 months immediately before its collapse.

The scientific method (Karl Popper's take on it) calls for postulating a hypothesis and then attempting to disprove it. Dr Menezes was expecting to find a burst in communications after each crisis event; what he found astounded him. It appears that prior to the key disaster events, communications increased by almost 90%, and furthermore became clustered. Key personnel split into 'cliques': an e-space equivalent of huddled groups whispering amongst themselves and falling silent when a third party nears. A good analogy, since the increase in communication within each clique was matched by a corresponding fall in communication with everyone else.

So what does that mean?
First, the data is content independent: one does not need to search the actual body of the email, merely the list of addresses. A complex social network can be established from that data alone. That information is valuable in itself and can be used, for example, for fraud detection via link analysis.

This metadata, notable only for its anomalous frequency, can then be used to predict a crisis scenario.
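The address-list network described above can be sketched in a few lines of Python. This is a minimal, stdlib-only illustration with made-up names and log entries (real Exchange logs would need parsing first); it builds a contact graph from (sender, recipient) pairs alone and counts triangles, the smallest 'huddled groups'.

```python
# Build a contact graph from email log metadata -- sender/recipient
# pairs only, no message bodies -- and find the tight groups.
# All names and entries below are hypothetical.
from collections import defaultdict
from itertools import combinations

log_entries = [
    ("alice", "bob"), ("alice", "carol"), ("bob", "carol"),  # one huddle
    ("dave", "erin"), ("erin", "frank"), ("dave", "frank"),  # another
    ("carol", "dave"),                                       # a weak bridge
]

# Undirected adjacency: who has exchanged email with whom.
adj = defaultdict(set)
for sender, recipient in log_entries:
    adj[sender].add(recipient)
    adj[recipient].add(sender)

# A triangle (3-clique) exists when three people all email each other.
people = sorted(adj)
triangles = [t for t in combinations(people, 3)
             if all(b in adj[a] for a, b in combinations(t, 2))]
print(triangles)
```

Run on these toy logs, the two huddles show up as separate triangles and the lone carol-dave bridge joins none, which is exactly the clique-formation signature described above, recovered without reading a single message body.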

In practice, it is then possible to construct an add-on for MS-Exchange or other commercial mail packages in which this algorithm is used as an early warning sign. Of course, certain caveats must be considered.
  • First, the baseline must be established so that we know the ordinary ebbs and flows of email volumes.
  • Second, a note of caution: whilst a good indicator of an anomalous condition, it may be imprudent to start ringing alarm bells based purely on this single indicator.
A more appropriate response would be to heighten the Corporate Governance posture.
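The first caveat, establishing a baseline before flagging anything, might be sketched as follows. The weekly volumes and the 3-sigma threshold are invented for illustration; a production add-on would tune both against its own traffic.

```python
# Flag a period only when message volume deviates strongly from an
# established baseline. Volumes below are hypothetical weekly counts.
from statistics import mean, stdev

weekly_volumes = [980, 1010, 995, 1020, 990, 1005, 1900]  # last week spikes

baseline, latest = weekly_volumes[:-1], weekly_volumes[-1]
mu, sigma = mean(baseline), stdev(baseline)
z = (latest - mu) / sigma  # how many standard deviations off-baseline

# Past ~3 sigma: heighten the governance posture, don't ring alarm bells.
if z > 3:
    print(f"anomalous volume (z = {z:.1f}): review governance posture")
```

The point of the z-score is precisely the caveat above: the same raw count that is alarming against one baseline is ordinary against another, so the ebbs and flows must be measured first.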

Whilst this late in the game the prevention of a disaster may no longer be possible, steps can certainly be taken to soften the blow, if only by preparing the appropriate PR response that is so essential in handling many large corporate wrecks.

Call Mr. Wulf at NOMEONESTIQ to discuss how we can help you.

Wednesday, June 17, 2009

FACTORING LOW PROBABILITY, HIGH CONSEQUENCE ADVERSE EVENTS

One in a million. It will never happen.
When I hear this assurance in professional circles, the hair on the back of my neck rises. An 'assessment' like this is usually an indicator that someone, somewhere has not done their homework; that we are leaving the world of facts and figures and entering the wild-west world of chaos, packing a six-shooter loaded for Russian roulette and kicking in the door of the last chance saloon.

Risk analysis can be as complex or as simple as you want. Fundamentally, the question you are answering is: 'Can I sustain the loss due to an adverse event?'
Where it gets interesting is where the realm of cold reason is replaced by the naive response of 'it will never happen'.

What is the benefit of designing your business system for an asteroid strike?
An event whose effects on the target area are described as 'uneven ground' hardly bears consideration. Beyond the psychological paralysis of even contemplating such a 'high consequence' event, the cost of designing such a system exceeds its utility. What is the point of having a system survive an event when its users are dead, or distracted by such basics as personal survival?
This is our first 'hard' limit on adverse event preparedness: the means of mitigation or avoidance should ideally be such as can reasonably be expected to match the 'community' expectation (be it the users or the wider community) for a similar system.
What does this mean in practice?
For example, the 000 (911) first responder communication system is expected, by design, to survive any misadventure up to and including moderate civil disturbance.
Beyond that, the utility of the system diminishes along with the ability of the first responders themselves, who rely on auxiliaries such as roads, hospitals and fire stations remaining operational.
Translated to a more everyday business life, a Payroll system is expected to remain operational for as long as there are employees in the corporate body.

It is perhaps in this light that the first cracks appear in the corporate approach to high consequence, low likelihood events.
The first flaw is 'acceptance'. I have encountered entities that, though obligated by law to maintain a certain level of resilience built into the system, gamble on the low likelihood of the event, accepting the adverse consequences of being unprepared, unlikely as they are. That is at least a conscious decision.

Unlike the more common flaw, the 'blind spot': ignorance by choice.
One of the assignments I completed entailed preparing a business continuity plan, with a focus on IT disaster recovery, for a moderately sized enterprise.
IT systems are typically singled out for business continuity above all else.
This specific enterprise showed an almost belligerent refusal to include pandemic workforce planning. It was simply an event that would never occur.
We may have a warm site, but if there are no warm bodies to run it, what is the use?

There are formulas in risk management that allow you to calculate a risk to four decimal places. So how do we determine, pragmatically, the most appropriate risk posture for a low likelihood, high consequence event?

First: an assertion of 'one in a million' typically implies some sort of assessment methodology, yet the claimed incidence of failure is so small as to be significantly lower than the error rate of the methodology itself. One in a million means a 0.0001% (1.0x10^-6) likelihood, while even a precise methodology will rarely claim better than a 1% measurement error. Such a claim therefore means that our measurement error is vastly higher than the likelihood being measured: akin to threading a truck through the eye of a needle at 150 km/h. This realisation is one of those AHA! moments.
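The arithmetic is worth making explicit; both figures here come straight from the argument above (the 1% error band is already an optimistic assumption).

```python
# 'One in a million' versus the assessment's own error band.
claimed = 1.0e-6      # 'one in a million' = 0.0001% likelihood
method_error = 0.01   # a 1% measurement error, already optimistic

ratio = method_error / claimed
print(f"the error band is {ratio:,.0f}x the claimed likelihood")
```

A figure ten thousand times smaller than the uncertainty of the method that produced it is not a measurement; it is a guess written to four decimal places.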

Second: remember that any assessment that places an event as extremely unlikely is probably not based on solid numbers; it is an opinion. If dinosaurs were sentient, they would probably have found the idea of rocks falling from the sky laughable.

Third: any solution to mitigate or accept a rare catastrophic event must be in accordance with the reasonable standards of the 'community' of users. Unfortunately, some executives read this to mean that if no other CEOs are planning for disaster, then they themselves need not.

It is often asserted that 80% of businesses without a disaster plan fail within a year of a disaster. So what is a good operator to do?
Do we ride without a helmet, knowing that if we crash, we die?

The answer is, we mitigate the risks we can.
Call Mr. Wulf at NOMEONESTIQ to discuss how we can help you.