The BBC is reporting that British Airways’ nightmare a couple of weeks ago was caused by human error. More specifically, an ‘independent contractor’ incorrectly switched off the power supply at the data centre and caused a catastrophic power surge when they reconnected it. Is it really possible that so much BA chaos was caused by something so basic?
I’m not much of a tech person (to say the least), so I don’t aim to answer that question myself, but to my entirely non-specialist ears, it does all sound pretty remarkable!
We are talking about an outage that caused havoc to the travel plans of 75,000 people and is estimated to cost British Airways in excess of £100 million. I mean, we’ve all made mistakes with technology (me more than most…) and many of us probably don’t back things up as often as we should, but I doubt even a dedicated team of hackers could manage to cause that sort of devastation without months of deliberate planning.
Maybe I’m a bit naive when it comes to technology and big businesses, but aren’t there supposed to be back-ups and fail-safes to prevent exactly this sort of entirely normal (and therefore predictable) human idiocy from causing massive problems?
Precise details on what happened are still pretty sketchy, but an internal BA email leaked to the BBC states,
“This resulted in the total immediate loss of power to the facility, bypassing the backup generators and batteries… After a few minutes of this shutdown, it was turned back on in an unplanned and uncontrolled fashion, which created physical damage to the systems and significantly exacerbated the problem.”
I just can’t even begin to get my head around how the BA data centre essentially seems to have a plug that you can pull out that switches off everything… And then breaks everything when you switch it back on.
The boss of IAG, Willie Walsh, helpfully commented,
“It’s very clear to me that you can make a mistake in disconnecting the power. It’s difficult for me to understand how to make a mistake in reconnecting the power.”
Err, thanks for clearing that all up then Willie!
Do we have any IT experts reading who can shed some light in the comments on how plausible this all really is?
Adam says
All companies of a certain size should have policies and emergency plans in place, and these should be practised on a regular basis in case of such a problem.
It appears BA outsourcing their IT in another bid to save money has completely backfired. The compensation bill is 100 million and rising, but the negative impact on future business could be catastrophic for BA.
To the normal cash-paying traveller: would you choose BA now if you had a choice?
Joe Deeney says
Indeed – not exactly a PR triumph, that’s for sure…
Craig Sowerby says
When your Windows computer gets all screwed up, aren’t you supposed to turn it off and turn it back on again? Obviously BA is still running Windows 95 in its servers… 😀
Joe Deeney says
I wouldn’t be entirely surprised!
Graham says
Computer systems need to be shut down and restarted “cleanly”, that is, via a prescribed series of steps. I realise most people just hit the on/off button when things go wrong on their PCs, but servers and databases don’t work like that.
There’s a good chance that panic set in and an engineer plugged the server back in, and that’s when the proverbial hit the fan.
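To give a rough idea of what a “prescribed series of steps” can look like, here is a minimal Python sketch. The service names and the ordering are made up purely for illustration and have nothing to do with BA’s actual systems; the only point is that things are started in dependency order and stopped in the reverse order, rather than all at once.

```python
# Hypothetical services, listed in the order they must be started.
# These names are illustrative assumptions, not any real airline's setup.
STARTUP_ORDER = ["storage", "database", "app_server", "web_frontend"]


def clean_shutdown(running):
    """Stop services in reverse startup order, so dependents stop first."""
    steps = []
    for name in reversed(STARTUP_ORDER):
        if name in running:
            running.remove(name)
            steps.append(f"stop {name}")
    return steps


def clean_startup(running):
    """Start services in dependency order, skipping any already running."""
    steps = []
    for name in STARTUP_ORDER:
        if name not in running:
            running.add(name)
            steps.append(f"start {name}")
    return steps


if __name__ == "__main__":
    running = set(STARTUP_ORDER)
    print(clean_shutdown(running))
    print(clean_startup(running))
```

Pulling the plug skips every one of those stop steps, and switching everything back on at once ignores the start order, which is roughly why an uncontrolled restart can make things worse.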
Joe Deeney says
Yes, I assumed it would be something like that – but for SUCH a business critical infrastructure element, shouldn’t there be extremely strict protocols?