
IT Crash Causes British Airways To Cancel All Flights (cnbc.com) 184

Posted by EditorDavid on Saturday May 27, 2017 @03:44PM from the flight-delays dept.

An anonymous reader quotes CNBC: British Airways canceled all flights from London's Heathrow and Gatwick airports on Saturday as a global IT failure upended the travel plans of tens of thousands of people on a busy U.K. holiday weekend. The airline said it was suffering a "major IT systems failure" around the world. Chief executive Alex Cruz said "we believe the root cause was a power-supply issue and we have no evidence of any cyberattack." He said the crash had affected "all of our check-in and operational systems." BA operates hundreds of flights from the two London airports on a typical day -- and both are major hubs for worldwide travel. Several hours after problems began cropping up Saturday morning, BA suspended flights up to 6 p.m. because the two airports had become severely congested. The airline later scrapped flights from Heathrow and Gatwick for the rest of the day.

a power supply failure?? (Score:5, Insightful)

by Anonymous Coward writes: on Saturday May 27, 2017 @03:49PM (#54498723)

So a power supply failure can bring down all operations on a global scale. Good to know that BA had outsourced part of their IT staff to India!!!

- Re: (Score:3, Funny)
  
  by Anonymous Coward writes:
  
  BA is awaiting a call from an Indian claiming to be from Microsoft, saying they have a faulty PC that is sending out signals to Microsoft, and getting them to look at the event log. This alerted BA to pay the individual $150 to fix their PC by downloading TeamViewer.
- Re:a power supply failure?? (Score:5, Funny)
  
  by dbIII ( 701233 ) writes: on Saturday May 27, 2017 @09:53PM (#54499745)
  
  So a power supply failure can bring down all operations on a global scale. Good to know that BA had outsourced part of their IT staff to India!!!
  As another poster quoted "BA in 2016 made hundreds of dedicated and loyal IT staff redundant and outsourced the work to India".
  
- - Re: (Score:2)
    
    by jimtheowl ( 4200185 ) writes:
    
    "where local Uk residents felt they did not need to do any kind of periodic maintenance"
    
    It is equally possible that the 'UK residents' were sacked because management thought they were not needed, since doing without them seemed cheaper in the short term.
    
    As for moving the data centers to India, it is a bad idea for the heat alone, but there are also very good reasons why there are laws to that effect.
    - - Re: (Score:2)
        
        by jimtheowl ( 4200185 ) writes:
        
        We are not just talking about heat.
        
        If we disregard the fact that the power supply story is absolute bullshit, did you have an actual place in mind that would have redundant fiber optic links and reliable power for a data center?
        
        Have you given any thought to network latency?
Busy U.K. Holiday Weekend... (Score:2)

by __aaclcg7560 ( 824291 ) writes:

I didn't realize that the British celebrated U.S. Memorial Day weekend.
- Re: (Score:3)
  
  by ShanghaiBill ( 739463 ) writes:
  
  I didn't realize that the British celebrated U.S. Memorial Day weekend.
  Ramadan starts today, and Monday is the Spring Bank Holiday, when many schools and businesses close.
  - Re: (Score:2, Funny)
    
    by Anonymous Coward writes:
    
    Ramadan starts today
    Maybe it's a good thing all the flights are grounded, then.
    ***MUSLIM EXPLODES***
    - Re: (Score:2)
      
      by K. S. Kyosuke ( 729550 ) writes:
      
      Hey, it's Ramadan, not Ram-a-dam.
    - - Re: Busy U.K. Holiday Weekend... (Score:2)
        
        by Type44Q ( 1233630 ) writes:
        
        No craters; "it's all ball bearings nowadays." [youtube.com]
- Re:Busy U.K. Holiday Weekend... (Score:5, Informative)
  
  by Ecuador ( 740021 ) writes: on Saturday May 27, 2017 @04:38PM (#54498875) Homepage
  
  It is actually "(Late) Spring bank holiday". The UK has depoliticized and dereligionized most of their holidays (notable exceptions are Christmas and Easter), so there is a bunch of "bank holidays" around the year that fall on Mondays (to provide extended weekends). This particular holiday seems to have replaced "Whit Monday" (day after Pentecost), which was a moveable Christian holiday. So, as you should expect, it is not related to the US Memorial Day.
  The equivalent to the US Memorial Day for the UK (and Commonwealth nations) is the "Remembrance day" on November 11th (end date of WWI), which is not a bank holiday (so you normally go to work that day, usually wearing a poppy).
  
  - Re: (Score:2)
    
    by bjdevil66 ( 583941 ) writes:
    
    The United States also has a military-based holiday on November 11th (Veterans Day). Unlike Memorial Day, many businesses/institutions don't take that day off either. Not sure if you know about it, but based on your description I'd say Veterans Day sounds more like Remembrance Day than Memorial Day - which these days is more known for hot dogs, parties, and retail "sales events" than cleaning up grave sites and remembering veterans' sacrifices.
- Re: (Score:1)
  
  by Anonymous Coward writes:
  
  I didn't realize that the British celebrated U.S. Memorial Day weekend.
  Believe it or not, the world does not revolve around the US. Other countries celebrate holidays of their own that have nothing to do with the US. Some of them happen to fall on the same dates as the US happens to be celebrating something, others don't. Shocking, but true.
  - - Re: (Score:2, Insightful)
      
      by Anonymous Coward writes:
      
      The international language of business is English.
      Did you know that England isn't in the US?
    - Re: (Score:2)
      
      by __aaclcg7560 ( 824291 ) writes:
      
      When is the last time Russian or Chinese produced a movie that became a global "blockbuster"?
      How many American movies would be blockbusters without Chinese funding and ticket sales?
      https://www.forbes.com/sites/markhughes/2017/03/04/how-china-has-taken-over-the-worldwide-box-office-in-2017/ [forbes.com]
      When is the last time you saw any large political protests in the US targeting any foreign leader?
      How about Turkey's president Recep Tayyip Erdogan?
      http://www.vanityfair.com/news/2017/05/turkey-demands-apology-beating-up-us-protesters [vanityfair.com]
    - Re: (Score:2)
      
      by Applehu Akbar ( 2968043 ) writes:
      
      " When is the last time Russian or Chinese produced a movie that became a global "blockbuster"?"
      The latest film to win 8 Oscars was Indian.
outsourcing (Score:2, Insightful)

by Anonymous Coward writes:

MBA to board: I've got a great idea to cut costs! It will save millions!
- Re: (Score:2)
  
  by __aaclcg7560 ( 824291 ) writes:
  
  And everyone else is doing it too!
  - Re: (Score:2)
    
    by plopez ( 54068 ) writes:
    
    Which is exactly what they said: "IT services are now provided globally by a range of suppliers and this is very common practice across all industries." See The Register for the quote:
    http://www.theregister.co.uk/2... [theregister.co.uk]
    As my parents replied when I used that excuse as a child, "If everyone was jumping off a 100 ft cliff would you jump off too?"
    MBAs are herd animals
- Re: (Score:3)
  
  by rholtzjr ( 928771 ) writes:
  
  Reality: they will write this off as having a bad day and continue with them getting their multi-million bonus at the end of the year. In other words, nothing will change.
  - Re: (Score:2)
    
    by plopez ( 54068 ) writes:
    
    Never admit failure, always claim the victory.
Somewhere, an IT guy is crying (Score:5, Insightful)

by whoever57 ( 658626 ) writes: on Saturday May 27, 2017 @03:55PM (#54498745) Journal

Somewhere, there is probably an IT guy who has been begging for the budget to upgrade some old machines, or move the services onto a cloud provider and was ignored.
He's crying today, because this huge revenue loss could probably have been avoided with a small budget for newer hardware or more redundancy.

- Re:Somewhere, an IT guy is crying (Score:5, Insightful)
  
  by nadaou ( 535365 ) writes: on Saturday May 27, 2017 @04:39PM (#54498879) Homepage
  
  He's crying today, because this huge revenue loss could probably have been avoided with a small budget for newer hardware or more redundancy.
  And despite that s/he knows who will take the blame for it.
  
- Re:Somewhere, an IT guy is crying (Score:5, Insightful)
  
  by grasshoppa ( 657393 ) writes: on Saturday May 27, 2017 @04:42PM (#54498889) Homepage
  
  move the services onto a cloud provider
  "Cloud" service providers have no place in mission critical roles by virtue that the "Cloud" is a faster way of saying "abdicating responsibility". If you make millions of dollars a day on the back of your IT infrastructure, then the last thing you do is outsource the responsibility of said infrastructure to a 3rd party company which has different priorities than you do.
  Any IT manager making such a recommendation is a) lazy, b) useless and c) should be fired.
  
  - "Cloud" is used to encourage cloudy thinking. (Score:2)
    
    by Futurepower(R) ( 558542 ) writes:
    
    Quote: ... "Cloud" is a faster way of saying "abdicating responsibility."
    
    The word "cloud" is used by cloud providers to encourage cloudy thinking: Dilbert cartoon. [pinimg.com]
    
    This Dilbert cartoon [cloudave.com] shows where cloudy thinking is leading.
  - Re:Somewhere, an IT guy is crying (Score:5, Insightful)
    
    by AmiMoJo ( 196126 ) writes: on Saturday May 27, 2017 @05:20PM (#54499023) Homepage Journal
    
    The only alternative is to spend vast amounts of money building your own redundant systems, which clearly BA were unwilling to do. Using cloud services makes perfect sense.
    Take Amazon's cloud services as an example. To get that kind of reliability, with systems distributed around the world for responsive operation and redundancy, you are going to need a large number of geographically distributed services and a team to look after them. A team that is available 24/7 with response times in minutes.
    And you will still have the same local problems, like internet connection reliability, and the same development problems. You don't have to waste time and effort administering your own servers either, dealing with mundane stuff like HDD failures or managing 30 different datacentre operators.
    Unless your company is willing to put a massive amount of effort into that stuff for some reason, it's dumb to even try.
    
    - Re:Somewhere, an IT guy is crying (Score:5, Insightful)
      
      by Kjella ( 173770 ) writes: on Saturday May 27, 2017 @08:01PM (#54499465) Homepage
      
      Unless your company is willing to put a massive amount of effort into that stuff for some reason, it's dumb to even try.
      IAG (the holding group of British Airways) have a market cap of 13 billion GBP or about 17 billion USD, guesstimating by fleet size BA is almost half of that. I'd understand if you were talking about a fly speck of a company but an 8 billion dollar company can damn well run their own infrastructure without a cloud provider with geographical distribution, 24/7 available teams and all that.
      
      - Re: Somewhere, an IT guy is crying (Score:2, Insightful)
        
        by Anonymous Coward writes:
        
        Netflix has a $70B market cap and they run in AWS and have no intention whatsoever to go back to traditional data centers. Running them well is hard and hiring competent people to run them is even harder: Amazon, Google, Facebook and Microsoft hired a good chunk of the talent. Zynga tried AWS, found it expensive, so went back to a traditional data center, and then came running back to Amazon.
        I have worked on several migrations to AWS and the dumb companies do it without changing their software architecture.
        Newsflash
        
        Re: Somewhere, an IT guy is crying (Score:5, Insightful)
        
        by fluffernutter ( 1411889 ) writes: on Saturday May 27, 2017 @09:46PM (#54499725)
        
        Usually the problem is when they go in house they want their administrators to work for $15 an hour, and then when they can't find good ones and the systems fail, they throw their hands up and go back to paying way more for cloud services than proper admins would have cost in the first place.
        
        
        Re: (Score:2)
        
        by Etcetera ( 14711 ) writes:
        
        Netflix has a $70B market cap and they run in AWS and have no intention whatsoever to go back to traditional data centers. Running them well is hard and hiring competent people to run them is even harder: Amazon, Google, Facebook and Microsoft hired a good chunk of the talent. Zynga tried AWS, found it expensive, so went back to a traditional data center, and then came running back to Amazon.
        I have worked on several migrations to AWS and the dumb companies do it without changing their software architecture.
        Newsflash, AWS is expensive if you do not rewrite your code for auto scaling and set up your QA/Dev environments to be 'on demand'. But as far as uptime is concerned, you cannot beat Amazon uptime if you have built multi-region deployments. If you use it like a traditional data center, well shit happens and Amazon machines die like any other. Their SLAs actually guarantee nothing about individual hosts.
        That's horsecrap, and the "everyone else is wrong so use our new paradigm to overlook our problems" attitude is exactly what's wrong with it.
        If you're making consumer-level applications that don't need data integrity then it might be "good enough". If you need enterprise reliability, it's absolutely not "good enough" without a lot more engineering.
        There are plenty of competent datacenter operations folks, and running one is NOT rocket science, it just requires awareness of reliability engineering and infra
      - Re: (Score:3)
        
        by thegarbz ( 1787294 ) writes:
        
        Market cap is quite a useless figure to quote when talking about base spending for something that doesn't create dollars. BP's market cap was 7x that of IAG when it almost went completely under during the spill. Likewise many airlines have a huge market cap but are struggling financially to stay afloat.
        A better metric to use would be cash flows, and BA is doing incredibly well for an airline in that regard adding 230mm GBP to their balance sheet last year. This free capital shows a company's ability to inv
    - Re: (Score:2)
      
      by dbIII ( 701233 ) writes:
      
      The only alternative is to spend vast amounts of money building your own redundant systems
      Unless it doesn't actually cost vast amounts of money and in the long run ends up cheaper than outsourcing to people who are (or should be) spending money on the same thing and charging you extra money on top.
      Beyond a certain point it gets cheaper to do things yourself instead of paying for someone else's profits.
      - Re: (Score:2)
        
        by Hognoxious ( 631665 ) writes:
        
        A big company is going to be big enough to gain from economies of scale.
        A smaller company not so much. You can't buy half a server, and a top notch machine doesn't require twice as many admins as one half the power. It makes more sense for them. It's a way of pooling resources, in a way.
        I'd say BA is big.
        
        Slashdot - land of one liners (Score:2)
        
        by dbIII ( 701233 ) writes:
        
        Hence the second sentence starting with "Beyond a certain point".
        WTF is it with people not reading past one line here recently?
        
        Re: (Score:2)
        
        by Hognoxious ( 631665 ) writes:
        
        If that's how you react when people agree with you you can just fuck off.
        
        Re: (Score:2)
        
        by dbIII ( 701233 ) writes:
        
        It's not just you. It's a bit of a trend. Don't take it so personally.
    - Re:Somewhere, an IT guy is crying (Score:4, Interesting)
      
      by sjames ( 1099 ) writes: on Sunday May 28, 2017 @03:19AM (#54500453) Homepage Journal
      
      What concerns me though is that in spite of that, Amazon went down due to a thunderstorm. And again due to fat fingering a re-configuration.
      I run my own servers. Admittedly on a much smaller scale, but Amazon has had 3 failures since the last one I had.
      I can see use cases for the cloud but it's not going to give you proper high availability.
      
      - Re: (Score:2)
        
        by sjames ( 1099 ) writes:
        
        Quite true. Or, like megaupload, some other customer might attract unwanted attention that spills over on you.
        More fundamentally, if my IT is supporting a million a year in income, I will treat it as a vital part of 1 million in value. If I outsource it to the cloud for $1000/year, it will be treated as a vital part of 1 thousand in value. If you aren't your provider's biggest customer, your infrastructure as implemented in the cloud will never be treated with the importance you might give it yourself. Mos
    - Re: (Score:3)
      
      by jeremyp ( 130771 ) writes:
      
      BA does have its own redundant infrastructure.
      Whatever they are saying, this was not a simple hardware failure. Most likely something like somebody fubarred their internal DNS or something.
    - Re: (Score:2)
      
      by Hognoxious ( 631665 ) writes:
      
      I used to work for a transportation company. They had two mainframes, one in the midlands and one in the north. Any app was primary on one and secondary on the other.
      It wasn't instant failover but had one gone down operations would have been up within an hour or two[1], not down for two days.
      I've worked at smaller companies that had a similar setup with unix boxen.
      [1] No, most of the delay was *not* while we loaded boxes of punch cards onto stagecoaches.
    - Re: (Score:2)
      
      by herbierobinson ( 183222 ) writes:
      
      Hate to break this to you, but Amazon's cloud has been down more than the airline's systems have.
      At least one very large cloud provider doesn't even back up your data: The contract says they are not responsible for data loss.
      - Re: (Score:2)
        
        by AmiMoJo ( 196126 ) writes:
        
        Here's a list of times AWS had outages: https://en.wikipedia.org/wiki/... [wikipedia.org]
        Compares well to how often BA has computer problems. Also note that in every instance they fixed it within an hour, not a few days like BA is taking.
        
        Re: (Score:2)
        
        by herbierobinson ( 183222 ) writes:
        
        The Wiki doesn't list the outage time, but at least one of them was way more than an hour.
        Also, the duration of the outage doesn't have to be really long to mess up an airline, because the recovery can take days.
    - Re: (Score:3)
      
      by Trogre ( 513942 ) writes:
      
      Going from a custom solution with outsourced staff to a cloud based one is just going from one shitstorm to another, but with the added bonus of relinquishing all control over how it is managed.
      If they actually cared about their data they would keep redundant datacentres with competent local staff.
      Clearly they don't, so lots of luck to them.
  - - Re: (Score:2)
      
      by Chris Mattern ( 191822 ) writes:
      
      Look at where many companies host their web sites and web store fronts. Yep, AWS, Azure, and others.
      You can do that because web sites and web store fronts can't possibly kill people if they fail.
- Re:Somewhere, an IT guy is crying (Score:5, Informative)
  
  by Dogtanian ( 588974 ) writes: on Saturday May 27, 2017 @05:21PM (#54499027) Homepage
  
  Somewhere, there is probably an IT guy who has been begging for the budget to upgrade some old machines, or move the services onto a cloud provider and was ignored.
  On the contrary, that IT guy was probably made redundant in 2016. As the BBC article [bbc.co.uk] notes:-
  The GMB union says this meltdown could have been avoided if BA had not made hundreds of IT staff redundant and outsourced their jobs to India at the end of last year.[..]
  "BA in 2016 made hundreds of dedicated and loyal IT staff redundant and outsourced the work to India... many viewed the company's actions as just plain greedy."
  Let's hope BA continues to reap as many "savings" from that outsourcing as they did today. :-)
  He's crying today.
  Going by the likely response of the laid-off employees [youtube.com] to the predicament of BA, I guess he *would* have tears coming out of his eyes.
  
  - Re:Somewhere, an IT guy is crying (Score:5, Funny)
    
    by AK Marc ( 707885 ) writes: on Saturday May 27, 2017 @05:38PM (#54499103)
    
    Don't blame the IT workers. I had the same thing happen where I work. The middle manager, trying to look good, cut necessary costs. One power blip in the grid, and everything was dead because we had undersized UPSs everywhere, and they couldn't handle the load. He said "inrush current" thousands of times, but never knew what it meant.
    
    - Re:Somewhere, an IT guy is crying (Score:4, Informative)
      
      by Dogtanian ( 588974 ) writes: on Saturday May 27, 2017 @06:39PM (#54499253) Homepage
      
      Don't blame the IT workers.
      Pretty sure I never did that.
      
      Any blame I'd put at the feet of whatever amoral rent-a-manager decided he could save a few pennies by ditching their established IT staff then contracting their jobs out to a third party company on the other side of the world.
      
      Let's face it, there *are* likely quite a few talented IT people from India, but as the Indians themselves have said, the good ones are probably working in other countries, or at least not for race-to-the-bottom contractors likely paying peanuts to staff with patchy educational skills. The contractor is probably making a *very* nice profit on them, still appearing cheaper than the client's existing staff while overselling their talent. (And I've no doubt that those employees are peons to whatever mediocre middle management the contractor has, and to their circumstances in general, so it's questionable how much they're to blame personally.)
      
      I've absolutely no doubt that the (apparent) ability to treat IT staff as a pure commodity is very appealing to such managers. At least until the shit hits the fan and it turns out that (surprise, surprise) it doesn't always work that way.
      
      Even if it was a power supply issue (and I'm pretty sceptical about that), it sounds like the resulting problems would still be a result of their penny-pinching sacking of the experienced staff most likely to know what they were doing (and be in a position to do it). It's sure as hell not their responsibility any more. And given how they were treated, they'd be perfectly entitled to feel schadenfreude at their former employer's travails.
      
      - Re: (Score:2)
        
        by AK Marc ( 707885 ) writes:
        
        "I'm not blaming the IT workers" isn't true, when you follow it up with blaming the IT workers (contracted). The contracting doesn't mean sub-standard workers are used. It means the management is even more separated from those doing the work, so a contractor suggesting $100k in power upgrades, to be done by his outsource team is seen as a sales pitch, not a "do this or you lose $1B from a failed infrastructure". That's purely a management failure that's unrelated to contracting, location of the contracto
        
        Re: (Score:2)
        
        by Dogtanian ( 588974 ) writes:
        
        "I'm not blaming the IT workers" isn't true, when you follow it up with blaming the IT workers (contracted).
        If you go back and pay attention to what I was *actually* saying- rather than viewing things through the prism of your own personal experience- you'll see that the only thing the contracted workers were "blamed" for was for being employed on the basis of cost rather than skill, and for being treated as a commodity.
        
        In other words, it's quite obvious that it was primarily management- both at the contractor and at BA- that were being blamed here. Well, at least I thought it was obvious.
        
        Re: (Score:2)
        
        by AK Marc ( 707885 ) writes:
        
        the only thing the contracted workers were "blamed" for was for being employed on the basis of cost rather than skill,
        The implication is obviously that the skill was lower, or there'd be no need to mention it. They were picked on cost. Period. Mentioning skill weakens your claimed focus solely on management. Why bring up the skill of the workers if it's unrelated to the incident?
        
        The effect would have been the same if they had replaced the workers with young new-hires, cheap retirees, or any other group they intended to pay the minimum they could get away with and planned to ignore feedback from. The union has claimed
- Re: (Score:2)
  
  by Chris Mattern ( 191822 ) writes:
  
  Somewhere, there is probably an IT guy who has been begging for the budget to upgrade some old machines, or move the services onto a cloud provider and was ignored.
  He's crying today, because this huge revenue loss could probably have been avoided with a small budget for newer hardware or more redundancy.
  No, he's crying because he's been fired. Management, of course, decided it was all his fault.
- - Re:Somewhere, an IT guy is crying (Score:4, Informative)
    
    by fluffernutter ( 1411889 ) writes: on Saturday May 27, 2017 @09:50PM (#54499737)
    
    Stop it with the whole 'they should use cloud' nonsense.. First of all, most airlines are still on mainframes, and most of them get uptimes close to Amazon. From a mainframe. Can you believe it?
    
  - Re: (Score:3)
    
    by jeremyp ( 130771 ) writes:
    
    There is no cloud, there's just somebody else's computer.
- - Re: (Score:2)
    
    by Bert64 ( 520050 ) writes:
    
    In the days of a 486 there was no such thing as a "core", it would be a "per physical cpu" license if anything...
    And you could always create a VM on any modern hardware which only exposed a single core to the guest OS.
Other sources: IT outsourcing (Score:2)

by sanf780 ( 4055211 ) writes:

It looks like BAE has recently replaced most of its IT workforce with south Asian contractors. It might or might not be related, as the official statement is power supply failure.
Here you have the BBC report on the matter: http://www.bbc.com/news/uk-400... [bbc.com]
- Re:Other sources: IT outsourcing (Score:5, Informative)
  
  by gilgongo ( 57446 ) writes: on Saturday May 27, 2017 @04:18PM (#54498827) Homepage Journal
  
  It looks like BAE has recently replaced most of its IT workforce with south Asian contractors.
  OT: it's BA, not BAE. The latter is a different company concerned mainly with blowing up flying objects, along with people in them. Easy mistake to make though.
  
  - Re: (Score:2)
    
    by mjwx ( 966435 ) writes:
    
    It looks like BAE has recently replaced most of its IT workforce with south Asian contractors.
    OT: it's BA, not BAE. The latter is a different company concerned mainly with blowing up flying objects, along with people in them. Easy mistake to make though.
    Just to clarify the parent's point, BA is British Airways, a commercial airline operating out of the United Kingdom. BAE Systems, formerly British Aerospace, is a defence contractor that primarily works for the UK Government and produces, amongst other things, the Harrier jump jet, the UK's Eurofighter Typhoons, M2/M3 Bradleys (via an acquisition of United Defence), Astute class submarines and Type 45 destroyers.
    
    Despite BAE's impressive work it's the service on BA flights that truly strikes fear into the
- Re:Other sources: IT outsourcing (Score:5, Insightful)
  
  by gweihir ( 88907 ) writes: on Saturday May 27, 2017 @04:22PM (#54498837)
  
  "Power supply failure" does not take down a well-designed and well-maintained infrastructure. This is just a smokescreen to hide incompetence.
  
  - - Re: (Score:2)
      
      by gweihir ( 88907 ) writes:
      
      It does not really matter. Something this critical done right is done in two geographically separate redundant data-centers with independent power sources (different power stations and power-lines) and enough independent local power to survive long enough for anything except a major catastrophe being fixed by the power people. I have seen several instances of this being done. What happened here seems to have been a minor incident that got major because of lack of redundancy and preparedness. A major inciden
Idiots in charge! (Score:3, Insightful)

by Anonymous Coward writes: on Saturday May 27, 2017 @03:57PM (#54498755)

The Mythical Man-Month [wikipedia.org] was written in 1975. In a very detailed way, it described how common business-planning strategies fail when applied to information technology projects. But did anyone listen? We've known how to avoid these sorts of problems for over 40 years!

- Re:Idiots in charge! (Score:5, Insightful)
  
  by gweihir ( 88907 ) writes: on Saturday May 27, 2017 @04:21PM (#54498833)
  
  "We" (as in people that actually have a clue what they are doing) have indeed known that. But the decision-makers have no such understanding. While it is really tacky, I have had to explain catastrophe scenarios to customers that would have killed their company, and all that was needed was a failed software functionality update (which they wanted to do without a possibility to roll-back and no working plan for keeping business going any other way). The people making the decisions these days are bean-counters with zero understanding of risk-management or "visionaries" that have even less of an understanding about the reality of things. And, unfortunately, this often is aided by a corporate culture of "don't rock the boat" and people that warn of consequences get silenced.
  Expect more of these utterly pathetic failures.
  
  - Re: (Score:1)
    
    by Anonymous Coward writes:
    
    1. BA offshores IT
    2. UK has bank holiday weekend (3 days)
    3. Upgrade / new release cockup ==> systems down
    4. No plan B.
  - Re: (Score:3)
    
    by dwywit ( 1109409 ) writes:
    
    I think it got much worse when accountants started calling themselves "management consultants", as if expertise in managing finance somehow magically transferred to all aspects of management.
    Instead, the balance sheet became the be-all and end-all for decision-making.
    This is British Airways, one of the largest commercial airlines..... in the world /clarkson. They've had a pretty robust system for a long time. I find it hard to believe that there were sufficient failures of multiple systems to lose power.
  - Re: (Score:2)
    
    by hughbar ( 579555 ) writes:
    
    Thanks. So agree. I'm semi-retired now, but my last serious job was in a small, very competent investment bank. Any major change had to survive the change committee and have a detailed, well-documented roll-back plan. We usually did the doing in the middle of the night at the weekend when major markets were closed as well. Also, senior managers were somewhat tech-savvy and very supportive.
    
    This BA thing isn't the first one either: https://www.theguardian.com/mo... [theguardian.com] and probably won't be the last. Because t
  - Re: (Score:2)
    
    by Mostly a lurker ( 634878 ) writes:
    
    I have a lot of sympathy with what you write. However, most large organizations these days do take disaster recovery and business continuity somewhat seriously, and have budgets to ensure issues like this are supposed to be addressed.
    As others have opined, while some kind of power outage might have precipitated the problems, surely there must have been a cascading series of events that prevented the system from coming back online quickly. Of course, a power outage should not bring down critical servers in
Backup plan.. (Score:2)

by DeBaas ( 470886 ) writes:

and the backup plan for when the IT systems fail is: water and food vouchers..
- Re:Backup plan.. (Score:5, Interesting)
  
  by Anonymous Coward writes: on Saturday May 27, 2017 @06:29PM (#54499225)
  
  Actual BA passenger here, currently in Austin TX, and was due to fly to LHR on today's direct flight at 6pm central time. Just to highlight how catastrophic the failure is:
  - Heard about the outage this morning, and looked online for more information, very little actual info available. I logged into BA with my flight booking, and the page indicated that the flight was still fine. The system also had my email address and made the statement "we will contact you if there are any problems".
  - Based on this I assumed the flight was OK.
  - Turned up at the airport and the BA check-in is closed. There was a large crowd of unhappy people, a haggard team of BA staff behind the counter, but no one was moving and nothing was happening. After 20 minutes I went and told the BA manager that he had better tell the crowd what is happening before things get out of hand. Eventually, he did redeem himself by doing a walk-through and chatting with people and handing out a letter explaining that the flight was canceled.
  - Not only was the flight canceled, but their systems were unable to do any rescheduling. They asked us to leave the airport, find a hotel, contact them tomorrow, and ultimately seek reimbursement for expenses.
  - Disappointed, I wandered down to American Airlines (a Oneworld partner, with whom I am Sapphire) and had a chat with their staff. As if by magic, they somehow pulled my booking from the BA system and put me on some AA flights free of charge. Amazing.
  Not sure how much of it is staff incompetence, or the system is just completely fucked, but this mess is going to take days to resolve...as for me, I'm off in a few minutes, best of luck to the other BA passengers caught up in this mess!
  
  - that is because your booking is in GDS (Score:2)
    
    by aepervius ( 535155 ) writes:
    
    Airline systems are usually split into various intercommunicating systems (either their own or, in the case of a GDS, run by enormous external firms). E.g. you have a crew system, a weight and balance system, a check-in system, a baggage system, and a reservation system usually handled by a CRS, like Apollo, Axess, Amadeus, Galileo, Infini, etc. Your booking was almost certainly saved in one of those GDSs. And depending on the agreements among airlines and the interline setup, they can just pull each other's bookings (aka PNR pass
Is anyone tracking causes for Airline outages? (Score:3)

by david.emery ( 127135 ) writes: on Saturday May 27, 2017 @04:03PM (#54498779)

It's my vague recollection that at least one other airline had a power-related IT outage within the last year or so.
I would have thought "reliable power at scale" was a solved problem.

- Re: (Score:3)
  
  by gweihir ( 88907 ) writes:
  
  There are no "power related" IT outages. There are some where the IT infrastructure could not handle one specific system going down, and that is not a technical issue, but something else which usually is called "gross negligence". The seeming technological root-causes are just transparent lies by misdirection that serve to obscure the fact that management caused this by incompetence, arrogance, greed and general stupidity.
  - Re:Is anyone tracking causes for Airline outages? (Score:5, Interesting)
    
    by __aaclcg7560 ( 824291 ) writes: on Saturday May 27, 2017 @04:50PM (#54498939)
    
    There are some where the IT infrastructure could not handle one specific system going down, and that is not a technical issue, but something else which usually is called "gross negligence".
    Technically, that's known as a single point of failure.
    https://en.wikipedia.org/wiki/Single_point_of_failure [wikipedia.org]
    The term "gross negligence" doesn't come into play until a lawsuit is filed. Since no one died or was injured in this outage, a gross inconvenience doesn't rise to gross negligence.
    
  - - Re: (Score:2)
      
      by gweihir ( 88907 ) writes:
      
      BA is not the only incompetent one on this planet. The root cause is not the power problem.
  - - Re: (Score:2)
      
      by gweihir ( 88907 ) writes:
      
      Funny. You are as arrogant as you are clueless. Probably words like "geo-redundancy" are too long for you. This is not your amateur home installation.
- Re: (Score:3)
  
  by account_deleted ( 4530225 ) writes:
  
  Comment removed based on user account deletion
The major issue is... (Score:5, Funny)

by Anonymous Coward writes: on Saturday May 27, 2017 @04:03PM (#54498781)

...the outsourced IT guys from TCS in India need to fly to the UK to fix the 'power supply' issue but currently they are unable to book a flight on British Airways.....

- Re: The major issue is... (Score:1)
  
  by Solo-Malee ( 618168 ) writes:
  
  Mod up +1 funny :-)
- Re:The major issue is... (Score:5, Insightful)
  
  by Anonymous Coward writes: on Saturday May 27, 2017 @05:02PM (#54498971)
  
  Funny, but the bigger issue is: is there anyone at Tata who was there the last time BA restarted their systems? At the bank I used to work at, we were replaced by contractors, and two years later, when they restarted the zSystem, they found out the hard way that no one knew what to do.
  
Piling up technical debt is utterly stupid (Score:5, Insightful)

by gweihir ( 88907 ) writes: on Saturday May 27, 2017 @04:11PM (#54498793)

Of course, it requires more than the myopic 3-month planning that most MBAs are capable of at maximum. It also requires a real understanding of risk management and staying away from all short-term optimization. Otherwise, you end up at "save a million, lose a billion", as this seems to be a fine example of.
Claiming this was a "power supply issue" is just lying by misdirection. The root cause is lack of redundancy, lack of resilience and lack of effective business continuity management. All things that cost money and that do not generate profit _unless_ something like this happens. In a healthy infrastructure, one (or even several) power supplies blowing up will not kill your ability to do business.
Events like that are almost universally due to gross mismanagement and should not only result in termination but also prosecution of the "leadership" that allowed this to happen by not being prepared.
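
To put rough numbers on the redundancy point, here is a minimal sketch. It assumes each power feed fails independently and uses a made-up 1% downtime figure; neither the numbers nor the function names come from BA or any real data center.

```python
# Hypothetical illustration: how often are ALL of n independent power
# feeds down at the same time? The 1% figure below is an assumed example
# value, not a measurement from any real installation.


def all_feeds_down(p_single: float, n_feeds: int) -> float:
    """Probability that every one of n independent feeds is down at once."""
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if n_feeds < 1:
        raise ValueError("n_feeds must be at least 1")
    return p_single ** n_feeds


def availability(p_single: float, n_feeds: int) -> float:
    """Fraction of time during which at least one feed is up."""
    return 1.0 - all_feeds_down(p_single, n_feeds)


if __name__ == "__main__":
    p = 0.01  # assumed: each feed is down 1% of the time
    for n in (1, 2, 3):
        print(f"{n} feed(s): availability {availability(p, n):.6f}")
```

The independence assumption is the weak spot: feeds that share a UPS, a switchboard or a building can fail together, and then the extra supplies buy much less than this arithmetic suggests.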

- Re: (Score:2)
  
  by Areyoukiddingme ( 1289470 ) writes:
  
  Of course, it requires more than the myopic 3-month planning that most MBAs are capable of at maximum. It also requires a real understanding of risk management and staying away from all short-term optimization. Otherwise, you end up at "save a million, lose a billion", as this seems to be a fine example of.
  Claiming this was a "power supply issue" is just lying by misdirection. The root cause is lack of redundancy, lack of resilience and lack of effective business continuity management. All things that cost money and that do not generate profit _unless_ something like this happens. In a healthy infrastructure, one (or even several) power supplies blowing up will not kill your ability to do business.
  Events like that are almost universally due to gross mismanagement and should not only result in termination but also prosecution of the "leadership" that allowed this to happen by not being prepared.
  That's only going to happen if you can teach the Mouth Breathing Assholes (that's what MBA stands for, right?) who are hedge fund, 'wealth management', and other institutional investor representatives who are the only shareholders who matter that all their wonderfully insightful financial questions during the quarterly shareholder call are completely pointless if no one is paying attention to the fundamentals of the business—and functioning information technology is rather obviously a fundamental of t
- Re: (Score:2)
  
  by thegarbz ( 1787294 ) writes:
  
  Otherwise, you end up at "save a million, lose a billion", as this seems to be a fine example of.
  They are currently one of the most profitable airlines; I'm sure they'll be fine.
  Claiming this was a "power supply issue" is just lying by misdirection
  No. Claiming it was a power supply issue is a mix of stating the initiating event without having done the root cause analysis + not confusing people with general technobabble that is of no interest to them. This is neither lying nor misdirection, and the higher up the chain or the further out into the public these kinds of messages go, the more they get simplified.
  The root cause is
  The root cause is something only a complete idiot who doesn't know an
  - Re: (Score:2)
    
    by gweihir ( 88907 ) writes:
    
    Stick to the armchair and stay away from incident management please.
    Funny. Our customers (whose beacons we have saved a few times) think differently. It is pretty clear who the amateur here is.
    - Re: (Score:2)
      
      by thegarbz ( 1787294 ) writes:
      
      Funny. Our customers (whose beacons we have saved a few times) think differently.
      Ain't nothing like an unsubstantiated appeal to authority in an attempt to save a very weak argument demonstrating what little knowledge you had on the topic.
      It is pretty clear who the amateur here is.
      That we can agree on, though I'm not sure you're going to like the reason why.
      - Re: (Score:2)
        
        by gweihir ( 88907 ) writes:
        
        Funny. Our customers (whose beacons we have saved a few times) think differently.
        Ain't nothing like an unsubstantiated appeal to authority in an attempt to save a very weak argument demonstrating what little knowledge you had on the topic.
        You mistake my purpose. I am not trying to convince you, your opinion is utterly worthless to me as you demonstrated that you have nothing worthwhile to contribute.
        
        Re: (Score:2)
        
        by thegarbz ( 1787294 ) writes:
        
        Only the reasons why you're wrong and shouldn't be part of RFCAs.
- - Re: (Score:2)
    
    by __aaclcg7560 ( 824291 ) writes:
    
    Their mode of operation is if its not broken or onfire, ignore it until it does catch fire. What puzzles me is the network ops manager and their manager think this team is magic because they solve amazing problems all the time, [...]
    The trick is to show up with a fire extinguisher and a replacement server just before the server halts and catches fire. If everyone did preventative maintenance and nothing caught fire, management would start laying off techs.
    - Re: (Score:2)
      
      by gweihir ( 88907 ) writes:
      
      The trick is to show up with a fire extinguisher and a replacement server just before the server halts and catches fire.
      If everyone did preventative maintenance and nothing caught fire, management would start laying off techs.
      Unfortunately, there is a lot of truth in that. A friend of mine had his company lay off almost all system administrators because the IT worked fine ("So what do we need them for?"). Of course, the company went bankrupt as a result of that about 2 years later. The amount of stupidity displayed in these actions is really staggering.
    - - Re: (Score:2)
        
        by __aaclcg7560 ( 824291 ) writes:
        
        "The trick is to fail to do your job until shit's about to blow up. Or sometimes, after shit has blow up, so management thinks you're super critical."
        Yeah, that's real miracle work, creimer.
        No, that's a shitty way to run an IT department. The miracle work is cleaning up the operations to have it conform to enterprise standards so everyone and everything is an interchangeable cog.
        I wish I worked at the same place as you - in about 2 months, I'd completely automate away your entire reason for existing.
        Not sure why you want to automate a job that will disappear when the contract ends. As an IT support contractor, I'm here today and gone tomorrow.
        
        Re: (Score:2)
        
        by __aaclcg7560 ( 824291 ) writes:
        
        Not to mention the last place he worked at, he farted after lunch and this happened.
        Lame, lame, lame. This is better.
        https://www.youtube.com/watch?v=LOS5gWzkXMA [youtube.com]
  - Re: (Score:2)
    
    by dbIII ( 701233 ) writes:
    
    They constantly have weird break downs and people screwing things up
    That's par for the course in IT especially with what appears to be very little QA testing before new releases and patches of various software. However the weird breakdowns and screwups are not supposed to impact on production. There is supposed to be some way to fall back before the users even notice. That does require some sort of budget or at least retention of machines replaced by relatively recent upgrades.
    Cut to the bone and those f
  - Re: (Score:2)
    
    by gweihir ( 88907 ) writes:
    
    I think your network ops manager might suffer from Stockholm Syndrome.
    Alternatively, he might just be the guilty party here and is trying to do his best to prevent anybody from noticing. I have seen that one in action before.
- - Re: (Score:2)
    
    by gweihir ( 88907 ) writes:
    
    Ever heard of things like "due diligence" and "due care"? Not doing them while being in charge _is_ a crime. People have been sent to prison for it.
    Messing up when your job is to make sure there is no messing up is a crime, unless you did everything reasonably possible and then were hit by really bad luck. That is obviously not the case here.
Maybe in bringing it back up they can... (Score:5, Funny)

by h4x0t ( 1245872 ) writes: on Saturday May 27, 2017 @05:37PM (#54499095) Homepage

.. find my fucking bag that they lost A WEEK AGO, the fucking fucks.

*cough*

Obligatory Bastard Operator from Hell (Score:5, Funny)

by UnknowingFool ( 672806 ) writes: on Saturday May 27, 2017 @08:43PM (#54499587)

"No the server isn't down. You must be using it wrong, idiot." *unplugs coffee maker, plugs server back in*

Either amateur hour or a lie (Score:5, Insightful)

by Murdoch5 ( 1563847 ) writes: on Saturday May 27, 2017 @11:50PM (#54500079) Homepage

Massive worldwide systems like this should always have at least two entire working deployments, one kept in a down state and one kept up and working. That way, if a problem happens, you just bring the second data center online and off you go.

If a power supply issue could bring down your entire system, you didn't design it correctly, PERIOD! If your entire system hinges on a single power supply, you ALWAYS have a second one on an alternative supply. In fact, you'd have multiple supplies to each data center, from different providers, just to make sure power issues can't cause these types of issues.

If the problem really comes down to a power supply, fire the IT department, fire the System Architects and start doing things properly.
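
The "bring the second data center online" step could look roughly like this. A minimal sketch only: the `Site` class, the site names and the boolean health flag are hypothetical stand-ins, and a real setup would have to probe the sites and keep their data replicated.

```python
# Hypothetical sketch of active/standby site selection. Nothing here
# reflects how BA (or anyone else) actually runs failover.
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Site:
    name: str
    healthy: bool  # assumed to be set by some external health check


def pick_active(sites: List[Site], current: Optional[str] = None) -> Optional[Site]:
    """Keep the current site while it is healthy; otherwise fail over to
    the first healthy site in priority order. Returns None when every
    site is down."""
    # Prefer staying put, so a healthy active site is never swapped out.
    for site in sites:
        if site.name == current and site.healthy:
            return site
    # Current site is down or unknown: take the first healthy one.
    for site in sites:
        if site.healthy:
            return site
    return None


if __name__ == "__main__":
    sites = [Site("primary", healthy=False), Site("standby", healthy=True)]
    chosen = pick_active(sites, current="primary")
    print(chosen.name if chosen else "total outage")
```

The case that matters is the last one: if `pick_active` returns `None`, both sites went down together, and that is the scenario the separate power feeds are supposed to prevent.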

- Re:Either amateur hour or a lie (Score:4, Insightful)
  
  by thegarbz ( 1787294 ) writes: on Sunday May 28, 2017 @06:12AM (#54500701)
  
  Define "correctly". We can design and build things for any scenario. With unlimited money, and investors who don't care about a profitable business we can do ANYTHING. Blanket statements get you nowhere.
  Give us up-time numbers, design goals, costs of failure, associated profits. Will BA report on their balance sheet a loss larger than the cost of hardening their entire infrastructure? Tune in on the 31st December this year to see how little designing things "correctly" matters in the corporate report.
  
  - Re: (Score:2)
    
    by Murdoch5 ( 1563847 ) writes:
    
    Correctly in IT / System Architecture, means you have enough redundancy in your design so as not to have single point or even double points of failure. If management doesn't want to spring for a properly designed network, after it's been laid out / designed on paper, then they can swallow the massive losses of having the entire stack go down. Although I would suspect that after this joke, they'll be more willing to do the job right the first time.
    - Re: (Score:2)
      
      by thegarbz ( 1787294 ) writes:
      
      Although I would suspect that after this joke, they'll be more willing to do the job right the first time
      "THIS time it will be different!" - Something that has been said every time. I mean it's not like it's the first time airplanes were grounded due to IT issues. If they haven't taken it seriously in the past what makes you think it will change now?
- Re:Manual backups (Score:5, Interesting)
  
  by __aaclcg7560 ( 824291 ) writes: on Saturday May 27, 2017 @05:02PM (#54498967)
  
  If you're going to have people fall back on pen and paper, they need to be trained to use pen and paper. I worked at a restaurant when a power outage took down the ordering stations. The restaurant kept doing business until the power came back online an hour later, as sunlight through the large windows and emergency lighting illuminated the interior. The kitchen kept on cooking with gas-powered appliances and emergency lights. The wait staff struggled to calculate bills and make change with only one calculator in the entire building. Management added backup power to the ordering stations a week later.
  
- Re: (Score:2)
  
  by CrazyBusError ( 530694 ) writes:
  
  What a lovely piece of racist bullshit.
  
  Shame you don't have the room or time to make a list of the companies ruined by white Americans and the amounts of money involved. (Here's a hint: One of the culprits is currently sat in the White House)
