US Opens Investigation Into Delta After Airline Cancels Thousands of Flights
The US transportation department said on Tuesday it was opening an investigation into Delta Air Lines after the carrier canceled more than 5,000 flights since Friday as it struggles to recover from a global cyber outage that snarled airlines worldwide. From a report: While other carriers have been able to resume normal operations, Delta has continued to cancel hundreds of flights daily because of problems with its crew scheduling system. From Friday through Monday, Delta canceled 30% or more of its flights each day. On Tuesday it axed 444 flights, or 12% of its schedule, as of 11:00am and delayed another 590, or 16%, according to FlightAware, after canceling 1,150 on Monday.
The transportation secretary, Pete Buttigieg, said on Tuesday the investigation was to "ensure the airline is following the law and taking care of its passengers during continued widespread disruptions ... Our department will leverage the full extent of our investigative and enforcement power to ensure the rights of Delta's passengers are upheld." Delta said it was in receipt of the USDOT notice of investigation and was fully cooperating. "Delta teams are working tirelessly to care for and make it right for customers impacted by delays and cancellations as we work to restore the reliable, on-time service they have come to expect from Delta," the airline said.
Bitlocker (Score:3)
News previously reported that Delta writes down Bitlocker recovery keys in a binder and has to type them in at each workstation to remove the .sys files manually. Assuming fidelity in both directions, I presume?
40 character codes, reportedly.
Nominative Determinism (Score:2)
Re: (Score:2)
If you have to boot into recovery, how else than manually would you enter the bitlocker recovery key?
If they keep the keys in a binder, it's certainly going to be hard to get the codes to each individual of course. If it's a question of VMs, I'm unsure how this concept is any worse than having the keys digitally. After all, who activates copy paste into a VM?
Re: (Score:2)
I think the problem is Delta has thousands of machines. Everything from signage displays to customer service.
Re: (Score:3)
I think the problem is Delta has thousands of machines. Everything from signage displays to customer service.
You forgot to add this to what you said:
And they all run Windows.
If they had had a mixed OS environment where they had, say, Windows and Linux, at least some of what they had would be working OK. Now, they're going to have to sue Crowdstrike because their single OS environment blew up because Crowdstrike botched an update. There's something to be said for not having a single point of failure.
Re: (Score:2)
If they had had a mixed OS environment where they had, say, Windows and Linux, at least some of what they had would be working OK.
But now you have the overhead of having to maintain two different operating systems.
I don't think this is an OS issue, or even a vendor issue. It's a management issue. These things are running kiosks and ticketing workstations that run one app off of a server. Why do you need OS and virus updates? You wall them off from the internet and steady state them.
Re: (Score:1)
It's an overly-burdensome compliance issue.
Expect government IT safety inspections (Score:2)
Conjecture: The government will at some point require an ever-expanding set of industries to comply with minimal standards for software and hardware.
They're doing it with the defense and energy departments and will eventually force that on strategic industries (refineries, pipelines, electrical grid, trains, airlines, ports, airports, terminals, ...).
Expect there to be a minimal baseline set for any technology, hardware, or VM image used in the cloud for a government purpose.
Re: (Score:2)
If you have to boot into recovery, how else than manually would you enter the bitlocker recovery key?
Stored on a USB key.
Or better yet, push a known good image using remote management tool, which they apparently don't use.
Re: (Score:2)
So fun fact, Microsoft actively makes it hard to save the key at setup time, as it won't let you save it to the HDD on the local machine. So unless you physically go to each machine and insert a thumb drive while setting up, it's basically as much work to do this beforehand as it is to type it out manually. Someone has to compile all the text files, name them, put them in one place, load up the thumb drive with the key specific to the machine... blah blah. Not much time savings.
also Most of the time imaging sof
Re: (Score:2)
So fun fact, Microsoft actively makes it hard to save the key at setup time, as it won't let you save it to the HDD on the local machine.
They do for home users. Corporate users can back up recovery keys to Active Directory. There's a PowerShell command to do it for you.
https://learn.microsoft.com/en... [microsoft.com]
From there you can export all of them to a USB key.
Also, most of the time imaging software is incompatible with antivirus / security. They don't recommend you do this, as it can CAUSE BSoD if you restore an image with 3rd party antimalware active. So they probably would not have this problem at all if they kept images, as they probably would not have Crowdstrike lol.
My friend manages an entire university's worth of virtualized servers and desktops. It's all VMware based, and they all run Crowdstrike. It all went down because of Crowdstrike, but recovering meant pushing the last known good images out and shutting down the crowdstrike auto-updater via grou
Re: (Score:1)
If you have to boot into recovery, how else than manually would you enter the bitlocker recovery key?
In a well-designed, modern system, by pushing them over ssh. Windows is not a well designed or modern system.
Re: (Score:2)
and then what? type it out by hand into your ssh console?
There are 2 wrongs here:
1. Implementing BitLocker on non-portable and non-essential systems such as kiosks. It shouldn't matter if the data on them is stolen; it should be ephemeral. But insurance companies push these mandates regardless of reason or logic.
2. Only having a printed copy. lol what? I would think you at least would save them in a text file, but the security team probably thought that was too much risk as someone could steal them easier. like
Re:Bitlocker (Score:5, Insightful)
News previously reported that Delta writes down Bitlocker recovery keys in a binder and has to type them in at each workstation to remove the .sys files manually. Assuming fidelity in both directions, I presume?
40 character codes, reportedly.
That's how BitLocker is supposed to work. The keys must be somewhere secure, usually a binder or thumb drive (and copies of them) under lock. It forces people to lock things down.
It's a perfect measure for securing hardware (sometimes it is a legal requirement, depending on the industry.)
The problem is that CrowdStrike literally bypassed Microsoft's security certification requirements for its driver by pushing uncertified p-code as an update (not configuration data files, but actual executable payloads that were never tested as required for kernel driver certification.)
And that's just the start of the snafu (creating a loophole to push uncertified, unsafe code.)
The second part of the snafu is that CrowdStrike didn't test it (because the bug is immediately apparent by bricking wherever the change is deployed.) From what I've read, they rely heavily (and brainlessly) on automated tests for "testing."
If one removes the human element from validation and verification, eventually an organization optimizes testing to "pass the test scripts." I'm 99.999% sure that's what happened (or some permutation thereof.)
The third part of the snafu is the sheer idiocy of doing a mass, global rollout all at once.
Deployment 101: Do slow roll-outs. Roll out to internal machines that resemble customer environments (ideally managed by a 3rd party separate from development and testing teams.) Observe till green lights are on before moving to the next phase.
Then pick the start of the week for the roll-out (never on a Friday because customers might not have staff available to roll shit back.)
Then have an action item list for deployments and for emergency rollbacks. Then rehearse the steps on that action item list. Then execute on a deployment specifically created to test a roll back. Observe till all green lights are on before moving to the next phase.
Then roll out to select customers. Observe till green lights are on before moving to the next phase.
Then roll out by regions or industries, starting with those that are less critical than others (PoS systems before airlines before hospitals and life-critical systems.) Observe till green lights are on before moving to the next phase.
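For what it's worth, the phased approach above can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Stage` class, the ring names, and the `deploy`, `is_healthy`, and `rollback` callbacks are all hypothetical stand-ins, not anything CrowdStrike or any real deployment tool exposes. The point is the gate between rings: nothing moves on until every host in the current ring reports healthy.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Stage:
    name: str          # hypothetical ring name, e.g. "internal" or "pos"
    hosts: List[str]   # machines that belong to this ring


def staged_rollout(
    stages: List[Stage],
    deploy: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    rollback: Callable[[str], None],
) -> List[str]:
    """Deploy ring by ring; roll back and stop at the first unhealthy ring.

    Returns the names of the rings that completed with every host healthy.
    """
    completed: List[str] = []
    for stage in stages:
        deployed: List[str] = []
        for host in stage.hosts:
            deploy(host)
            deployed.append(host)
        # Gate: "observe till green lights are on" before the next phase.
        if not all(is_healthy(host) for host in deployed):
            for host in deployed:
                rollback(host)
            break  # later, more critical rings are never touched
        completed.append(stage.name)
    return completed
```

With rings ordered internal, select customers, PoS, airlines, hospitals, an update that breaks an early ring gets rolled back there and never reaches the later ones. A real system would also need a soak time per ring and a rehearsed rollback path, which this sketch leaves out.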
None of that happened. CrowdStrike is run by overworked amateurs or cowboys. This affected hospitals, so I will be surprised if no one died, got injured or was prevented from getting treatment.
Amateur hour cannot be acceptable for critical systems of such magnitude. Companies that want to play in such fields need to take it seriously, or just stay making phone apps or something.
Re: (Score:2)
Agreed. It should be less difficult to tell working-in-the-garage amateurs from amateurs with a $68 Billion market cap.
Re: (Score:3)
This is an industry problem. Everyone has forgotten all about 'Availability'. About the only part of CIA that gets any attention now is 'Confidentiality', and maybe some operations people occasionally get a CIO type to listen to them about 'Integrity' by making some fraud prevention pitch to a disinterested internal security organization.
The info-sec industry has responded and the response is: speed speed speed; mixed with buy our anti-hacker spray and all your problems will be gone.
A lot of people have cor
Re: (Score:2)
I agree. Sure, they can say it's there to protect from attacks in progress, but it has direct access and hence is already inherently more dangerous than most attacks. The analogy should maybe be with medicine: first, do no harm.
The kernel is responsible for security. If loading code into the kernel is the answer then there's insufficient design and capability in the kernel. Develop a more secure kernel which has more powerful auditing.
Re: (Score:2)
One report I saw was that the bad update file contained all zeros. So not even basic sanity checking before deploying.
Re: (Score:2)
That has got to be an alternative fact. What I read is they had an invalid pointer in a jump-table.
Re: (Score:2)
Not necessarily. 0x00000000 IS an invalid pointer, after all.
Re:Bitlocker (Score:5, Informative)
The root problem that caused the BSoD wasn't even related to dynamic code, it was a bug in the signed driver itself as it read a null pointer from the update file and dereferenced it without doing any validation. Maybe it would have run dynamic code from the file later, not sure, but the crash happens before that point.
Microsoft should introduce fuzzing of data files as a mandatory requirement for WHQL certification. If a driver can't fail gracefully when its data files are corrupt or otherwise broken, it shouldn't be WHQL certified.
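To make the "validate, then fail gracefully" idea concrete, here is a rough Python sketch. Everything about the file layout is made up for illustration: the `MAGIC` bytes, the `TABLE_SIZE` limit, and the header-plus-indices format are assumptions, not CrowdStrike's channel file format or anything WHQL actually tests. The parser rejects malformed input with `ValueError` instead of trusting it, and the small fuzz loop mutates a valid file to check that rejection is the only failure mode.

```python
import random
import struct
from typing import List

MAGIC = b"CHNL"     # hypothetical magic bytes, not any real vendor format
TABLE_SIZE = 256    # hypothetical number of valid handler slots


def parse_update(blob: bytes) -> List[int]:
    """Parse a made-up update file; raise ValueError rather than crash."""
    if len(blob) < 8:
        raise ValueError("file too short for header")
    magic, count = struct.unpack_from("<4sI", blob, 0)
    if magic != MAGIC:
        raise ValueError("bad magic")  # an all-zero file is rejected here
    if len(blob) != 8 + 4 * count:
        raise ValueError("length does not match entry count")
    entries = [
        int.from_bytes(blob[8 + 4 * i: 12 + 4 * i], "little")
        for i in range(count)
    ]
    for index in entries:
        # 0 stands in for a null pointer; reject it and anything out of range.
        if index == 0 or index >= TABLE_SIZE:
            raise ValueError(f"invalid table index {index}")
    return entries


def fuzz(iterations: int = 1000, seed: int = 0) -> int:
    """Mutate a valid file at random; return how many inputs were rejected.

    Any exception other than ValueError propagates and counts as a fuzz failure.
    """
    rng = random.Random(seed)
    valid = MAGIC + struct.pack("<I", 2) + struct.pack("<2I", 7, 42)
    rejected = 0
    for _ in range(iterations):
        blob = bytearray(valid)
        for _flip in range(rng.randint(1, 4)):
            blob[rng.randrange(len(blob))] = rng.randrange(256)
        if rng.random() < 0.2:
            blob = blob[: rng.randrange(len(blob) + 1)]  # sometimes truncate
        try:
            parse_update(bytes(blob))
        except ValueError:
            rejected += 1
    return rejected
```

A kernel driver obviously isn't written in Python, and a real fuzzer would be coverage-guided rather than random byte flips. The shape is the same, though: every field read from the file is checked before it is used, and the test harness treats anything but a clean rejection as a bug.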
Re: (Score:2)
That rollout process sounds great for new software. Not for channel updates on cybersecurity suites. You don't sit around waiting to push potential solutions to active zero-day exploits. There's a reason these often get pushed out quick. That said, that's just on your last point (the global rollout piece), not an excuse for not properly testing the change internally.
Re: Bitlocker (Score:2)
Re: (Score:2)
That rollout process sounds great for new software. Not for channel updates on cybersecurity suites. You don't sit around waiting to push potential solutions to active zero-day exploits. There's a reason these often get pushed out quick. That said, that's just on your last point (the global rollout piece), not an excuse for not properly testing the change internally.
This is specifically designed for logical/code changes for cybersecurity suites. The only exceptions are for data/configuration changes or for hot deployments for emergency patches (think patching a zero-day exploit.)
This supposed update by CrowdStrike wasn't a config change, nor a hot/emergency deployment, but what was supposed to be a routine update.
I could understand if this was a zero-day exploit fix, but it wasn't.
Re: (Score:2)
Amateur hour cannot be acceptable for critical systems of such magnitude.
Capital says otherwise.
Companies that want to play in such fields need to take it seriously, or just stay making phone apps or something.
You do not have Capital. You can speak to the wind all you want, it will change nothing.
Re: (Score:1)
Re: (Score:2)
The hospital my wife works at had over 20,000 workstations affected. They called in all hands, even inexperienced managers. Only 5 people had access to the bitlocker keys. Those 5 were stationed on a Teams chat channel. Everyone else had to go to a workstation and indicate the machine they needed a key for in the channel. This slowed things down due to typing the key manually, but manpower compensated. At the peak they were fixing 800 machines an hour. By the end of the day they had all their workstations b
Cancellations (Score:1)
Re: (Score:2)
I don't understand why they would want to cancel flights. That makes them liable for a lot of compensation, over and above having to provide a flight anyway, just at a different time. Unless the plane was going to be nearly empty, that seems like a negative.
Re:Cancellations (Score:4, Insightful)
Re:Cancellations (Score:4, Interesting)
Doubtful that they'll pay much, if any, compensation for this. And then that compensation will only be to people who have the time and energy to follow through to obtain it.
Depends on the destination. For flights landing in Europe it will be 600EUR per passenger if cancelled within 14 days of departure (yes US citizens can claim this too). The regulator will also issue quite hefty fines if the compensation options aren't made clear to the passenger and if the form is too difficult to fill out. Most airlines require literally only 3 fields filled out to get the claim started.
That's on top of the other issues I mentioned in my other reply.
Re: (Score:2)
Lawful Masses, an actual lawyer, disagrees. See his video on YouTube, but he seems to think that the negligence here is enough for lawsuits to succeed against Crowdstrike. Nothing in the EULA will save them.
Re: (Score:2)
They don't want to cancel flights.
They have to because they can't get crews where they need to be to keep those flights scheduled. This is the same multi-day fuck-up that Southwest had to deal with after their systems were beshitted a while back.
Clearly they didn't learn from others' mistakes.
Re: (Score:2)
Yeah, that's what I was thinking. Delta has optimized their routes heavily enough that the chain reactions are massive. But OP seemed to think otherwise.
Re: (Score:2)
My experience is that airlines will use any excuse to cancel flights, whether legitimate or not. It doesn't seem outside of the realm of possibilities that they're slow-walking fixes, regardless of recovery method.
No, your experience is just biased because it happens to you. There's literally zero benefit to an airline for cancelling a flight. It screws up their roster, causes problems for rebooking customers, depending on destinations will require paying out customer fines, causes unpredictable location of planes with knock on effects to other flights, throws out predictable maintenance schedules, all the while they are required to pay for the take-off and landing slots at airports again, and god forbid they play ca
Re: (Score:2)
Yeah, I'm sure they're purposefully slow-walking fixes because they want increased regulatory scrutiny and a televised curbstomping of the CEO from Congress critters because they left tens of thousands of passengers stranded in places they don't want to be.
Or they could just be incompetent. Which is more likely?
Re: (Score:2)
Airlines lose money every minute an aircraft is on the ground, idle.
That's why there are requirements in the industry to not allow air crew to work more than a certain number of hours daily, and with proper rest time requirements.
Otherwise the airlines will want to push the air crew to fly till they are tired and unsafe.
So no, I doubt they use any excuse to cancel flights - not only are they on the hook for compensation, they are also losing money by having the aircraft idle.
Oof. (Score:2)
Re:Oof. (Score:5, Insightful)
No one takes responsibility for their own failures. Corporations 101. You might want to look up authoritarian before throwing around words you don't understand.
Re:Oof. (Score:5, Interesting)
Re: (Score:2)
"Authoritarianism is all about passing the buck."
Even if we stipulate that, it does not follow that passing the buck is all about authoritarianism.
In addition, in your example you do not describe any "passing the buck".
Re: (Score:2)
Not just financial cost, but also moral cost. Hence the extreme hypocrisy and ritualistic scapegoating that features as a defining characteristic. Elites of an authoritarian state play musical chairs, dancing around each other committing the same crimes, but when the music stops - i.e., when a bill periodically comes due - they stop the music and make the wea
Crowdstrike thanks Congress (Score:3)
Meanwhile, Southwest is relieved (Score:2)
They're thinking about how glad they are that it's not *them* melting down this time.
Ugh, CrowdStrike duh !!! (Score:2)