After Crowdstrike Outage, FSF Argues There's a Better Way Forward (fsf.org) 139
"As free software activists, we ought to take the opportunity to look at the situation and see how things could have gone differently," writes FSF campaigns manager Greg Farough:
Let's be clear: in principle, there is nothing ethically wrong with automatic updates so long as the user has made an informed choice to receive them... Although we can understand how the situation developed, one wonders how wise it is for so many critical services around the world to hedge their bets on a single distribution of a single operating system made by a single stupefyingly predatory monopoly in Redmond, Washington. Instead, we can imagine a more horizontal structure, where this airline and this public library are using different versions of GNU/Linux, each with their own security teams and on different versions of the Linux(-libre) kernel...
As of our writing, we've been unable to ascertain just how much access to the Windows kernel source code Microsoft granted to CrowdStrike engineers. (For another thing, the root cause of the problem appears to have been an error in a configuration file.) But this being the free software movement, we could guarantee that all security engineers and all stakeholders could have equal access to the source code, proving the old adage that "with enough eyes, all bugs are shallow." There is no good reason to withhold code from the public, especially code so integral to the daily functioning of so many public institutions and businesses. In a cunning PR spin, it appears that Microsoft has started blaming the incident on third-party firms' access to kernel source and documentation. Translated out of Redmond-ese, the point they are trying to make amounts to "if only we'd been allowed to be more secretive, this wouldn't have happened...!"
We also need to see that calling for a diversity of providers of nonfree software that are mere front ends for "cloud" software doesn't solve the problem. Correcting it fully requires switching to free software that runs on the user's own computer. The Free Software Foundation is often accused of being utopian, but we are well aware that moving airlines, libraries, and every other institution affected by the CrowdStrike outage to free software is a tremendous undertaking. Given free software's distinct ethical advantage, not to mention the embarrassing damage control underway from both Microsoft and CrowdStrike, we think the move is a necessary one. The more public an institution, the more vitally it needs to be running free software.
For what it's worth, it's also vital to check the syntax of your configuration files. CrowdStrike engineers would do well to remember that one, next time.
This is stupid (Score:3, Interesting)
Re:This is stupid (Score:5, Insightful)
It's not about open or closed source. This was a process error. Allowing components of high availability systems to all update on the same day is a wrong design choice. It would be just as wrong under Linux.
Re:This is stupid (Score:4, Interesting)
Re:This is stupid (Score:5, Informative)
Microsoft pushes a Defender update most days, and may push more than one update in a day. Regardless of how many, they're bucketed into three groups: a small one that acts as a trial group, a large one that encompasses most users, and another small one that is used for systems where stability is a more important factor. Within those, there is some additional randomness for when each system gets it. All this happens after internal test rollouts that look for immediate crashes. In doing so, they avoid exactly what Crowdstrike did.
And Microsoft isn't the first to do this. This has been standard procedure for decades within the AV community. This was an enormous failure on the part of Crowdstrike.
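The bucketing described above can be sketched roughly as follows. This is only an illustration of the idea, not Microsoft's actual mechanism: the ring names, the percentages, and the helpers `assign_ring` and `delay_minutes` are all assumptions made up for the example.

```python
import hashlib

# Hypothetical ring sizes: a small trial group, a large main group, and a
# small stability-first group. The percentages are illustrative only.
RINGS = [("trial", 5), ("broad", 85), ("stable", 10)]


def _stable_hash(key: str) -> int:
    """Return the same large integer for the same string on every run."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def assign_ring(machine_id: str) -> str:
    """Deterministically place a machine in one of the rollout rings."""
    bucket = _stable_hash(machine_id) % 100
    upper = 0
    for name, percent in RINGS:
        upper += percent
        if bucket < upper:
            return name
    return RINGS[-1][0]  # only reached if the percentages sum to under 100


def delay_minutes(machine_id: str, update_id: str, window: int = 60) -> int:
    """Per-machine jitter so a ring does not update all at once."""
    return _stable_hash(machine_id + ":" + update_id) % window
```

Hashing the machine ID keeps each machine in the same ring across updates, while mixing in the update ID varies the jitter from one update to the next.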
Re: (Score:2)
Microsoft isn't the first to do this. This has been standard procedure for decades within the AV community.
There's a line from the movie Tron: "The standard, substandard training which will result in your eventual elimination."
Snakeoil Community (Score:2)
>> This has been standard procedure for decades within the AV community.
correction:
This has been standard procedure for decades within the Snakeoil Community.
Re: (Score:2)
Another factor is that a Microsoft update can be handled by the built-in startup recovery, because Windows knows that a Defender update or a driver update was applied today and knows how to roll it back. Crowdstrike does its own thing without going through a standardized-to-all update mechanism, so the computer is unable to recover from the boot failure automatically.
We need a standardized-to-all mechanism for software installation / update when such installation / update is capable of ruining a
Re: (Score:2)
You can't re-install the corrected update without rolling back first anyway.
WannaCry was widespread because too many computers were NOT applying updates in a timely manner. Why don't people update their computers promptly? One of the reasons cited is fear that an OS update may break formerly working applications. Organizations lacking dedicated resources and personnel to handle this kind of accident are even more likely to hold off on timely updates. Another reason is OS upd
Re:This is stupid (Score:4, Insightful)
You are. And because these updates are pushed in real time and can lead to boot failure if broken, that is a very high-risk operation model. At the very least, testing should have caught what was obviously a really stupid error. It did not, which means it was completely inadequate. But worse, pushing such updates which can crash the machine (and there is zero need to do it that way) and prevent reboot is an extremely bad idea in the first place. Just think what happens if somebody compromises the Cloudstrike supply chain to its customers.
This is an abysmal failure on many levels, both on the side of Cloudstrike and on the side of Microsoft:
1. Updates to configuration that cannot be blocked or delayed by customers, yet that can lead to the observed problems.
2. A kernel-level interface that can crash the machine. Microsoft provided nothing more adequate.
3. A software architecture by Cloudstrike that, after a crash just boots the same thing and crashes again. Proper risk analysis would have identified that risk. Proper risk management would have eliminated it.
4. Ridiculously inadequate testing by Cloudstrike.
5. A complete failure to do proper risk assessment and risk management, both by Cloudstrike and by Microsoft.
The conclusion here is that Cloudstrike does not do professional work, but neither does Microsoft and both should not be relied on for anything critical.
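Point 3 in the list above, the crash-then-reboot-into-the-same-thing loop, is the kind of risk a boot counter can address. The following is a minimal sketch under stated assumptions, not a description of how any real product works: `ContentState`, `MAX_FAILED_BOOTS`, and both hook functions are hypothetical, and a real agent would have to persist this state somewhere that survives a crash.

```python
from dataclasses import dataclass

MAX_FAILED_BOOTS = 2  # hypothetical threshold before rolling back


@dataclass
class ContentState:
    """State that would need to be persisted across reboots."""
    active: str            # content version loaded at boot
    last_known_good: str   # last version that completed a healthy boot
    failed_boots: int = 0  # consecutive boots that never reported healthy


def on_boot_start(state: ContentState) -> str:
    """Pick the version to load, rolling back after repeated failures."""
    if state.failed_boots >= MAX_FAILED_BOOTS:
        state.active = state.last_known_good
        state.failed_boots = 0
    # Count this boot as failed until on_boot_healthy() says otherwise.
    state.failed_boots += 1
    return state.active


def on_boot_healthy(state: ContentState) -> None:
    """Called once the system has stayed up long enough to be trusted."""
    state.last_known_good = state.active
    state.failed_boots = 0
```

With a threshold of two, a bad version gets two attempts and the third boot falls back to the last version that was known to work.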
Re: (Score:2)
I don't think there's anything inherently wrong with it being "a single day" -- there are legitimate tradeoffs with security here that want unusually fast rollouts.
Even still, if this thing had been rolled out over even a couple of hours, even with the crudest of telemetry, it would have presumably been stopped when it only crashlooped a few percent of machines. That still would have been a huge annoyance but it wouldn't have been stop-the-business bad for the large majority of affected businesses.
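A gradual rollout with crude telemetry, as described above, might look something like this. It is a sketch only: `staged_rollout`, the wave size, and the 2% failure threshold are assumptions chosen for illustration, and `push` and `is_healthy` stand in for whatever deployment and health-check mechanisms a vendor actually has.

```python
from typing import Callable, List, Sequence


def staged_rollout(
    machines: Sequence[str],
    push: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    wave_size: int = 100,
    max_failure_rate: float = 0.02,
) -> List[str]:
    """Push an update wave by wave and stop once a wave looks unhealthy.

    Returns the machines that received the update before the halt.
    """
    updated: List[str] = []
    for start in range(0, len(machines), wave_size):
        wave = machines[start:start + wave_size]
        for machine in wave:
            push(machine)
        updated.extend(wave)
        # Crude telemetry: how many machines in this wave failed to check in?
        failures = sum(1 for machine in wave if not is_healthy(machine))
        if failures / len(wave) > max_failure_rate:
            break  # halt: the remaining machines never see the update
    return updated
```

If every machine in the first wave crash-loops, only that wave is affected and the rest of the fleet keeps running the previous content.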
Re: (Score:2)
Re:This is stupid (Score:5, Insightful)
Some more insight, then. Critical code (code running in kernel mode, code critical to starting up/maintaining the availability of a system) should not collapse because of an erroneous or missing external file unless that file is critical to the main function of the system and the system is better off not starting than starting without the contents of that file. I'm (happily) on the sidelines for this one, but so far the evidence I've seen suggests that CrowdStrike fucked up big time, in multiple ways. If their intention was to provide a defense system with "near realtime" global distribution, then their internal controls were obviously not up to the job of supporting the kind of responsibility they took upon themselves. If, on the other hand, they fully advised their customers of the risk of having production servers all auto-update simultaneously, in real time, from what is to them (the customers) a foreign source and their customers all chose to take the "easy road" and assume that CrowdStrike was infallible, then their customers have themselves to blame.
Lots of programmers take on writing critical code (device drivers, for example). Having this code fail because of an error in an external file is just sad.
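The principle above, that critical code should survive a bad external file, can be illustrated with a small sketch. Everything here is hypothetical: the JSON format, the `REQUIRED_FIELDS` schema, and the `parse_rules` and `load_rules` helpers are assumptions for the example and say nothing about CrowdStrike's real file format or code.

```python
import json
from typing import List

REQUIRED_FIELDS = ("name", "pattern")  # hypothetical schema


def parse_rules(raw: bytes) -> List[dict]:
    """Parse and validate a rules file; raise ValueError on any problem."""
    try:
        rules = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        raise ValueError("unparseable rules file") from exc
    if not isinstance(rules, list) or not rules:
        raise ValueError("rules file must be a non-empty list")
    for rule in rules:
        if not isinstance(rule, dict):
            raise ValueError("each rule must be an object")
        for field in REQUIRED_FIELDS:
            if not isinstance(rule.get(field), str):
                raise ValueError("rule is missing field: " + field)
    return rules


def load_rules(raw: bytes, fallback: List[dict]) -> List[dict]:
    """Return validated rules, or keep the previous good set on any error."""
    try:
        return parse_rules(raw)
    except ValueError:
        # A bad file degrades protection; it must not take the host down.
        return fallback
```

The design choice is that a malformed file leaves the previous rules in force instead of propagating an exception into code the machine needs in order to boot.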
I think that attempting to blame Microsoft is a red herring or an attempt to try and drag some bigger pockets into the blame pool in hopes of some future remuneration from those bigger pockets. Microsoft sucks for many reasons (including forced updates), but as far as I'm aware CrowdStrike had their big boy pants on and made this mistake all on their own. I don't think it's the OS vendor's duty to make sure that 3rd party software written to operate as critical code works unless the OS vendor makes a guarantee to that effect.
Re: (Score:3)
There is another element of this fiasco that seems to be overlooked: these systems had a drive-level encryption system without an enterprise management for the keys.
At my last job we had a management system that allowed control of the managed systems including pushing and deleting files in the encrypted file system. If we needed to push a group policy object to all the managed computers, the system could do that. If the systems needed to be booted into a recovery state, the management software had the enc
Re: (Score:2)
The "enterprise management system" was down due to the issue.
Even if you brought that up first, the other computers wouldn't be able to receive any group policy updates until they too were manually brought back online.
Re: (Score:2)
When I got a Windows 7 laptop the very first thing I did was to back up the drivers and extract all the serial numbers/keys etc. I then went through the BIOS settings, formatted the disk, and re-installed from scratch. This gets rid of any "shovelware" and gives you an install in a known state.
Plus you can put a Linux OS on there whilst you're at it :)
Re:This is stupid (Score:4, Interesting)
Having this code fail because of an error in an external file is just sad.
FWIW, in this case the external file contains P-code which then gets to run in kernel context.
I think that attempting to blame Microsoft is a red herring or an attempt to try and drag some bigger pockets into the blame pool in hopes of some future remuneration from those bigger pockets.
No, there is sufficient reason to also blame Microsoft. Tools like CrowdStrike must run in kernel mode in part because Microsoft doesn’t really give them a lot of other options. Back when NT 3.1 was being developed, Microsoft made the decision to only support Ring 0 and Ring 3 (kernel mode and user mode respectively) for performance reasons: switching between rings can take 150+ clock cycles, which is slow. But the Intel CPU supports four rings of execution, with Ring 1 intended for device drivers.
Modern Windows works in this way to this day. Had Microsoft been more focussed on safety and less on raw performance, drivers could run in Ring 1 and could be isolated from the kernel. A Ring 1 CrowdStrike Falcon Sensor driver could, in theory, be isolated from the system when it misbehaved, allowing the system to remain online. But Microsoft being Microsoft, they chased performance over safety — so we have a situation where an errant driver like the Falcon sensor can bring the whole system down.
If you want to see a system that does it right, look at how Falcon Sensor runs on macOS. There the Falcon Sensor is written as a modern System Extension [apple.com], and leverages DriverKit Endpoint Security extensions — where it has all the access it needs to system events, but runs completely in user mode. Should CrowdStrike on macOS run into a similar problem, the system can just isolate it and shut it down without crashing the entire system like Windows.
What the FSF is failing to say here is that Linux has the same basic flaws that Windows has when it comes to misbehaving drivers. Linux also only supports Ring 0/Ring 3, and doesn’t provide a way for something like the Falcon Sensor to run in user mode ala macOS. Indeed, certain Linux distros with certain kernel revisions have already had kernel panics due to CrowdStrike earlier this year [theregister.com].
You can’t wait a week for your security software to be updated when there are actors online actively exploiting zero-day vulnerabilities. CrowdStrike absolutely screwed the pooch on this one. But both Microsoft and Linux still assume we live in the device driver world of the 1990s, where you release a driver and maybe just do a few bug fixes every few months, and which eventually becomes stable enough not to change. In the 202X online world we need both security software that is constantly updated and appropriate driver protection guarantees to simply disable misbehaving drivers like this one. Unfortunately, the only major OS doing any work in this area seems to be Apple, and Linux could learn something from them in this regard. Maybe instead of claiming that being able to choose from multiple OS vendors using the same kernel is the solution, the FSF could instead work with the Linux kernel maintainers to look at mechanisms to isolate drivers, so when they misbehave they don’t take down the entire system with them.
Yaz
Re: (Score:2, Informative)
Linux also only supports Ring 0/Ring 3, and doesn't provide a way for something like the Falcon Sensor to run in user mode ala macOS
You can do an on access scanner in user space with fanotify, which can block file access until scanning is done.
Re: (Score:2)
I’ll admit I’ve never done development with fanotify, so I’m open to being corrected here.
From what I understand of fanotify, it’s well suited for something like a virus scanner — but what CrowdStrike Falcon Sensor does is much more than file-level scanning. It’s also doing in-memory checks, and looks for patterns of events that may indicate malicious activity.
Indeed, the P-code file that killed Windows instances the other week was intended to check for certain types of
Re: (Score:2, Troll)
It means that at least some of what it's doing could be done in user space. They could use kmemleak for detecting kernel memory leakage instead of doing it themselves, while process memory is available to any process with permissions through /proc. On linux, named pipes can be monitored from user space [ycombinator.com].
Re: (Score:2)
when my entirety what?
If you're so well-educated, how's about you explain what it does that can't be done by these mechanisms? Because it looks like you know less than nothing yourself so far.
Re: (Score:2)
Re: (Score:3)
The moment your boss (you know, the guy that typically approves the checks) puts security over speed, then we can blame Microsoft. They are simply providing the product that businesses want: performance over security.
Re: (Score:3)
You understand correctly. I'm telling you that's a gross design error. 100% wrong.
The criticality of your systems also represents a criticality against external attackers. The whole point of this class of update is that it gets pushed out on day one to prevent zero day attacks. It's not a design error to close security gaps as soon as they arise.
A better question is why the bluescreen happened in the first place. Why was the software able to do this? How was a seemingly simple configuration / definition update able to cause this to occur?
Don't throw the baby out with the bathwater.
Re: (Score:2)
It's not a design error to close security gaps as soon as they arise.
It can be and in the Crowdstrike incident, it clearly was. Security is a cost equation. Threat times vulnerability times incident cost.
Pushing a bad update was a small but non-zero threat. Times a 100% vulnerability times an exceptionally high incident cost.
Pushing updates slowly is also a vulnerability but with an appropriate defense in depth, the vulnerability is much much less than 100%. There's also a threat and it's higher than the bad update threat, but not enough higher to offset the reduced vulnerab
Re: (Score:2)
It can be and in the Crowdstrike incident, it clearly was.
Everything can be when done incorrectly. The point is to address the incorrect use of this kind of update, not to throw out a fundamental fast acting security practice because one company with a history of fuckups fucked up again.
Maybe it would be better if we stop calling this "update". Just like fetching the current spam list for Spamassassin isn't an update, or getting virus definition files from windows defender isn't an update. The point is there are parts of security that you want to be automated and
Re: (Score:2)
It's not about open or closed source. This was a process error. Allowing components of high availability systems to all update on the same day is a wrong design choice. It would be just as wrong under Linux.
I deal with troubleshooting systems. The present system is cloud storage of critical elements, and Microsoft allowing third parties to determine whether their ecosystem works.
Analysis - the bad guys have found a great way to bring the cloud down. If Crowdstrike has access to the OS in a manner that one problem can bring down the house of cards, you can bet the bad guys can have the same. Fatal security problem.
The cloud, once promoted as the path forward for modern computing. Completely secure, and s
Re: (Score:3)
You may be conflating two different issues. The Microsoft cloud event and the Crowdstrike event happened within hours of each other, but were unrelated.
Re: (Score:2)
Indeed. Also think what a nice juicy attack target Cloudstrike makes.
Re: (Score:2)
Indeed. Also think what a nice juicy attack target Cloudstrike makes.
If people aren't worried about this, they should be.
Re: (Score:2)
A lot of eggs in one rather flimsy basket. Very stupid.
Re: (Score:2)
Yes. The argument the FSF is making here is also that Linux is a lot less of a monoculture than Windows is. And that is a fair point.
Re:This is stupid (Score:4, Insightful)
The FSF makes a lot of arguments why open source can be a better choice. The ones here are not particularly on-point for the Crowdstrike outage. The same architectural errors in system design apply to both open and closed source. As past failures at Facebook and Amazon have shown us.
Re: (Score:2)
You're playing a game and your computer crashes. You don't blame the game.
What planet do you live on? The game is the top candidate cause for the crash until it's demonstrated to happen under other circumstances. The only way an OS can perfectly prevent it is to emulate *everything* so that only the emulated machine fails. That's a computational expense far beyond most folks' tolerance.
Re: (Score:3, Insightful)
a single operating system made by a single stupefyingly predatory monopoly in Redmond, Washington
This was an issue caused by Cloudstrike, not Microsoft. Many Windows systems did not use Cloudstrike. A similar product for Linux, FOSS or not, would have created a similar outage.
Re: (Score:2, Insightful)
Re: (Score:3)
Re: (Score:2)
It would, and in fact it did, about a month ago.
Re: This is stupid (Score:2)
Re: This is stupid (Score:2)
Re: (Score:2)
Not really - for large enterprises finding out something is going on quickly is as/more important than preventing initial foot holds in the first place.
The simple answer is that the regulators forced Microsoft to allow a loophole big enough to drive a system-crippling truck through their WOL program. Driver code should not be doing anything with data from user space other than flipping a few options on and off based on some very, very carefully defined structures sent across a restricted interface. They certain
Linux too (Score:5, Interesting)
Crowdstrike had already caused crashes in Linux too. https://www.theregister.com/20... [theregister.com]
Why would their software require kernel level access on Linux?
Re: (Score:3)
No need to feel jealous, Linux has reached feature parity with Windows :)
https://linux.slashdot.org/sto... [slashdot.org]
Not much need for clownstrike snakeoil (Score:2)
The difference is: Linux does not have such a pressing, vital need for this clownstrike snake-oil as Windows does.
Re: (Score:2)
Why would their software require kernel level access on Linux?
Same reason they need it on Windows. To provide functionality not offered by an OS API.
Re: (Score:2)
Linux has hooks to do what they need without running in the kernel. Apparently they're not interested in doing it the right way, but simply port their Windows garbage over.
Re: (Score:2)
Why would anyone in their right mind even run Anti-Virus software on their Linux box?
FSF at the airport (Score:5, Funny)
I can't help imagining an airport gate where the software is acting up and two admins get into a vi vs. emacs argument while trying to fix it.
Re: (Score:2)
I can't help imagining an airport gate where the software is acting up and two admins get into a vi vs. emacs argument while trying to fix it.
I had one of Her Majesty's finest border drones insist to me that I could not move gate and had to try rescanning my passport while the screen was showing the BIOS, boot sequence then Windows XP desktop.
Eventually he conceded that "my passport didn't work" after 15 minutes and allowed me to try a different gate. He made sure the next person put in their fruitless 15 m
Re: (Score:2)
I can't imagine this. An airline paying for two IT people? That sounds like a cost cut waiting to happen.
Re: (Score:2)
It would all be a foolish argument anyhow. We all know that Pico is better.
Re: (Score:2)
Nano (from Pico) FTW. :P
We've tried that too (Score:4, Insightful)
Instead, we can imagine a more horizontal structure, where this airline and this public library are using different versions of GNU/Linux, each with their own security teams and on different versions of the Linux(-libre) kernel...
...because that works SO well for mobile phones.
And it's going to work even worse for education. Realistically, a public library is lucky to have *one* person who *kinda* understands their computers. The odds they have a security *team* are essentially zero. Ditto public schools.
An airline or a major university? Sure, maybe. But a library?
Crowdstrike? (Score:3)
Re: Crowdstrike? (Score:2)
Re: Crowdstrike? (Score:2)
Re: (Score:2)
They may literally be dead sooner or later, due to the lawsuits coming out of this event. They can't escape liability with an EULA if the cause of the outage was negligence.
Re: (Score:2)
Their liability is limited even if it was negligence. Remember you're responsible for the BSOD, not how the business responds to it. The lack of a business continuity strategy is not in their control meaning a large portion of damages are out of their hands.
It'll hurt them a bit (I hope), but let's not kid ourselves: they won't be dead as a result of this.
Re: (Score:2)
If they did have business continuity insurance then the insurers will be looking to recover their losses.
Check Lawful Masses on YouTube, he did a video about this. He's an actual lawyer, unlike me who just plays one on Slashdot.
Re: (Score:2)
This company is dead.
A company carrying a valuation of USD60Bn is "dead".
OK, then....
Re: (Score:2)
When you’re more or less known for the work you do in one particular area, and you not only failed at your basic task of keeping computers up, accessible, and secure, you failed so badly that you caused the largest IT outage in history, then yeah, it isn’t unreasonable for people to declare them dead.
That market cap you’re citing? It’s down roughly 35% from where it was before the event. Not exactly the mark of a healthy company.
This event is now shining light on past, similar events
Re: (Score:2)
When you’re more or less known for the work you do in one particular area, and you not only failed at your basic task of keeping computers up, accessible, and secure, you failed so badly that you caused the largest IT outage in history, then yeah, it isn’t unreasonable for people to declare them dead.
Oh! I guess that's why Equifax is completely dead after their 2017 breach! Go check their stock - momentary drop, and back to normal within 1-2 years.
Re: (Score:2)
The breach affected none of their customers, so why would their business tank? That was an unsurprising result.
Re: (Score:2)
How many people impacted by Crowdstrike were paying customers? Wanna make a friendly wager on whether or not Crowdstrike is dead / will be around in a similar capacity in 5 years? :-)
Security by obscurity (Score:2, Interesting)
Is more likely to hide the problem from a future victim of an exploit than from the author of such an exploit.
Closed source just means that you can't check what's under the hood and have to trust that megacorp that's been in court a bunch of times for unethical behavior to be ethical.
This does not seem like a great bet to me.
I hate MS as much as the next guy, but... (Score:4, Interesting)
Let's be fair. This isn't on MS. It's on Crowdstrike. And don't forget, Crowdstrike wound up kernel panicking Linux a few months back.
Re: (Score:2)
MS set the tone and culture, provided the opportunity and set things up. Cloudstrike merely triggered the event after that.
Also note that MS Windows is so insecure that you need things like Cloudstrike.
Re: (Score:2)
Re: (Score:2)
Indeed. That is why Microsoft almost panicky tries to blame the EU. They are deeply afraid that too many people will notice that they share a lot of the blame here and that their products are actually pretty bad.
Re: (Score:2)
"Let's be fair. This isn't on MS. It's on Crowdstrike. And don't forget, Crowdstrike wound up kernel panicking Linux a few months back."
True, and frankly it's stupid to allow software like this on your system but often we're forced to by 'policy'. It's frustrating running Linux when policy makers insist on treating it like Windows and making us install all this anti-malware code which requires deep access to the system.
However, although might be unpopular here, Apple is unaffected because they won't let too
Re: (Score:3)
So when Crowdstrike did the same thing on Debian a while back, that was on Debian? The only reason that one didn't cause as widespread damage is because Debian is a relatively minor player in the enterprise space (I mean from a server/container standpoint).
Re: (Score:2)
The difference is, on Debian (or any Linux), they can solve the problem without running in the kernel. That they don't is on Crowdstrike.
On Windows there is no choice. And that's on Microsoft.
Re: (Score:2)
On Linux, at least in Kubernetes clusters, Crowdstrike uses eBPF to avoid running directly in the kernel. On Windows, Microsoft also implemented eBPF for Windows (https://github.com/microsoft/ebpf-for-windows) but CrowdStrike does not use it.
Re: (Score:2)
There is a kernel version for Linux as well. That caused crashes a few months ago. But it's not needed to do the job in Linux.
The Windows eBPF implementation is not complete. Also, Windows is a lot more API heavy than Linux, so it may well not be enough.
Re: (Score:2)
Why had all those businesses stopped doing their own update testing?
They had not. This was a configuration update and, with Microsoft's blessing, those do not need to be blockable. With Cloudstrike, you can only block code updates to do your own testing. Config updates go right through and, as we have seen, can make a system unbootable. An exceptionally stupid design.
Re: (Score:2)
Why had all those businesses stopped doing their own update testing?
You don't test definition updates. You push them out urgently. Almost no one ever tested this kind of thing so there was nothing for them to stop doing. The point here was that this update shouldn't have had the ability to cause a BSOD in the first place.
Resilience has been compromised in the name of security.
Security breaches are far worse than a drop in resilience. The overwhelming majority of companies came out the other end unaffected. A few airlines made some losses, but for the most part it was just a weekend of headaches for IT.
Re: (Score:2)
Exactly. That is a point that flies right over the heads of the Microsoft-apologists though.
IPv4-only ISPs are the problem (Score:3)
one wonders how wise it is for so many critical services around the world to hedge their bets on a single distribution of a single operating system made by a single stupefyingly predatory monopoly in Redmond, Washington.
But enough about the Nintendo Switch system software; let's talk about what it'd take to switch to self-hosted free software.
calling for a diversity of providers of nonfree software that are mere front ends for "cloud" software doesn't solve the problem. Correcting it fully requires switching to free software that runs on the user's own computer.
And this in turn requires more investment in IPv6 rollout among Internet access providers. It's 2024, and Frontier fiber service is still IPv4-only. Some other ISPs provide "dual-stack lite" service with full IPv6 alongside limited IPv4 that allows only outgoing connections. Because there aren't enough IPv4 addresses to go around, a whole neighborhood gets put behind one IPv4 address using carrier-grade network address translation (CGNAT). This situation makes it impractical for a customer of an ISP that uses dual-stack lite to run an on-premises server, as Frontier subscribers have no way to connect to it.
Dubious advice (Score:2)
The correct answer is to not allow third-party kernel extensions on your cloud infrastructure. Insist on the OS vendor providing user-space hooks for whatever you need to do, and if they won't, choose another OS vendor. This wasn't caused by "the cloud" or "closed-source software". It was caused by an architectural decision when they wrote the Crowdstrike Falcon kernel extension for Windows.
I've been told that had this happened on the Linux version of crowdstrike, the OS as a whole would have stayed up,
Re: (Score:2)
I've been told that had this happened on the Linux version of crowdstrike, the OS as a whole would have stayed up, because it is based on eBPF instead of being a kernel extension. And I think that the Mac version is also not running in the kernel. So out of the three architectures, only the Windows port had this design problem.
I don't know much about the "Debian Event" but it actually seems to be pretty much identical from the reports:
"In April, a CrowdStrike update caused all Debian Linux servers in a civic tech lab to crash simultaneously and refuse to boot. The update proved incompatible with the latest stable version of Debian, despite the specific Linux configuration being supposedly supported. The lab's IT team discovered that removing CrowdStrike allowed the machines to boot and reported the incident." - https://www.neowin [neowin.net]
Re: Dubious advice (Score:2)
So the OS is up and running, but the end user can't get any work done. No difference.
Re: (Score:2)
I'm not certain why you would be replying to me about MacOS, but since we're here, I'm unaware of anything magical in the various BSDs that would prevent a module running in the context of the kernel from borking the whole works there the same as any other popular platform.
Re: (Score:2)
I'm not certain why you would be replying to me about MacOS, but since we're here, I'm unaware of anything magical in the various BSDs that would prevent a module running in the context of the kernel from borking the whole works there the same as any other popular platform.
Apple's rules about not allowing non-driver kexts for pretty much any reason would be the "anything magical". Anything like Crowdstrike Falcon on macOS would almost certainly have to run in user space. So if it crashes, it crashes, and the rest of the OS should just roll its eyes, assuming nothing in the kernel ends up blocked waiting for some kind of permission ack from Falcon (which it shouldn't if their daemon isn't running).
Greg doesn't sound informed (Score:2)
"one wonders how wise it is for so many critical services around the world to hedge their bets on a single distribution of a single operating system made by a single stupefyingly predatory monopoly in Redmond, Washington"
Greg is conflating CS with Windows...
Greg is arguing that thousands of divergent systems will be more secure, in the face of all the evidence to the contrary.
Greg should not be talking about anything technical. Ever. Greg should be fetching coffee.
It's anti-virus software (Score:2)
The problem is that the update took millions of machines out that had to be fixed manually. There are several reasons for this:
1. They used a parser that can crash with the wrong input, and I bet they still use it. This needs a manual review plus fixes to make sure that whatever the input is, the parser will survive parsing it, and accept it or reject it. From e
wrong tree (Score:3)
This isn't about free vs. proprietary software.
It's about us security dudes telling people to patch everything immediately, people doing the logical thing - automating it - and then someone fucking it up and, look, everyone is down.
Under all of that, it's about trust. Do we trust companies like Crowdstrike with kernel-level access to our systems? Do we trust them with updates? Do we trust them enough to make fully automatic updates?
Re: (Score:2)
Do we trust companies like Crowdstrike with kernel-level access to our systems?
Wrong question.
Do we trust them more than the end user who is desperately trying to let the threat actors in?
Yes. The answer is yes.
This type of approach to securing endpoints exists not because of the threat actor. It exists because end users are so horrendously bad at behaving responsibly. All the time. With carefree abandon.
There is one truism about improving safety, as applied to automobiles: the safer the driver feels, the more risks they take.
What we need is an AGI that stabs the end user with a pointy st
Which tree? (Score:2)
Safety, security and reliability are three separate problems. People do tend to muddle them together. I find myself always bashing management with this issue.
- Security is about equipment protecting itself against people. The Internet being the cesspool that it is has a lot of this.
- Safety is about equipment protecting people against itself. This always has highest priority. But is also only of concern in a limited set of environments. Life threatening ones. Usually because of risky mechanical or lo
Re: (Score:2)
Do we trust them more than the end user who is desperately trying to let the threat actors in?
That's a bit of hyperbole and part of the problem. Users are that - users. They want to USE computers to do their actual jobs. Their actual jobs are not computers. They behave what you call irresponsibly when things get in their way, because guess what, on the other side of security is not user stupidity, it's the boss pressing his people to deliver faster and better. We've been through so many decades of "optimizing performance" that now the small hurdles of security are major obstacles.
re "optimizing perf
Re: (Score:2)
They behave what you call irresponsibly when things get in their way, because guess what, on the other side of security is not user stupidity, it's the boss pressing his people to deliver faster and better.
Yeah, no. The machine is not clicking on every link they received in unsolicited mail. The boss is not forcing them to download infectious garbage based on web pop-ups, boredom and general idiocy. Productivity is not driving them to browse garbage, infected sites.
Phishing remains so prevalent because it works.
End users are the primary entry point and will remain so until there are consequences for their actions. Sadly, HR will not engage at this level because - let's be blunt - HR are some of the worst o
Re: (Score:2)
Phishing remains so prevalent because it works.
We agree on that.
The boss is not forcing them to download infectious garbage based on web pop-ups, boredom and general idiocy.
No, but turning them into constantly under pressure cogs in a machine makes two things certain: One, they don't give a shit about the company and two, they don't have time to lean back and reflect on what they're doing.
People will find opportunities to rest. It is well studied that the max most people can focus on a task is about 90 minutes. After that, you need a break or your concentration nose-dives. People take those breaks. If they can't do it officially, they'll have long restroom vis
Re: (Score:2)
Ah, the usual. Putting more responsibility on the end user. With awareness campaigns and punishments. Because it has done fuck all for the past 30 years, so it'll certainly start working... any day now... any day...
Can we stop pretending the mouth-breathing farkwits have no personal agency here? They are *entirely* responsible for their actions.
"Oh, shame. The poor dullards are too tired and dumb to be held accountable."
WTH argument is this?
Re: (Score:2)
Personally, as mainly a cyclist and motorcyclist, I *really* like the idea of the "Tullock Spike":
"This is from legendary economist Gordon Tullock’s famous comment: “If the government wanted people to drive safely, they’d mandate a spike in the middle of each steering wheel".
The Tullock Spike, or, to sound more gearhead-friendly, the Tullock Steering Column was something first thought of by the noted economist Gordon Tullock. Tullock came up with the idea around the time seatbelts in cars w
FSF is ignoring the possibility of malice (Score:2)
I am not saying that the ClownStrike debacle was malicious, but they are in a great position for $EvilActor to get his stooge installed. This stooge "accidentally" breaks any QA procedures and a "buggy" update is sent out worldwide, causing mayhem at a time of $EvilActor's choosing. This time the effects were relatively benign -- although a pain and costly for those affected. But what if this had happened when $EvilActor was invading a neighbouring country or doing a large drug run or ... ?
ClownStrike update
Childishness is not a solution (Score:2)
That wasn't the cause of the problem... (Score:2)
The problem had nothing to do with open vs closed source, it was that they didn't sufficiently test the update, compounded by them releasing it globally at the same time. They should have of course tested more thoroughly, and then released in a "canary release" model, so when it crashed 0.1% of their customers they could have detected that and stopped the update before it crashed all their Windows customers.
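To make the "canary release" idea concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the function `canary_rollout`, the `apply_update` callback (which returns True if a host still reports healthy after the update), the wave sizes and the failure threshold. It is not how CrowdStrike or anyone else actually ships updates.

```python
import random

def canary_rollout(hosts, apply_update, stages=(0.001, 0.01, 0.1, 1.0),
                   max_failure_rate=0.01, seed=0):
    """Push an update in growing waves; halt when a wave fails too often.

    apply_update(host) is a hypothetical callback returning True when the
    host is still healthy after the update.
    """
    order = list(hosts)
    random.Random(seed).shuffle(order)  # avoid always hitting the same hosts first
    done = 0
    total_failures = 0
    for fraction in stages:
        target = max(1, round(len(order) * fraction))
        wave = order[done:target]
        if not wave:
            continue
        failures = sum(1 for host in wave if not apply_update(host))
        total_failures += failures
        done = target
        if failures / len(wave) > max_failure_rate:
            # Stop here: the remaining hosts never receive the bad update.
            return {"halted": True, "updated": done, "failures": total_failures}
    return {"halted": False, "updated": done, "failures": total_failures}

if __name__ == "__main__":
    fleet = [f"host-{i}" for i in range(10_000)]
    # A bad update that breaks every host is stopped after the first wave.
    print(canary_rollout(fleet, lambda host: False))
```

With a fleet of 10,000 and a first wave of 0.1%, an update that breaks every machine takes down 10 hosts instead of all of them, which is the whole point of the parent comment.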
Re: (Score:2)
Microsoft doesn't want to do any of that, regulatory bodies forced them to.
Re: (Score:2)
Bullshit. Microsoft was forced to provide "equal access", not to provide "insecure access". Moving their own stuff out of the kernel (where it has no business being anyway, just like the CrowdStrike crap) and opening the API would have worked just as well to comply with anti-trust regulations, but would have caused none of the vulnerabilities.
Re: (Score:2)
The exact same CrowdStrike tools exist on Linux and macOS. In fact, they *only* support Linux when it comes to containers, not Windows.
Re: (Score:2)
Actually, the bad idea was flagging the driver as boot-start - i.e., no boot without it.
The OS would have been fine without the driver.
Re: Can we just agree ... (Score:2)