On 2024-07-20 at 09:19, jeremy ardley wrote:

> On 20/7/24 18:35, George at Clug wrote:
>
>> On Saturday, 20-07-2024 at 13:54 hlyg wrote:
>>
>>> crowdstrike makes news headlines, many Windows become blue
>>> screens
>>
>> The CrowdStrike issue was not a Windows issue, it was a CrowdStrike
>> issue.
>>
>> The problem did not affect our Windows computers as we have not
>> installed CrowdStrike software.
>>
>> I think the media have a habit of over exaggerating things.
>
> The problem was not CrowdStrike as such. It happens in the best of
> operations.
>
> The problem is the Windows Systems Administrators who contracted for
> / allowed unattended remote updates of kernel drivers on live
> hardware systems. This is the height of folly and there is no
> recovery if it causes a BSOD.

Speaking as someone who administers (part of) a CrowdStrike Falcon
deployment at my workplace, although I was not involved in selecting it
and would not be able to decide to switch to something else: I do not
believe this is a fair description of what happened.

CrowdStrike Falcon does not manage kernel drivers in general. It
manages its own locally-installed client, which happens to include some
kernel-level drivers.

The update in this case does not appear to have actually modified any
of those drivers; it appears to have added a new data file for use by
such a driver, and those data files appear to be misleadingly named in
such a way that they look like drivers. (I have not confirmed that
personally yet, although I have access to the files in question and
intend to do so, but people who are more familiar with Windows drivers
than I am have stated that the files in question do not comport with
the binary file format used by Windows driver files.)

All the sysadmins involved did was agree to let an antivirus-equivalent
utility update itself, and its definitions. I would be surprised if
this could not have easily happened with *any* antivirus-type utility
which has self-update capability; I'm fairly sure all modern
broad-spectrum antivirus-etc. suites on Windows do kernel-level access
in similar fashion. CrowdStrike just happens to be the company involved
when it *did* happen.

That the sysadmins decided to deploy CrowdStrike does not make it
reasonable to fault them for this consequence, any more than e.g. if a
gamer decided to install a game, and then the game required a patch to
let them keep playing, and that patch silently included new/updated DRM
which installed a driver which broke the system (as I recall some past
DRM implementations have reportedly done), it would then be reasonable
to fault the gamer. In neither case was the consequence foreseeable
from the decision.
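
Returning to the file-format point above: a rough way to check that
kind of claim for yourself is sketched below in Python. It relies only
on the general fact that Windows drivers are PE images, which begin
with a DOS "MZ" header whose field at offset 0x3C points to a "PE\0\0"
signature. The function name is made up for illustration, and this is
a sanity check under those assumptions, not a full validation of a
driver file and not anything taken from CrowdStrike's tooling.

```python
import struct


def file_looks_like_pe(path):
    """Return True if the file at `path` has the basic PE image markers.

    Hypothetical helper: it checks only the DOS "MZ" magic and the
    "PE\\0\\0" signature that the DOS header points to. It does not
    parse or validate the rest of the image.
    """
    with open(path, "rb") as fh:
        dos_header = fh.read(0x40)
        # A PE image starts with a 64-byte DOS header beginning "MZ".
        if len(dos_header) < 0x40 or dos_header[:2] != b"MZ":
            return False
        # The little-endian 32-bit field at 0x3C is the file offset of
        # the PE signature.
        (pe_offset,) = struct.unpack_from("<I", dos_header, 0x3C)
        fh.seek(pe_offset)
        return fh.read(4) == b"PE\0\0"
```

Calling file_looks_like_pe() on a genuine .sys driver should return
True; a data file that merely carries a .sys extension would be
expected to return False.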

> The situation is recoverable if all the windows machines are virtual
> with a good backup/restore plan. The situation is not recoverable if
> the kernel updates are on raw iron running Windows.

The situation is trivially recoverable if you can get access to the
machine in a way which lets you either boot to safe mode and get
local-administrator access, or lets you boot an alternative environment
(e.g. live-boot media) from which you can read and write to the hard
drive. I've spent a fair chunk of my workday today going around to
affected computers and performing a variant of the latter process.

Once you've done that, the fix is simple: delete, or move out of the
way, a single file whose name claims that it's a driver. With that file
gone, you can reboot, and Windows will come up normally without the
bluescreen.

> Heads should roll but obviously won't

What good would decapitation do, here?

At most, CrowdStrike's people are guilty of rolling out an
insufficiently-tested update, or of designing a system such that it's
too easy for an update to break things in this way, or that it's
possible to break things in this way not with an actual new client
version (which goes through a release cascade, with each organization
deciding which of the most recent three versions each of their
computers will get) but just with a data-files update (which, as we
have seen here, appears to go out to all clients regardless of
version).

The first would be poor institutional practice; the others would be
potentially-questionable software design, although it's hard to know
without seeing the internal architecture of the software in question
and understanding *why* it's designed that way.

In either case, it's not obvious to me why decapitating a few
scapegoats would *improve* the situation going forward, unless it can
be determined that specific people were actually negligent.

-- 
   The Wanderer

The reasonable man adapts himself to the world; the unreasonable one
persists in trying to adapt the world to himself. Therefore all
progress depends on the unreasonable man.
                                         -- George Bernard Shaw