On Sunday, 21-07-2024 at 08:38 The Wanderer wrote:
> On 2024-07-20 at 09:19, jeremy ardley wrote:
> 
> > On 20/7/24 18:35, George at Clug wrote:
> > 
> >> On Saturday, 20-07-2024 at 13:54 hlyg wrote:
> >> 
> >>> crowdstrike makes news headlines, many Windows become blue
> >>> screens
> >> 
> >> The CrowdStrike issue was not a Windows issue, it was a CrowdStrike
> >> issue.
> >> 
> >> The problem did not affect our Windows computers as we have not
> >> installed CrowdStrike software.
> >> 
> >> I think the media have a habit of over exaggerating things.
> > 
> > The problem was not CrowdStrike as such. It happens in the best of
> > operations.
> > 
> > The problem is the Windows Systems Administrators who contracted for
> > / allowed unattended remote updates of kernel drivers on live
> > hardware systems. This is the height of folly and there is no
> > recovery if it causes a BSOD.
> 
> Speaking as someone who administers (part of) a CrowdStrike Falcon
> deployment at my workplace, although I was not involved in selecting it
> and would not be able to decide to switch to something else: I do not
> believe this is a fair description of what happened.
> 
> CrowdStrike Falcon does not manage kernel drivers in general. It manages
> its own locally-installed client, which happens to include some
> kernel-level drivers. The update in this case does not appear to have
> actually modified any of those drivers; it appears to have added a new
> data file for use by such a driver, and those data files appear to be
> misleadingly named in such a way that they look like drivers.
> 
> (I have not confirmed that personally yet, although I have access to the
> files in question and intend to do so, but people who are more familiar
> with Windows drivers than I am have stated that the files in question do
> not comport with the binary file format used by Windows driver files.)
> 
> All the sysadmins involved did is agree to let an antivirus-equivalent
> utility update itself, and its definitions. I would be surprised if this
> could not have easily happened with *any* antivirus-type utility which
> has self-update capability; I'm fairly sure all modern broad-spectrum
> antivirus-etc. suites on Windows do kernel-level access in similar
> fashion. CrowdStrike just happens to be the company involved when it
> *did* happen.
> 
> That the sysadmins decided to deploy CrowdStrike does not make it
> reasonable to fault them for this consequence, any more than e.g. if a
> gamer decided to install a game, and then the game required a patch to
> let them keep playing, and that patch silently included new/updated DRM
> which installed a driver which broke the system (as I recall some past
> DRM implementations have reportedly done), it would then be reasonable
> to fault the gamer. In neither case was the consequence foreseeable from
> the decision.
> 
> > The situation is recoverable if all the windows machines are virtual
> > with a good backup/restore plan. The situation is not recoverable if
> > the kernel updates are on raw iron running Windows.
> 
> The situation is trivially recoverable if you can get access to the
> machine in a way which lets you either boot to safe mode and get
> local-administrator access, or lets you boot an alternative environment
> (e.g. live-boot media) from which you can read and write to the hard
> drive.
> 
> I've spent a fair chunk of my workday today going around to affected
> computers and performing a variant of the latter process.
> 
> Once you've done that, the fix is simple: delete, or move out of the
> way, a single file whose name claims that it's a driver. With that file
> gone, you can reboot, and Windows will come up normally without the
> bluescreen.
> 
> > Heads should roll but obviously won't
> 
> What good would decapitation do, here? At most, CrowdStrike's people are
> guilty of rolling out an insufficiently-tested update, or of designing a
> system such that it's too easy for an update to break things in this
> way, or that it's possible to break things in this way not with an
> actual new client version (which goes through a release cascade, with
> each organization deciding which of the most recent three versions each
> of their computers will get) but just with a data-files update (which,
> as we have seen here, appears to go out to all clients regardless of
> version).
> 
> The first would be poor institutional practice; the others would be
> potentially-questionable software design, although it's hard to know
> without seeing the internal architecture of the software in question and
> understanding *why* it's designed that way.
> 
> In either case, it's not obvious to me why decapitating a few scapegoats
> would *improve* the situation going forward, unless it can be determined
> that specific people were actually negligent.

Thanks Wanderer,

Please no 'decapitating', or I would have lost my head many years ago, and 
often (if that is possible).

Testing is important. Like 'backup and restore verification', it is often
considered insufficient in hindsight after an incident, but rarely considered
insufficient before the incident.

Even with our best testing, we all make mistakes from time to time, and I have 
made my fair share.

My aim is not to blame, but it is necessary to identify the cause and to 
carefully consider how to mitigate further occurrences. 

Overreaction is not good. One decision might be to stop using anti-virus
software, which would remove the risk of anti-virus software bugs causing
outages, but that would be a far worse solution than an occasional, rare
outage.

As for testing: testing IS necessary, but it will only ever be testing. 1) It
is not possible to test for everything; 2) over-testing can cause issues too,
while still not capturing all potential issues.

I want to thank all the people from CrowdStrike and all the people applying the
fixes. Thank you for quickly restoring services. Keep up the great work of
protecting our Internet services.

George.


