On 21/7/24 06:38, The Wanderer wrote:
The first would be poor institutional practice; the others would be potentially-questionable software design, although it's hard to know without seeing the internal architecture of the software in question and understanding *why* it's designed that way. In either case, it's not obvious to me why decapitating a few scapegoats would *improve* the situation going forward, unless it can be determined that specific people were actually negligent.
The CrowdStrike outage emulated the very thing it is alleged to protect against - a zero day exploit.
The difference is that CrowdStrike has a far better distribution mechanism: all its victims willingly accepted it being put on their machines and willingly accepted automatic updates, each of which could potentially cause a failure.
Given the time delays in recovery, and the reports from many organisations of people having to drive to physical locations to reset machines, there were clearly no effective mitigation or recovery plans in place.
There are ways to mitigate a zero day exploit, such as Out-of-Band Management (OOBM) or a Baseboard Management Controller (BMC), so that the system can at least be recovered remotely, and likely automatically. Alternatively, services can run virtually and be reset automatically by monitoring systems.
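As a rough sketch of the kind of automatic reset I mean, assuming nothing about any particular product: a watchdog that counts consecutive failed health checks and triggers a recovery action once a threshold is reached. The `Watchdog` class and its `check` and `reset` callables are hypothetical names of my own; in practice `reset` would wrap whatever your BMC or hypervisor offers, and `check` whatever probe your monitoring already runs.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Watchdog:
    """Trigger a reset after `threshold` consecutive failed health checks."""
    check: Callable[[], bool]   # hypothetical health probe; True means healthy
    reset: Callable[[], None]   # hypothetical recovery action (BMC or hypervisor)
    threshold: int = 3          # assumed value; tune to tolerate brief blips
    failures: int = 0           # consecutive failures seen so far

    def poll(self) -> bool:
        """Run one health check; return True if a reset was triggered."""
        if self.check():
            self.failures = 0   # any success clears the failure streak
            return False
        self.failures += 1
        if self.failures >= self.threshold:
            self.reset()
            self.failures = 0   # start counting afresh after the reset
            return True
        return False
```

Calling `poll()` on a schedule from a machine that is not itself dependent on the monitored host is the important part; requiring several consecutive failures avoids resetting a host over a single dropped probe.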
There is also the system design issue that, even if the majority of systems are immune, the failure of a few key systems will take down a network. Active Directory servers seem a particularly weak point.
So my point still stands. Those responsible for mitigation of faults/zero day exploits in many cases were negligent in their system and process design. Specifically they did not install hardware and software that could be remotely and automatically managed out of band and they provided essential services such as Active Directory on vulnerable hosts with often no easy way to recover them.
On a second level I do have to ask whether CrowdStrike and equivalent reactive monitoring systems actually provide value. Yes, they reduce the time a zero day exploit has to be effective, but you have to assume there *will* be a serious exploit and you *will* lose functionality and/or data. Focusing on resilience of service, hardening of software, and managing data so that even if stolen it is of no value seems to be more useful.