On Thu, 27 Mar 2025, Corinna Vinschen wrote:

> On Mar 27 10:26, Jeremy Drake via Cygwin-patches wrote:
> > comment, it seems 8.0 is the odd-version-out here.
>
> Yeah, but we don't support 8.0 anymore, only 8.1.
>
> > BTW, something I would *like* to do but haven't figured out how to
> > accomplish cleanly yet is to follow the registers.  What I mean by this is
> > illustrated by what I did in the aarch64 version: I could find the call to
> > RtlEnterCriticalSection, then work backwards, find the add whose Rd was x0
> > (the register for the first (pointer) parameter in the calling
> > convention), then find the adrp whose Rd was the Rn of the add.  What I
> > would do on x86_64 is find the call to RtlEnterCriticalSection, find any
> > mov rcx, <reg> before, then find the lea <reg>, [rip+XXX] (where reg would
> > be rcx if there wasn't a mov rcx after the lea).  Unfortunately, the
> > variable length-ness doesn't lend itself to iterating backwards, so I am
> > not confirming that the lea actually ends up in rcx for the function call.
> > The only register correlation I do is that the register used in the
> > mov <reg>, QWORD PTR [rip+XXX] is then used in the next instruction that
> > must be test <reg>, <reg>.  The old code required that <reg> to be rbx,
> > but I don't see any reason that rbx is required...
>
> Yeah, reading x86_64 backwards will lead to confusion.  And no, rbx
> isn't required, any non-volatile register could do it.  It seems that
> rbx is used because of the way vc++ allocates register.

After taking out the Windows 8.0 case, I think this should be doable:

* when finding the lea that we're already looking for, save the
  destination register
* if the destination register is not rcx, look for a 64-bit mov into rcx
  from <reg> (where <reg> is the register from the lea) before the call
  to RtlEnterCriticalSection

This won't catch cases where they shuffle it between multiple registers,
or otherwise obfuscate the load into rcx (push/pop, xchg, using some
memory location, ...) but I think this covers every case I've seen
(including those mentioned in comments about preview builds).  It would
also allow us to skip the theoretical-but-legal sequence (Intel syntax)

    lea rXX, [rip+XXXX] ; FastPebLock
    ...
    call UnrelatedFunction
    mov rcx, rXX
    call RtlEnterCriticalSection
    mov rYY, QWORD PTR [rip+YYYY] ; RtlpCurDirRef
    test rYY, rYY
    ...

I'll try to find some time to test this latest round on as many released
Windows versions >= 8.1 as I can, and then send a v3 series.  It works on
22631 at least.
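
For illustration only, the two checks in the list above could be sketched
roughly as follows.  This is not the code from the patch series: the helper
names (lea_rip_dest_reg, is_mov_rcx_from, lea_reaches_rcx) are hypothetical,
and the sketch assumes the caller has already located both the lea and a
candidate 3-byte instruction ahead of the call to RtlEnterCriticalSection.
It also only accepts the two register-to-register encodings of a 64-bit mov
(opcodes 89 /r and 8B /r).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { REG_RCX = 1 };  /* hardware register number of rcx */

/* If p holds `lea r64, [rip+disp32]` (REX.W 8D /r, mod=00, rm=101),
   return the destination register number (0..15), otherwise -1. */
static int lea_rip_dest_reg(const uint8_t *p, size_t len)
{
  if (len < 7)
    return -1;
  if ((p[0] & 0xFB) != 0x48)      /* REX.W, with or without REX.R */
    return -1;
  if (p[1] != 0x8D)               /* lea opcode */
    return -1;
  if ((p[2] & 0xC7) != 0x05)      /* ModRM: mod=00, rm=101 -> RIP-relative */
    return -1;
  return ((p[0] & 0x04) << 1) | ((p[2] >> 3) & 7);
}

/* Nonzero if p holds a 64-bit `mov rcx, <src>` in either encoding. */
static int is_mov_rcx_from(const uint8_t *p, size_t len, int src)
{
  assert(src >= 0 && src < 16);
  if (len < 3)
    return 0;
  int hi = src >> 3, lo = src & 7;
  /* MOV r/m64, r64: source lives in ModRM.reg, extended by REX.R */
  if (p[0] == (0x48 | (hi << 2)) && p[1] == 0x89
      && p[2] == (0xC0 | (lo << 3) | REG_RCX))
    return 1;
  /* MOV r64, r/m64: source lives in ModRM.rm, extended by REX.B */
  if (p[0] == (0x48 | hi) && p[1] == 0x8B
      && p[2] == (0xC0 | (REG_RCX << 3) | lo))
    return 1;
  return 0;
}

/* Accept the lea if it targets rcx directly, or if the candidate mov
   (may be NULL) copies the lea's destination register into rcx. */
static int lea_reaches_rcx(const uint8_t *lea, size_t lea_len,
                           const uint8_t *mov, size_t mov_len)
{
  int reg = lea_rip_dest_reg(lea, lea_len);
  if (reg < 0)
    return 0;
  if (reg == REG_RCX)
    return 1;
  return mov != NULL && is_mov_rcx_from(mov, mov_len, reg);
}
```

As with the description above, this does not notice rcx being clobbered
between the lea and the call, nor any of the multi-register shuffles.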