On Mar 27 17:52, Jeremy Drake via Cygwin-patches wrote:
> On Thu, 27 Mar 2025, Corinna Vinschen wrote:
>
> > On Mar 27 10:26, Jeremy Drake via Cygwin-patches wrote:
> > > comment, it seems 8.0 is the odd-version-out here.
> >
> > Yeah, but we don't support 8.0 anymore, only 8.1.
> >
> > > BTW, something I would *like* to do but haven't figured out how to
> > > accomplish cleanly yet is to follow the registers.  What I mean by this is
> > > illustrated by what I did in the aarch64 version: I could find the call to
> > > RtlEnterCriticalSection, then work backwards, find the add whose Rd was x0
> > > (the register for the first (pointer) parameter in the calling
> > > convention), then find the adrp whose Rd was the Rn of the add.  What I
> > > would do on x86_64 is find the call to RtlEnterCriticalSection, find any
> > > mov rcx, <reg> before, then find the lea <reg>, [rip+XXX] (where reg would
> > > be rcx if there wasn't a mov rcx after the lea).  Unfortunately, the
> > > variable length-ness doesn't lend itself to iterating backwards, so I am
> > > not confirming that the lea actually ends up in rcx for the function call.
> > > The only register correlation I do is that the register used in the
> > > mov <reg>, QWORD PTR [rip+XXX] is then used in the next instruction that
> > > must be test <reg>, <reg>.  The old code required that <reg> to be rbx,
> > > but I don't see any reason that rbx is required...
> >
> > Yeah, reading x86_64 backwards will lead to confusion.  And no, rbx
> > isn't required, any non-volatile register could do it.  It seems that
> > rbx is used because of the way vc++ allocates registers.
>
> After taking out the windows 8.0 case, I think this should be doable:
> * when finding the lea that we're already looking for, save the
>   destination register
>
> * if the destination register is not rcx, look for a 64-bit mov into rcx
>   from <reg> (where <reg> is the register from the lea) before the call to
>   RtlEnterCriticalSection
>
> This won't catch cases where they shuffle it between multiple registers,
> or otherwise obfuscate the load into rcx (push/pop, xchg, using some memory
> location, ...) but I think this covers every case I've seen (including
> those mentioned in comments about preview builds).  It would also allow us
> to skip the theoretical-but-legal sequence (intel)
>
>   lea rXX, [rip+XXXX] ; FastPebLock
>   ...
>   call UnrelatedFunction
>   mov rcx, rXX
>   call RtlEnterCriticalSection
>   mov rYY, QWORD PTR [rip+YYYY] ; RtlpCurDirRef
>   test rYY, rYY
>   ...
>
> I'll try to find some time to test this latest round on as many released
> Windows versions >= 8.1 as I can, and then send a v3 series.  It works on
> 22631 at least.
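
For illustration only, the two quoted steps could be sketched roughly as
below.  This is Python rather than Cygwin's C++, it pattern-matches raw
bytes instead of really decoding instructions (so it can false-match inside
other instructions), and the names find_rip_lea, mov_rcx_encodings and
lock_address_before_call are made up for this sketch, not taken from the
patch.  It also assumes the offset of the call to RtlEnterCriticalSection
has already been found.

```python
import struct

RCX = 1  # register number of rcx in x86-64 instruction encodings


def find_rip_lea(code, start, end):
    """Yield (offset, dest_reg, disp32) for every byte pattern in
    code[start:end] that looks like `lea r64, [rip+disp32]` (7 bytes)."""
    for off in range(start, end - 6):
        rex, opcode, modrm = code[off], code[off + 1], code[off + 2]
        # REX.W set, REX.R optional (selects r8..r15), REX.X/REX.B clear.
        if rex & 0xFB != 0x48 or opcode != 0x8D:
            continue
        # mod == 00 and rm == 101 selects RIP-relative addressing.
        if modrm & 0xC7 != 0x05:
            continue
        reg = ((rex & 0x04) << 1) | ((modrm >> 3) & 7)
        (disp,) = struct.unpack_from("<i", code, off + 3)
        yield off, reg, disp


def mov_rcx_encodings(src):
    """Return both 3-byte encodings of the 64-bit `mov rcx, <src>`."""
    hi, lo = src >> 3, src & 7
    return (
        bytes([0x48 | (hi << 2), 0x89, 0xC0 | (lo << 3) | RCX]),  # mov r/m64, r64
        bytes([0x48 | hi, 0x8B, 0xC0 | (RCX << 3) | lo]),         # mov r64, r/m64
    )


def lock_address_before_call(code, call_off):
    """Return the buffer-relative address that the last suitable lea before
    call_off computes, or None if no lea can be tied to rcx.

    A lea is suitable if it targets rcx directly, or if a `mov rcx, <reg>`
    from its destination register appears between the lea and the call.
    """
    result = None
    for off, reg, disp in find_rip_lea(code, 0, call_off):
        if reg != RCX:
            window = code[off + 7:call_off]
            if not any(enc in window for enc in mov_rcx_encodings(reg)):
                continue
        # RIP-relative: address of the next instruction plus disp32.
        result = off + 7 + disp
    return result
```

With a buffer holding lea rbx, [rip+XXXX]; call; mov rcx, rbx; call, the
sketch accepts the lea because the mov into rcx sits between it and the
second call.  Like the quoted proposal, it does not follow multi-register
shuffles, push/pop, xchg, or moves through memory.
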

This sounds great, but don't put too much effort into past preview
releases.  We try hard to make sure Cygwin runs on released versions of
Windows.  But preview versions of the past are a thing of the past.

Not only that, but you're putting a lot of effort into versions sometimes
used by a single machine.  It might further simplify the code if you don't
handle these old temporary versions anymore and concentrate on past
releases.

Btw., wouldn't you like to join our Libera IRC channel #cygwin-developers?
https://cygwin.com/irc.html


Thanks,
Corinna