On Thu, 27 Mar 2025, Corinna Vinschen wrote: > Hi Jeremy, > > ok, three questions... > > Q1: How does it compare performancewise to the old code? Fortunately > it's only called once per process tree, so this shoudn't have much > impact, but still nice to know...
How would you go about finding that out? > Q2: Would you mind to add more comments? > > Q2a: A preceeding one line comment briefly explaining the function? > > > +static inline const void * > > +rip_rel_offset (const ud_t *ud_obj, const ud_operand_t *opr, int sub_off=0) rip_rel_offset helper function? OK... > Q2b: Comment like "Initializing" blah? > > > + ud_t ud_obj; > > + ud_init (&ud_obj); > > + ud_set_mode (&ud_obj, 64); > > + ud_set_input_buffer (&ud_obj, get_dir, 80); > > + ud_set_pc (&ud_obj, (const uint64_t) get_dir); > > - On Pre-Windows 8 we basically look for the RtlEnterCriticalSection > > call. > > - Windows 8 does not call RtlEnterCriticalSection. The code manipulates > > - the FastPebLock manually, probably because RtlEnterCriticalSection has > > - been converted to an inline function. Either way, we test if the code > > - uses the FastPebLock. */ > > + On Pre- (or Post-) Windows 8 we basically look for the > > Q3: or post? Really? AFAIK, this was only an issue on pre W8, so it > affects Vista and W7 only. Theoretically this check can go away, > unless you have proof this is still an issue in some later Windows > starting with 8.1. I haven't confirmed what pre-8 does, but 8.0 appears to inline RtlEnterCriticalSection while 8.1 and later call it. Based on the comment, it seems 8.0 is the odd-version-out here. > > + RtlEnterCriticalSection call. Windows 8 does not call > > + RtlEnterCriticalSection. The code manipulates the FastPebLock > > manually, > > + probably because RtlEnterCriticalSection has been converted to an > > inline > > + function. Either way, we test if the code uses the FastPebLock. */ > Q2c: Roundabout here, I'm getting the impression we're losing > more comments than we gain. This is not a good way to raise > confidence ;) > > Codewise I have nothing to carp at, but comments are a bit sparse > for my taste... I thought the only comments I lost were related to each insider version workaround that was piled on top that are no longer necessary. BTW, something I would *like* to do but haven't figured out how to accomplish cleanly yet is to follow the registers. What I mean by this is illustrated by what I did in the aarch64 version: I could find the call to RtlEnterCrticalSection, then work backwards, find the add whose Rd was x0 (the register for the first (pointer) parameter in the calling convention), then find the adrp whose Rd was the Rn of the add. What I would do on x86_64 is find the call to RtlEnterCriticalSection, find any mov rcx, <reg> before, then find the lea <reg>, [rip+XXX] (where reg would be rcx if there wasn't a mov rcx after the lea). Unfortunately, the variable length-ness doesn't lend itself to iterating backwards, so I am not confirming that the lea actually ends up in rcx for the function call. The only register correlation I do is that the register used in the mov <reg>, QWORD PTR [rip+XXX] is then used in the next instruction that must be test <reg>, <reg>. The old code required that <reg> to be rbx, but I don't see any reason that rbx is required...