https://sourceware.org/bugzilla/show_bug.cgi?id=34343

            Bug ID: 34343
           Summary: "optimize XCHG to MOV" breaks valgrind
           Product: binutils
           Version: 2.47 (HEAD)
            Status: NEW
          Severity: normal
          Priority: P2
         Component: gas
          Assignee: unassigned at sourceware dot org
          Reporter: hjl.tools at gmail dot com
  Target Milestone: ---
            Target: x86

commit 1c3c3e4b3c2ac2eed9abcbce0b9cba1be10ed3f0
Author: Jan Beulich <[email protected]>
Date:   Fri Jun 19 09:47:21 2026 +0200

    x86: optimize XCHG to MOV for same-register forms

breaks valgrind:

https://bugs.kde.org/show_bug.cgi?id=522533

According to Mark Wielaard <[email protected]>:

This comment from VEX/priv/guest_amd64_toIR.c might explain it best:

/* "Special" instructions.                                           

   This instruction decoder can decode three special instructions     
   which mean nothing natively (are no-ops as far as regs/mem are     
   concerned) but have meaning for supporting Valgrind.  A special   
   instruction is flagged by the 16-byte preamble 48C1C703 48C1C70D   
   48C1C73D 48C1C733 (in the standard interpretation, that means: rolq
   $3, %rdi; rolq $13, %rdi; rolq $61, %rdi; rolq $51, %rdi).         
   Following that, one of the following 3 are allowed (standard       
   interpretation in parentheses):                                   

      4887DB (xchgq %rbx,%rbx)   %RDX = client_request ( %RAX )       
      4887C9 (xchgq %rcx,%rcx)   %RAX = guest_NRADDR                 
      4887D2 (xchgq %rdx,%rdx)   call-noredir *%RAX                   
      4887F6 (xchgq %rdi,%rdi)   IR injection                         

   Any other bytes following the 16-byte preamble are illegal and     
   constitute a failure in instruction decoding.  This all assumes   
   that the preamble will never occur except in specific code         
   fragments designed for Valgrind to catch.                         

   No prefixes may precede a "Special" instruction.                   
*/                                                                   
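
To make the "no-ops as far as regs/mem are concerned" part concrete:
the four rotate counts add up to a multiple of the register width
(3 + 13 + 61 + 51 = 128 for the 64-bit form above, 3 + 13 + 29 + 19 = 64
for the 32-bit roll form quoted further down), so the register ends up
with its original value. A small C sketch of that arithmetic follows.
The helper names (rol32, rol64, preamble32, preamble64) are made up for
illustration and are not part of Valgrind:

```c
#include <assert.h>
#include <stdint.h>

/* Rotate left, written so that a count of 0 never shifts by the full width. */
static uint32_t rol32(uint32_t v, unsigned n)
{
    n &= 31;
    return n ? (v << n) | (v >> (32 - n)) : v;
}

static uint64_t rol64(uint64_t v, unsigned n)
{
    n &= 63;
    return n ? (v << n) | (v >> (64 - n)) : v;
}

/* Hypothetical model of the 32-bit preamble: roll $3, $13, $29, $19. */
uint32_t preamble32(uint32_t edi)
{
    edi = rol32(edi, 3);
    edi = rol32(edi, 13);
    edi = rol32(edi, 29);
    edi = rol32(edi, 19);
    return edi; /* total rotation is 64 bits, i.e. two full turns */
}

/* Hypothetical model of the 64-bit preamble: rolq $3, $13, $61, $51. */
uint64_t preamble64(uint64_t rdi)
{
    rdi = rol64(rdi, 3);
    rdi = rol64(rdi, 13);
    rdi = rol64(rdi, 61);
    rdi = rol64(rdi, 51);
    return rdi; /* total rotation is 128 bits, i.e. two full turns */
}
```

This only models the register value. It says nothing about how VEX
recognizes the byte sequence.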

Which is "implemented" in valgrind.h (which applications include to
insert these special instructions into their executable to support
various valgrind "client requests") as James explained:

> We die in vg_preloaded.c:129 which is:
>
> void * VG_NOTIFY_ON_LOAD(ifunc_wrapper) (void)
> ...
>     /* Call the original indirect function and get it's result */
>     VALGRIND_GET_ORIG_FN(fn); /* <-- */
>     CALL_FN_W_v(result, fn);
>
> which is VALGRIND_GET_NR_CONTEXT:
>
> #define VALGRIND_GET_NR_CONTEXT(_zzq_rlval)                       \
>   { volatile OrigFn* _zzq_orig = &(_zzq_rlval);                   \
>     volatile unsigned int __addr;                                 \
>     __asm__ volatile(__SPECIAL_INSTRUCTION_PREAMBLE               \
>                      /* %EAX = guest_NRADDR */                    \
>                      "xchgl %%ecx,%%ecx"                          \
>                      : "=a" (__addr)                              \
>                      :                                            \
>                      : "cc", "memory"                             \
>                     );                                            \
>     _zzq_orig->nraddr = __addr;                                   \
>   }
>
> and the rols are the _S_I_P macro:
>
> #define __SPECIAL_INSTRUCTION_PREAMBLE                            \
>                      "roll $3,  %%edi ; roll $13, %%edi\n\t"      \
>                      "roll $29, %%edi ; roll $19, %%edi\n\t"
>
> but what I don't understand is why it ends up seeing that. I think when
> it sees _S_I_P, it is supposed to rewrite it (?), but I am not an expert on
> valgrind's VEX interpreter at all.

It is supposed to see _S_I_P followed by one of the special xchg
instructions, but now sees _S_I_P followed by a mov, which confuses the
instruction decoder.
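
As a rough illustration only (this is not the actual VEX code), the
check described in the guest_amd64_toIR.c comment amounts to something
like the sketch below. The byte values are taken from that comment. The
names special_kind, special_preamble and classify_special are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type, not a Valgrind definition. */
enum special_kind {
    SPECIAL_NONE,           /* preamble not present */
    SPECIAL_CLIENT_REQUEST, /* 4887DB xchgq %rbx,%rbx */
    SPECIAL_NRADDR,         /* 4887C9 xchgq %rcx,%rcx */
    SPECIAL_CALL_NOREDIR,   /* 4887D2 xchgq %rdx,%rdx */
    SPECIAL_IR_INJECTION,   /* 4887F6 xchgq %rdi,%rdi */
    SPECIAL_ILLEGAL         /* preamble followed by anything else */
};

/* rolq $3, %rdi; rolq $13, %rdi; rolq $61, %rdi; rolq $51, %rdi */
static const unsigned char special_preamble[16] = {
    0x48, 0xC1, 0xC7, 0x03, 0x48, 0xC1, 0xC7, 0x0D,
    0x48, 0xC1, 0xC7, 0x3D, 0x48, 0xC1, 0xC7, 0x33
};

enum special_kind classify_special(const unsigned char *code, size_t len)
{
    const unsigned char *p;

    assert(code != NULL);
    if (len < sizeof special_preamble ||
        memcmp(code, special_preamble, sizeof special_preamble) != 0)
        return SPECIAL_NONE;

    /* After the preamble, exactly one of four 3-byte xchgq forms is legal. */
    if (len < sizeof special_preamble + 3)
        return SPECIAL_ILLEGAL;
    p = code + sizeof special_preamble;
    if (p[0] != 0x48 || p[1] != 0x87)
        return SPECIAL_ILLEGAL; /* e.g. opcode 0x89 (mov) lands here */

    switch (p[2]) {
    case 0xDB: return SPECIAL_CLIENT_REQUEST;
    case 0xC9: return SPECIAL_NRADDR;
    case 0xD2: return SPECIAL_CALL_NOREDIR;
    case 0xF6: return SPECIAL_IR_INJECTION;
    default:   return SPECIAL_ILLEGAL;
    }
}
```

With the preamble followed by 48 87 C9 this returns SPECIAL_NRADDR. If
the assembler replaces the xchg with a mov, the bytes after the preamble
no longer match and the result is SPECIAL_ILLEGAL. The exact bytes gas
emits after the optimization are not shown in this report, so a mov
encoding such as 48 89 C9 (movq %rcx,%rcx) is only a stand-in here.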

> > And why is -O passed to gas there, when specific
> > insn selection matters?
>
> That's just because it's one of the things I test in some runs. I'll
> filter it out for Valgrind as I agree it makes no sense there, but
> another problem happens when Valgrind itself is built without it, but
> e.g. systemd has -Wa,-O2:

Yeah, we don't control how the object files that include the inline
assembly from valgrind.h are compiled.

Also note that valgrind.h is often vendored into other code bases
because it is meant to be useful standalone. Which means we cannot
change the special instruction sequence or the inline assembly used to
generate it.

So we need a solution that prevents this particular xchg to mov
translation (at least for same register ones) even if the sequence is
compiled with gas optimizations.
