On 05/06/2025 11:25 am, Jan Beulich wrote:
> --- a/xen/arch/x86/alternative.c
> +++ b/xen/arch/x86/alternative.c
> @@ -195,12 +195,16 @@ void *place_ret(void *ptr)
>   * executing.
>   *
>   * "noinline" to cause control flow change and thus invalidate I$ and
> - * cause refetch after modification.
> + * cause refetch after modification.  While the SDM continues to suggest this
> + * is sufficient, it may not be - issue a serializing insn afterwards as well,
> + * unless this is for live-patching.
>   */
>  static void init_or_livepatch noinline
>  text_poke(void *addr, const void *opcode, size_t len)
>  {
>      memcpy(addr, opcode, len);
> +    if ( system_state < SYS_STATE_active )
> +        asm volatile ( "mov %%rax, %%cr2" ::: "memory" );
>  }

This hunk wants pulling out separately.  It's not really related to
memcpy(), and probably ought to be backported.

Architecturally, both the APM and SDM say you're ok doing nothing on
64bit capable CPUs.

However, there are errata, and at least one recent AMD CPU needs
serialisation for self modifying code.  (Not sure if the rev guide has
been updated yet, and I can't remember offhand which CPU it is.)

You should also discuss the choice of serialising instruction in a
comment.  Mov to %cr2 is serialising on everything more modern than the
486, and least likely to be intercepted under virt (== most performant).

You also need to explain that we only do true SMC during boot.  After
boot, in the livepatch case, the write happens before addr goes live.
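
Something along these lines, perhaps (wording mine, to be adjusted):

    /*
     * Both the APM and SDM say a change of control flow suffices for
     * self-modifying code on 64bit-capable CPUs, but errata exist, so
     * issue a serialising instruction as well.  Mov to %cr2 is
     * serialising on everything more modern than the 486, and is the
     * serialising instruction least likely to be intercepted under
     * virt, hence the most performant choice.
     *
     * True SMC only happens during boot.  After boot, livepatching
     * writes to addr before it goes live.
     */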

Finally, you're loading rubble into %cr2.  It's not even reliably
'addr', because of the transformations the compiler is permitted to
make on memcpy().  I really think you should be feeding in a reliable
0.  We're doing self-modifying code here with serialisation; an extra
xor won't be measurable.
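
As a sketch (untested, keeping your system_state check):

    if ( system_state < SYS_STATE_active )
        asm volatile ( "mov %0, %%cr2" :: "r" (0UL) : "memory" );

With a constant 0 as the input operand, the compiler is free to
materialise it with an xor, so there's nothing extra worth hand-coding.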

~Andrew

P.S. We should drop the noinline.  That's only applicable to the P5(?)
and earlier where you had to use branches to flush the prefetch queue. 
It's irrelevant on 64bit capable CPUs.
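
i.e. just:

    static void init_or_livepatch
    text_poke(void *addr, const void *opcode, size_t len)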
