On 03/07/2025 3:07 pm, Jan Beulich wrote:
> On 03.07.2025 13:59, Andrew Cooper wrote:
>> On 03/07/2025 10:01 am, Jan Beulich wrote:
>>> On 02.07.2025 16:41, Andrew Cooper wrote:
>>>> @@ -452,6 +466,10 @@ void mwait_idle_with_hints(unsigned int eax, unsigned 
>>>> int ecx)
>>>>          mwait(eax, ecx);
>>>>          spec_ctrl_exit_idle(info);
>>>>      }
>>>> +
>>>> +    alternative_io("movb $0, %[in_mwait]",
>>>> +                   "", X86_BUG_MONITOR,
>>>> +                   [in_mwait] "=m" (stat->in_mwait));
>>> This doesn't strictly need to be an alternative, does it? We can save a
>>> memory write in the buggy case, but that likely makes at most a marginal
>>> difference.
>> It doesn't *need* to be an alternative.  It could be:
>>
>> if ( !cpu_bug_monitor )
>>     ACCESS_ONCE(stat->in_mwait) = true;
>>
>> but getting rid of the branch is an advantage too.
> That's the other alternative blob. What I mean that here it could simply
> be
>
>     ACCESS_ONCE(stat->in_mwait) = false;
>
> without any conditional.

It could.  Or it could be consistent with it's other half.

>
>>>> --- a/xen/arch/x86/include/asm/hardirq.h
>>>> +++ b/xen/arch/x86/include/asm/hardirq.h
>>>> @@ -5,7 +5,19 @@
>>>>  #include <xen/types.h>
>>>>  
>>>>  typedef struct {
>>>> -    unsigned int __softirq_pending;
>>>> +    /*
>>>> +     * The layout is important.  Any CPU can set bits in 
>>>> __softirq_pending,
>>>> +     * but in_mwait is a status bit owned by the CPU.  softirq_mwait_raw 
>>>> must
>>>> +     * cover both, and must be in a single cacheline.
>>>> +     */
>>>> +    union {
>>>> +        struct {
>>>> +            unsigned int __softirq_pending;
>>>> +            bool in_mwait;
>>>> +        };
>>>> +        uint64_t softirq_mwait_raw;
>>>> +    };
>>> To guard against someone changing e.g. __softirq_pending to unsigned long
>>> while ignoring the comment, how about adding a BUILD_BUG_ON() checking both
>>> parts of the union are the same size (which of course would require naming
>>> either the union field within the containing struct or its struct member)?
>> That turns out to be very hard because of the definition of
>> softirq_pending() being common.  More below.
> Hmm, yes, I see. Then maybe
>
>     BUILD_BUG_ON(offsetof(irq_cpustat_t, in_mwait) +
>                    sizeof(irq_stat[0].in_mwait) >
>                  offsetof(irq_cpustat_t, softirq_mwait_raw) +
>                    sizeof(irq_stat[0].softirq_mwait_raw));
>
> (or something substantially similar using e.g. typeof())?
>
>>>> @@ -9,4 +11,36 @@
>>>>  #define HVM_DPCI_SOFTIRQ       (NR_COMMON_SOFTIRQS + 4)
>>>>  #define NR_ARCH_SOFTIRQS       5
>>>>  
>>>> +/*
>>>> + * Ensure softirq @nr is pending on @cpu.  Return true if an IPI can be
>>>> + * skipped, false if the IPI cannot be skipped.
>>>> + *
>>>> + * We use a CMPXCHG covering both __softirq_pending and in_mwait, in 
>>>> order to
>>>> + * set softirq @nr while also observing in_mwait in a race-free way.
>>>> + */
>>>> +static always_inline bool arch_pend_softirq(unsigned int nr, unsigned int 
>>>> cpu)
>>>> +{
>>>> +    uint64_t *ptr = &irq_stat[cpu].softirq_mwait_raw;
>>>> +    uint64_t old, new;
>>>> +    unsigned int softirq = 1U << nr;
>>>> +
>>>> +    old = ACCESS_ONCE(*ptr);
>>>> +
>>>> +    do {
>>>> +        if ( old & softirq )
>>>> +            /* Softirq already pending, nothing to do. */
>>>> +            return true;
>>>> +
>>>> +        new = old | softirq;
>>>> +
>>>> +    } while ( (old = cmpxchg(ptr, old, new)) != new );
>>> Don't you mean
>>>
>>>     } while ( (new = cmpxchg(ptr, old, new)) != old );
>>>
>>> here
>> No. I'm pretty sure I don't.
>>
>>> (with new latched ahead of the loop and old set from new inside it),
>>> like we have it elsewhere when we avoid the use of yet another variable,
>>> e.g. in inc_linear_{entries,uses}()?
>> Eww, no.  Having new and old backwards like that is borderline
>> obfucation, and is unnecessary cognitive complexity for the reader.
> Why backwards? You want to compare the (original) value read against the
> expected old value. The way you wrote it you'll do at least one extra
> loop iteration, as you wait for the fetched (original) value to equal
> "new".

Hmm, yes, that's not quite what I intended, but I'm also not happy
writing anything of the form "new = cmpxchg()".  It's plain wrong for
the semantics of cmpxchg.

~Andrew

Reply via email to