On Wed, Oct 22, 2014 at 5:28 PM, Jan Beulich <jbeul...@suse.com> wrote:
> I noticed the issue with 4.9.1 (in that x86 Linux's
> this_cpu_read_stable() no longer does what the comment preceding
> its definition promises), and the example below demonstrates this in
> a simplified (but contrived) way. I just now verified that trunk has
> the same issue; 4.8.3 still folds redundant ones as expected. Is this
> known, or possibly even intended (in which case I'd be curious as to
> what the reasons are, and how the functionality Linux wants can be
> gained back)?

For 4.8 the CSE happened at the RTL level.  On the GIMPLE level
we inline too early to CSE the calls based on the fact that the
functions are pure.  Note that dummy() may change m and p, so what
4.8 did was bogus:

#APP
# 7 "t.c" 1
        nop m(%rip)
# 0 "" 2
#NO_APP
        movl    %edi, %esi
#APP
# 14 "t.c" 1
        nop p
# 0 "" 2
#NO_APP
        call    dummy
        movl    %ebx, %esi
        movl    %ebx, %edi
        call    dummy

I suppose the fix for that also broke the CSE.
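
To illustrate why reusing the first read across the call is not valid, here
is a minimal sketch.  It is not taken from the testcase: the definition of
dummy(), the initial value of m, the mov instruction used in place of the
nop, and the names first_read, second_read and run_reads are all
assumptions made for illustration.

```c
#include <assert.h>

int m = 1;                      /* stands in for the thread's `extern int m` */
int first_read, second_read;    /* hypothetical: values seen around the call */

/* Hypothetical dummy(): like any external function, it may write to m. */
void dummy(int a, int b)
{
        (void)a;
        (void)b;
        m += 10;
}

static inline int read_m(void)
{
        int i;
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
        /* Non-volatile asm whose result depends on the contents of m. */
        __asm__("movl %1, %0" : "=r" (i) : "m" (m));
#else
        i = m;                  /* portable fallback for other targets */
#endif
        return i;
}

/* Reads m, calls dummy(), then reads m again. */
void run_reads(void)
{
        first_read = read_m();
        dummy(first_read, 0);
        second_read = read_m();
        /* Holds only if the second asm is not folded into the first. */
        assert(second_read == first_read + 10);
}
```

Calling run_reads() once from main() should pass the assertion as long as
the compiler re-evaluates the asm after the call, since the "m" input
tells it the result depends on memory that dummy() may have written.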

Richard.

> Thanks, Jan
>
> void dummy(int, int);
> extern int m, p;
>
> static inline int read_m(void) {
>         int i;
>
>         asm("nop %1" : "=r" (i) : "m" (m));
>         return i;
> }
>
> static inline int read_p(void) {
>         int i;
>
>         asm("nop %P1" : "=r" (i) : "p" (&p));
>         return i;
> }
>
> void test(void) {
>         dummy(read_m(), read_m());
>         dummy(read_p(), read_p());
>         dummy(read_m(), read_m());
>         dummy(read_p(), read_p());
> }
>
>
