On Tue, Sep 18, 2018 at 12:22:14PM +0200, Ilya Leoshkevich wrote:
> > On 17.09.2018 at 19:11, Segher Boessenkool
> > <seg...@kernel.crashing.org> wrote:
> >
> > On Mon, Sep 17, 2018 at 10:50:58AM +0200, Ilya Leoshkevich wrote:
> >>> On 14.09.2018 at 23:35, Segher Boessenkool
> >>> <seg...@kernel.crashing.org> wrote:
> >>> Could you please show generated code before and after this patch?
> >>> I mean generated assembler code.  What -S gives you.
> >>
> >> Before:
> >>
> >> foo4:
> >> .LFB0:
> >>         .cfi_startproc
> >>         lt      %r1,0(%r2)
> >>         jne     .L2
> >>         lhi     %r3,1
> >>         cs      %r1,%r3,0(%r2)
> >> .L2:
> >>         jne     .L5
> >>         br      %r14
> >> .L5:
> >>         jg      bar
> >>
> >> After:
> >>
> >> foo4:
> >> .LFB0:
> >>         .cfi_startproc
> >>         lt      %r1,0(%r2)
> >>         jne     .L4
> >>         lhi     %r3,1
> >>         cs      %r1,%r3,0(%r2)
> >>         jne     .L4
> >>         br      %r14
> >> .L4:
> >>         jg      bar
> >
> > Ah.  And a compiler of some weeks old gives
> >
> > foo4:
> > .LFB0:
> >         .cfi_startproc
> >         lhi     %r3,0
> >         lhi     %r4,1
> >         cs      %r3,%r4,0(%r2)
> >         jne     .L4
> >         br      %r14
> > .L4:
> >         jg      bar
> >
> > so this is all caused by the recent optimisation that avoids the "cs" if
> > it can.
>
> Could you please try building with -march=z13?  I don't see the "lt"
> instruction in your output.  On z196+ we try to speed up the code by
> jumping around the "cs" when possible.

This was a few weeks old source (because it didn't bootstrap elsewhere).

This is s390-linux-gcc -Wall -W -march=z196 -O2 as in the PR (and the
compiler has largely default options).  -march=z13 has identical output
on that compiler.


Segher
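
For reference, the "jump around the cs" idea discussed above can be sketched in C roughly like this.  It is only an illustration: try_set_flag is a hypothetical name, and it is an assumption that foo4 from the PR has a similar shape (the test case itself is not shown in this thread).  The plain load plays the role of "lt", and the compare-exchange plays the role of "cs".

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical helper, not the PR's test case: try to change *p from 0
   to 1 and report whether that happened.  */
static bool try_set_flag(atomic_int *p)
{
    /* Cheap load-and-test first.  If the value is already nonzero, a
       compare-and-swap expecting 0 would fail anyway, so skip it.  */
    if (atomic_load_explicit(p, memory_order_relaxed) != 0)
        return false;

    /* The compare-and-swap proper; it can still fail if another thread
       changed *p between the load above and this point.  */
    int expected = 0;
    return atomic_compare_exchange_strong(p, &expected, 1);
}
```

A caller would then branch on the result, much like the "jne" to the "jg bar" path in the listings above.  Whether a given compiler emits the early test for such source depends on its version and the -march setting.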