Re: Worse code after bbro?

2017-01-04 Thread Jeff Law
On 01/04/2017 03:46 AM, Segher Boessenkool wrote: On Wed, Jan 04, 2017 at 10:05:49AM +0100, Richard Biener wrote: The code size is identical, but the trunk version executes one more instruction every time the loop runs (explicit jump to .L5 with trunk vs fallthrough with 4.8) - it's faster only i

Re: GCC libatomic ABI specification draft

2017-01-04 Thread Szabolcs Nagy
On 22/12/16 17:37, Segher Boessenkool wrote: > We do not always have all atomic instructions. Not all processors have > all, and it depends on the compiler flags used which are used. How would > libatomic know what compiler flags are used to compile the program it is > linked to? > > Sounds like

Re: Worse code after bbro?

2017-01-04 Thread Segher Boessenkool
On Wed, Jan 04, 2017 at 10:05:49AM +0100, Richard Biener wrote: > > The code size is identical, but the trunk version executes one more > > instruction every time the loop runs (explicit jump to .L5 with trunk vs > > fallthrough with 4.8) - it's faster only if the loop never runs. This > > happens i

Re: LTO remapping/deduction of machine modes of types/decls

2017-01-04 Thread Richard Biener
On Mon, 2 Jan 2017, Jakub Jelinek wrote: > On Mon, Jan 02, 2017 at 09:49:55PM +0300, Alexander Monakov wrote: > > On Mon, 2 Jan 2017, Jakub Jelinek wrote: > > > If the host has long double the same as double, sure, PTX can use its > > > native > > > DFmode even for long double. But otherwise, th

Re: LTO remapping/deduction of machine modes of types/decls

2017-01-04 Thread Richard Biener
On Mon, 2 Jan 2017, Jakub Jelinek wrote: > On Fri, Dec 30, 2016 at 08:40:11PM +0300, Alexander Monakov wrote: > > Hello, Richard, Jakub, community, > > > > May I join/restart the old discussion about machine mode remapping at LTO > > stream-in time. To recap, when offloading to NVPTX was introdu

Re: Worse code after bbro?

2017-01-04 Thread Richard Biener
On Wed, 21 Dec 2016, Senthil Kumar Selvaraj wrote:
> Hi,
>
> For this C code (slightly modified from PR 30908)
>
> void wait(int i)
> {
>   while (i-- > 0)
>     asm volatile("nop" ::: "memory");
> }
>
> gcc 4.8 at -Os produces
>
>   jmp .L2
> .L3:
>   nop