Re: Reorder/combine insns on superscalar arch

2016-01-15 Thread Richard Henderson
On 01/15/2016 06:06 AM, Bernd Schmidt wrote:
> On 01/15/2016 07:05 AM, Jeff Law wrote:
>
>> Well, you have to write the pattern and a splitter. But these days
>> there's define_insn_and_split to help with that. Reusing Bernd's work
>> may ultimately be easier though.
>
> Maybe, but maybe also n

Re: Reorder/combine insns on superscalar arch

2016-01-15 Thread Bernd Schmidt
On 01/15/2016 07:05 AM, Jeff Law wrote:
> Well, you have to write the pattern and a splitter. But these days
> there's define_insn_and_split to help with that. Reusing Bernd's work
> may ultimately be easier though.

Maybe, but maybe also not in the way you think. I've always wanted the ability to

Re: Reorder/combine insns on superscalar arch

2016-01-14 Thread Jeff Law
On 01/14/2016 10:45 PM, Igor Shevlyakov wrote:
> Thanks Jeff, I really hoped that I missed something and there was a better answer.

Nope, not really. Though thinking about it, you might want to look into Bernd's work from 2012 in the Haifa scheduler -- it's got some intelligence for dependency br

Re: Reorder/combine insns on superscalar arch

2016-01-14 Thread Igor Shevlyakov
Thanks Jeff, I really hoped that I missed something and there was a better answer. But would it do any harm if the combiner tried to check every piece of a parallel like that and, if every component is matchable and the total cost is not worse, emitted them separately? It will change nothing for single issu
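
The check proposed in this message can be sketched as a toy model in C. This is not GCC code: `struct piece`, `emit_pieces_separately`, and `baseline_cost` are hypothetical names, and the message does not say what the summed cost is compared against, so the baseline here is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one element of a parallel. */
struct piece {
    bool matchable;  /* would this piece match an insn pattern on its own? */
    int cost;        /* assumed cost of emitting it as a separate insn */
};

/* Returns true when every piece is matchable and the summed cost of the
   pieces is not worse than baseline_cost (an assumed reference cost). */
bool emit_pieces_separately(const struct piece *pieces, size_t n,
                            int baseline_cost)
{
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        if (!pieces[i].matchable)
            return false;        /* one unmatchable piece rejects the split */
        total += pieces[i].cost;
    }
    return n > 0 && total <= baseline_cost;
}
```

In this model a single unmatchable component, or a summed cost above the baseline, leaves the insns as they were.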

Re: Reorder/combine insns on superscalar arch

2016-01-14 Thread Jeff Law
On 01/14/2016 04:47 PM, Igor Shevlyakov wrote:
> Guys, I'm trying to make the compiler generate better code on a
> superscalar in-order machine but can't find the right way to do it.
> Imagine the following code:
>
> long f(long* p, long a, long b)
> {
>   long a1 = a << 2;
>   long a2 = a1 + b;
>   return p[a1

Reorder/combine insns on superscalar arch

2016-01-14 Thread Igor Shevlyakov
Guys, I'm trying to make the compiler generate better code on a superscalar in-order machine but can't find the right way to do it. Imagine the following code:

long f(long* p, long a, long b)
{
  long a1 = a << 2;
  long a2 = a1 + b;
  return p[a1] + p[a2];
}

By default the compiler generates somethin
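
For reference, here is the function from the original post laid out as a compilable C sketch. The function itself is as posted. The `demo` helper and its table contents are assumptions added only to exercise it. Note that the index of the second load (`a2`) is derived from the first (`a1`), so its address depends on the shift followed by the add.

```c
/* The function from the original post, unchanged in behavior. */
long f(long *p, long a, long b)
{
    long a1 = a << 2;        /* first index: a * 4 */
    long a2 = a1 + b;        /* second index, computed from a1 */
    return p[a1] + p[a2];    /* two loads; the second address depends on a2 */
}

/* Hypothetical helper (not from the thread): builds a table with
   table[i] = 10 * i and evaluates f on it. Callers must keep both
   indices (4*a and 4*a + b) inside 0..15. */
long demo(long a, long b)
{
    long table[16];
    for (long i = 0; i < 16; i++)
        table[i] = 10 * i;
    return f(table, a, b);
}
```

With this table, `demo(1, 2)` reads `table[4]` and `table[6]` and returns 100.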