subject:"\[PR 80689\] Copy small aggregates element\-wise"

Re: [PR 80689] Copy small aggregates element-wise

2018-07-31 Thread Richard Biener

On Tue, Jul 24, 2018 at 3:47 PM Martin Jambor wrote: > > Hi, > > I'd like to propose again a new variant of a fix that I sent here in > November (https://gcc.gnu.org/ml/gcc-patches/2017-10/msg00881.html) that > avoids store-to-load forwarding stalls in the ImageMagick benchmark by > expanding copi

[PR 80689] Copy small aggregates element-wise

2018-07-24 Thread Martin Jambor

Hi, I'd like to propose again a new variant of a fix that I sent here in November (https://gcc.gnu.org/ml/gcc-patches/2017-10/msg00881.html) that avoids store-to-load forwarding stalls in the ImageMagick benchmark by expanding copies of very small simple aggregates element-wise rather than "by pie

Re: [RFC, PR 80689] Copy small aggregates element-wise

2017-11-24 Thread Richard Biener

On Fri, Nov 24, 2017 at 2:00 PM, Martin Jambor wrote: > On Fri, Nov 24 2017, Richard Biener wrote: >> On Fri, Nov 24, 2017 at 12:53 PM, Martin Jambor wrote: >>> Hi Richi, >>> >>> On Fri, Nov 24 2017, Richard Biener wrote: On Fri, Nov 24, 2017 at 11:57 AM, Richard Biener wrote: > On

Re: [RFC, PR 80689] Copy small aggregates element-wise

2017-11-24 Thread Martin Jambor

On Fri, Nov 24 2017, Richard Biener wrote: > On Fri, Nov 24, 2017 at 12:53 PM, Martin Jambor wrote: >> Hi Richi, >> >> On Fri, Nov 24 2017, Richard Biener wrote: >>> On Fri, Nov 24, 2017 at 11:57 AM, Richard Biener >>> wrote: On Fri, Nov 24, 2017 at 11:31 AM, Richard Biener >> >> .. >> >

Re: [RFC, PR 80689] Copy small aggregates element-wise

2017-11-24 Thread Richard Biener

On Fri, Nov 24, 2017 at 12:53 PM, Martin Jambor wrote: > Hi Richi, > > On Fri, Nov 24 2017, Richard Biener wrote: >> On Fri, Nov 24, 2017 at 11:57 AM, Richard Biener >> wrote: >>> On Fri, Nov 24, 2017 at 11:31 AM, Richard Biener > > .. > >> And yes, I've been worried about SRA as well here...

Re: [RFC, PR 80689] Copy small aggregates element-wise

2017-11-24 Thread Martin Jambor

Hi Richi, On Fri, Nov 24 2017, Richard Biener wrote: > On Fri, Nov 24, 2017 at 11:57 AM, Richard Biener > wrote: >> On Fri, Nov 24, 2017 at 11:31 AM, Richard Biener .. > And yes, I've been worried about SRA as well here... it _does_ > have some early outs when seeing VIEW_CONVERT_EXPR

Re: [RFC, PR 80689] Copy small aggregates element-wise

2017-11-24 Thread Richard Biener

On Fri, Nov 24, 2017 at 11:57 AM, Richard Biener wrote: > On Fri, Nov 24, 2017 at 11:31 AM, Richard Biener > wrote: >> On Thu, Nov 23, 2017 at 4:32 PM, Martin Jambor wrote: >>> Hi, >>> >>> On Mon, Nov 13 2017, Richard Biener wrote: The main concern here is that GIMPLE is not very well defin

Re: [RFC, PR 80689] Copy small aggregates element-wise

2017-11-24 Thread Richard Biener

On Fri, Nov 24, 2017 at 11:31 AM, Richard Biener wrote: > On Thu, Nov 23, 2017 at 4:32 PM, Martin Jambor wrote: >> Hi, >> >> On Mon, Nov 13 2017, Richard Biener wrote: >>> The main concern here is that GIMPLE is not very well defined for >>> aggregate copies and that gimple-fold.c happily optimiz

Re: [RFC, PR 80689] Copy small aggregates element-wise

2017-11-24 Thread Richard Biener

On Thu, Nov 23, 2017 at 4:32 PM, Martin Jambor wrote: > Hi, > > On Mon, Nov 13 2017, Richard Biener wrote: >> The main concern here is that GIMPLE is not very well defined for >> aggregate copies and that gimple-fold.c happily optimizes >> memcpy (&a, &b, sizeof (a)) into a = b; >> >> struct A { s

Re: [RFC, PR 80689] Copy small aggregates element-wise

2017-11-23 Thread Jakub Jelinek

On Thu, Nov 23, 2017 at 04:32:43PM +0100, Martin Jambor wrote: > > struct A { short s; long i; long j; }; > > struct A a, b; > > void foo () > > { > > struct A c; > > __builtin_memcpy (&c, &b, sizeof (struct A)); > > __builtin_memcpy (&a, &c, sizeof (struct A)); > > } > > int main() > > { > >

Re: [RFC, PR 80689] Copy small aggregates element-wise

2017-11-23 Thread Martin Jambor

Hi, On Mon, Nov 13 2017, Richard Biener wrote: > The main concern here is that GIMPLE is not very well defined for > aggregate copies and that gimple-fold.c happily optimizes > memcpy (&a, &b, sizeof (a)) into a = b; > > struct A { short s; long i; long j; }; > struct A a, b; > void foo () > { >

Re: [RFC, PR 80689] Copy small aggregates element-wise

2017-11-14 Thread Martin Jambor

Hi, I thought I sent the following email last Friday but found it in my drafts folder right now, so let me send it now so that anybody interested can see what the patch does on Haswell. I have only skimmed through new messages in the thread. I am now looking into something else right now but wil

Re: [RFC, PR 80689] Copy small aggregates element-wise

2017-11-13 Thread Eric Botcazou

> The chance here is, of course (find the PR, it exists...), that SRA then > decomposes the char[] copy bytewise... > > That said, memcpy folding is easy to fix. The question is of course > what the semantic of VIEW_CONVERTs is (SRA _does_ contain > bail-outs on those). Like if you have > > str

Re: [RFC, PR 80689] Copy small aggregates element-wise

2017-11-13 Thread Richard Biener

On November 13, 2017 3:20:16 PM GMT+01:00, Michael Matz wrote: >Hi, > >On Mon, 13 Nov 2017, Richard Biener wrote: > >> The chance here is, of course (find the PR, it exists...), that SRA >then >> decomposes the char[] copy bytewise... >> >> That said, memcpy folding is easy to fix. The question

Re: [RFC, PR 80689] Copy small aggregates element-wise

2017-11-13 Thread Michael Matz

Hi, On Mon, 13 Nov 2017, Richard Biener wrote: > The chance here is, of course (find the PR, it exists...), that SRA then > decomposes the char[] copy bytewise... > > That said, memcpy folding is easy to fix. The question is of course > what the semantic of VIEW_CONVERTs is (SRA _does_ contain

Re: [RFC, PR 80689] Copy small aggregates element-wise

2017-11-13 Thread Richard Biener

On Mon, Nov 13, 2017 at 2:46 PM, Michael Matz wrote: > Hi, > > On Mon, 13 Nov 2017, Richard Biener wrote: > >> The main concern here is that GIMPLE is not very well defined for >> aggregate copies and that gimple-fold.c happily optimizes >> memcpy (&a, &b, sizeof (a)) into a = b; > > What you miss

Re: [RFC, PR 80689] Copy small aggregates element-wise

2017-11-13 Thread Michael Matz

Hi, On Mon, 13 Nov 2017, Richard Biener wrote: > The main concern here is that GIMPLE is not very well defined for > aggregate copies and that gimple-fold.c happily optimizes > memcpy (&a, &b, sizeof (a)) into a = b; What you missed to mention is that we then discussed about rectifying this sit

Re: [RFC, PR 80689] Copy small aggregates element-wise

2017-11-13 Thread Richard Biener

On Fri, Nov 3, 2017 at 5:38 PM, Martin Jambor wrote: > Hi, > > On Thu, Oct 26, 2017 at 02:43:02PM +0200, Richard Biener wrote: >> On Thu, Oct 26, 2017 at 2:18 PM, Martin Jambor wrote: >> > >> > Nevertheless, I still intend to experiment with the limit, I sent out >> > this RFC exactly so that I d

Re: [RFC, PR 80689] Copy small aggregates element-wise

2017-11-03 Thread Martin Jambor

Hi, On Thu, Oct 26, 2017 at 02:43:02PM +0200, Richard Biener wrote: > On Thu, Oct 26, 2017 at 2:18 PM, Martin Jambor wrote: > > > > Nevertheless, I still intend to experiment with the limit, I sent out > > this RFC exactly so that I don't spend a lot of time benchmarking > > something that is eve

Re: [RFC, PR 80689] Copy small aggregates element-wise

2017-10-27 Thread Jan Hubicka

> On Thu, Oct 26, 2017 at 2:55 PM, Jan Hubicka wrote: > >> I think the limit should be on the number of generated copies and not > >> the overall size of the structure... If the struct were composed of > >> 32 individual chars we wouldn't want to emit 32 loads and 32 stores... > >> > >> I wonder

Re: [RFC, PR 80689] Copy small aggregates element-wise

2017-10-26 Thread Richard Biener

On Thu, Oct 26, 2017 at 4:38 PM, Richard Biener wrote: > On Thu, Oct 26, 2017 at 2:55 PM, Jan Hubicka wrote: >>> I think the limit should be on the number of generated copies and not >>> the overall size of the structure... If the struct were composed of >>> 32 individual chars we wouldn't want

Re: [RFC, PR 80689] Copy small aggregates element-wise

2017-10-26 Thread Richard Biener

On Thu, Oct 26, 2017 at 2:55 PM, Jan Hubicka wrote: >> I think the limit should be on the number of generated copies and not >> the overall size of the structure... If the struct were composed of >> 32 individual chars we wouldn't want to emit 32 loads and 32 stores... >> >> I wonder how rep; mov

Re: [RFC, PR 80689] Copy small aggregates element-wise

2017-10-26 Thread Michael Matz

Hi, On Thu, 26 Oct 2017, Martin Jambor wrote: > > 35 bytes seems to be much - what is the code-size impact? > > I will find out and report on that. I need at least 32 bytes (four > long ints) to fix imagemagick, where the problematic structure is: Surely the final heuristic should look at the

Re: [RFC, PR 80689] Copy small aggregates element-wise

2017-10-26 Thread Jan Hubicka

> I think the limit should be on the number of generated copies and not > the overall size of the structure... If the struct were composed of > 32 individual chars we wouldn't want to emit 32 loads and 32 stores... > > I wonder how rep; movb; interacts with store to load forwarding? Is > that ma

Re: [RFC, PR 80689] Copy small aggregates element-wise

2017-10-26 Thread Richard Biener

On Thu, Oct 26, 2017 at 2:18 PM, Martin Jambor wrote: > Hi, > > On Tue, Oct 17, 2017 at 01:34:54PM +0200, Richard Biener wrote: >> On Fri, Oct 13, 2017 at 6:13 PM, Martin Jambor wrote: >> > Hi, >> > >> > I'd like to request comments to the patch below which aims to fix PR >> > 80689, which is an

Re: [RFC, PR 80689] Copy small aggregates element-wise

2017-10-26 Thread Martin Jambor

Hi, On Tue, Oct 17, 2017 at 01:34:54PM +0200, Richard Biener wrote: > On Fri, Oct 13, 2017 at 6:13 PM, Martin Jambor wrote: > > Hi, > > > > I'd like to request comments to the patch below which aims to fix PR > > 80689, which is an instance of a store-to-load forwarding stall on > > x86_64 CPUs i

Re: [RFC, PR 80689] Copy small aggregates element-wise

2017-10-17 Thread Richard Biener

On Fri, Oct 13, 2017 at 6:13 PM, Martin Jambor wrote: > Hi, > > I'd like to request comments to the patch below which aims to fix PR > 80689, which is an instance of a store-to-load forwarding stall on > x86_64 CPUs in the Image Magick benchmark, which is responsible for a > slow down of up to 9%

[RFC, PR 80689] Copy small aggregates element-wise

2017-10-13 Thread Martin Jambor

Hi, I'd like to request comments to the patch below which aims to fix PR 80689, which is an instance of a store-to-load forwarding stall on x86_64 CPUs in the Image Magick benchmark, which is responsible for a slow down of up to 9% compared to gcc 6, depending on options and HW used. (Actually, I

28 matches
