Re: [PR94092] Re: [RFC] test builtin ratio for loop distribution

2021-05-03 Thread Alexandre Oliva
On May 4, 2021, Prathamesh Kulkarni wrote:
> It looks like constfun's prototype had a typo with missing 2nd param for
> void *.
Ugh, the patch for https://gcc.gnu.org/PR90773 added the param the day after I retested the patch, and I did not give it yet another spin before checking it in :-(

Re: [PR94092] Re: [RFC] test builtin ratio for loop distribution

2021-05-03 Thread Prathamesh Kulkarni via Gcc-patches
On Tue, 4 May 2021 at 07:30, Alexandre Oliva wrote:
> On May 3, 2021, Richard Biener wrote:
> > On Fri, Apr 30, 2021 at 4:42 PM Jeff Law wrote:
> >> On 4/28/2021 10:26 PM, Alexandre Oliva wrote:
> >> > On Feb 22, 2021, Richard Biener wrote:
> >> >> On Fri, Feb 19, 2021 at

Re: [PR94092] Re: [RFC] test builtin ratio for loop distribution

2021-05-03 Thread Alexandre Oliva
On May 3, 2021, Richard Biener wrote:
> On Fri, Apr 30, 2021 at 4:42 PM Jeff Law wrote:
>> On 4/28/2021 10:26 PM, Alexandre Oliva wrote:
>> > On Feb 22, 2021, Richard Biener wrote:
>> >> On Fri, Feb 19, 2021 at 9:08 AM Alexandre Oliva wrote:
>> >>> Here's an improved version of t

Re: [PR94092] Re: [RFC] test builtin ratio for loop distribution

2021-05-03 Thread Richard Biener via Gcc-patches
On Fri, Apr 30, 2021 at 4:42 PM Jeff Law wrote:
> On 4/28/2021 10:26 PM, Alexandre Oliva wrote:
> > On Feb 22, 2021, Richard Biener wrote:
> >> On Fri, Feb 19, 2021 at 9:08 AM Alexandre Oliva wrote:
> >>> Here's an improved version of the patch. Regstrapped on
> >>> x86_64-linux-gnu, wi

Re: [PR94092] Re: [RFC] test builtin ratio for loop distribution

2021-04-30 Thread Jeff Law via Gcc-patches
On 4/28/2021 10:26 PM, Alexandre Oliva wrote:
> On Feb 22, 2021, Richard Biener wrote:
>> On Fri, Feb 19, 2021 at 9:08 AM Alexandre Oliva wrote:
>>> Here's an improved version of the patch. Regstrapped on
>>> x86_64-linux-gnu, with and without a patchlet that moved multi-pieces
>>> ahead of setmem, and al

Re: [PR94092] Re: [RFC] test builtin ratio for loop distribution

2021-04-28 Thread Alexandre Oliva
On Feb 22, 2021, Richard Biener wrote:
> On Fri, Feb 19, 2021 at 9:08 AM Alexandre Oliva wrote:
>> Here's an improved version of the patch. Regstrapped on
>> x86_64-linux-gnu, with and without a patchlet that moved multi-pieces
>> ahead of setmem, and also tested with riscv32-elf.
>>
>> Is

Re: [PR94092] Re: [RFC] test builtin ratio for loop distribution

2021-02-22 Thread Richard Biener via Gcc-patches
On Fri, Feb 19, 2021 at 9:08 AM Alexandre Oliva wrote:
> Here's an improved version of the patch. Regstrapped on
> x86_64-linux-gnu, with and without a patchlet that moved multi-pieces
> ahead of setmem, and also tested with riscv32-elf.
>
> Is it ok to install? Or should it wait for stage1?

[PR94092] Re: [RFC] test builtin ratio for loop distribution

2021-02-19 Thread Alexandre Oliva
Here's an improved version of the patch. Regstrapped on x86_64-linux-gnu, with and without a patchlet that moved multi-pieces ahead of setmem, and also tested with riscv32-elf.
Is it ok to install? Or should it wait for stage1?
[PR94092] introduce try store by multiple pieces
From: Alexandre

Re: [RFC] test builtin ratio for loop distribution

2021-02-16 Thread Richard Biener via Gcc-patches
On Tue, Feb 16, 2021 at 11:48 AM Alexandre Oliva wrote:
> On Feb 16, 2021, Alexandre Oliva wrote:
> >> So I wonder whether we should instead re-run CCP after loop opts which
> >> computes nonzero bits as well instead of the above "hack".
> That works. It takes care of both the dest alignme

Re: [RFC] test builtin ratio for loop distribution

2021-02-16 Thread Alexandre Oliva
On Feb 16, 2021, Alexandre Oliva wrote:
>> So I wonder whether we should instead re-run CCP after loop opts which
>> computes nonzero bits as well instead of the above "hack".
That works. It takes care of both the dest alignment and the len ctz. Explicitly masking out the len tz from nonzero b
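As an aside on the "len ctz" mentioned in this preview: assuming it refers to the count of trailing zero bits of the length, the sketch below shows why that count matters. If the length is known to be a multiple of some power of two, stores of that width can cover it with no narrower tail. The function name `max_store_width` is hypothetical and is not taken from the patch.

```c
#include <stddef.h>

/* Hypothetical illustration, not code from the patch.
   Returns the largest power of two that divides len, i.e. 1 << ctz (len).
   len must be nonzero.  */
size_t max_store_width (size_t len)
{
  /* Two's-complement trick: isolate the lowest set bit.  */
  return len & (~len + 1);
}
```

For a length of 24 bytes this gives 8, so three 8-byte stores would suffice, provided the destination is also suitably aligned.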

Re: [RFC] test builtin ratio for loop distribution

2021-02-15 Thread Alexandre Oliva
On Feb 12, 2021, Richard Biener wrote:
>> +  if (TREE_CODE (mem) == SSA_NAME)
>> +    if (ptr_info_def *pi = get_ptr_info (mem))
>> +      {
>> +        unsigned al = get_pointer_alignment (builtin->dst_base);
>> +        if (al > pi->align || pi->misalign)
> We still might prefer pi->align == 64

Re: [RFC] test builtin ratio for loop distribution

2021-02-12 Thread Richard Biener via Gcc-patches
On Thu, Feb 11, 2021 at 11:19 AM Alexandre Oliva wrote:
> On Feb 4, 2021, Alexandre Oliva wrote:
> > On Feb 4, 2021, Richard Biener wrote:
> >>> > b) if expansion would use BY_PIECES then expand to an unrolled loop
> >>>
> >>> Why would that be better than keeping the constant-length mems

Re: [RFC] test builtin ratio for loop distribution

2021-02-11 Thread Alexandre Oliva
On Feb 11, 2021, Alexandre Oliva wrote:
> How does this look?
> for gcc/ChangeLog
>     PR tree-optimization/94092
>     * builtins.c (try_store_by_multiple_pieces): New.
>     (expand_builtin_memset_args): Use it. If target_char_cast
>     fails, proceed as for non-constant val. Pa

Re: [RFC] test builtin ratio for loop distribution

2021-02-11 Thread Alexandre Oliva
On Feb 4, 2021, Alexandre Oliva wrote:
> On Feb 4, 2021, Richard Biener wrote:
>>> > b) if expansion would use BY_PIECES then expand to an unrolled loop
>>>
>>> Why would that be better than keeping the constant-length memset call,
>>> that would be turned into an unrolled loop during expand

Re: [RFC] test builtin ratio for loop distribution

2021-02-11 Thread Alexandre Oliva
On Feb 4, 2021, Jim Wilson wrote:
> FYI we have a bug report for this for a coremark regression which sounds
> like the same problem.
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94092
Indeed, thanks!
--
Alexandre Oliva, happy hacker https://FSFLA.org/blogs/lxo/
Free Software Activ

Re: [RFC] test builtin ratio for loop distribution

2021-02-05 Thread Richard Biener via Gcc-patches
On Thu, Feb 4, 2021 at 11:18 PM Alexandre Oliva wrote:
> On Feb 4, 2021, Richard Biener wrote:
> >> > b) if expansion would use BY_PIECES then expand to an unrolled loop
> >>
> >> Why would that be better than keeping the constant-length memset call,
> >> that would be turned into an unroll

Re: [RFC] test builtin ratio for loop distribution

2021-02-04 Thread Jim Wilson
On Wed, Jan 27, 2021 at 4:40 AM Alexandre Oliva wrote:
> This patch attempts to fix a libgcc codegen regression introduced in
> gcc-10, as -ftree-loop-distribute-patterns was enabled at -O2.
>
> RISC-V doesn't have any setmemM pattern, so the loops above end up
> "optimized" into memset calls, in

Re: [RFC] test builtin ratio for loop distribution

2021-02-04 Thread Alexandre Oliva
On Feb 4, 2021, Richard Biener wrote:
>> > b) if expansion would use BY_PIECES then expand to an unrolled loop
>>
>> Why would that be better than keeping the constant-length memset call,
>> that would be turned into an unrolled loop during expand?
> Well, because of the possibly lost ctz and

Re: [RFC] test builtin ratio for loop distribution

2021-02-04 Thread Richard Biener via Gcc-patches
On Wed, Feb 3, 2021 at 4:11 PM Alexandre Oliva wrote:
> On Feb 3, 2021, Richard Biener wrote:
> > So I think we should try to match what __builtin_memcpy/memset
> > expansion would do here, taking advantage of extra alignment
> > and size knowledge. In particular,
> > a) if __builtin_mem

Re: [RFC] test builtin ratio for loop distribution

2021-02-03 Thread Alexandre Oliva
On Feb 3, 2021, Richard Biener wrote:
> So I think we should try to match what __builtin_memcpy/memset
> expansion would do here, taking advantage of extra alignment
> and size knowledge. In particular,
> a) if __builtin_memcpy/memset would use setmem/cpymem optabs
> see if we can have v

Re: [RFC] test builtin ratio for loop distribution

2021-02-03 Thread Richard Biener via Gcc-patches
On Tue, Feb 2, 2021 at 6:14 PM Alexandre Oliva wrote:
> On Jan 28, 2021, Richard Biener wrote:
> > That would allow turning back the memset into the original loop (but
> > with optimal IVs, etc.).
> Is this sort of what you had in mind?
>
> I haven't tested the inline expansion of memset mu

Re: [RFC] test builtin ratio for loop distribution

2021-02-02 Thread Alexandre Oliva
On Jan 28, 2021, Richard Biener wrote:
> That would allow turning back the memset into the original loop (but
> with optimal IVs, etc.).
Is this sort of what you had in mind?
I haven't tested the inline expansion of memset much yet; and that of memcpy, not at all; this really is mainly to check

Re: [RFC] test builtin ratio for loop distribution

2021-01-28 Thread Richard Biener via Gcc-patches
On Thu, Jan 28, 2021 at 6:28 AM Alexandre Oliva wrote:
> On Jan 27, 2021, Richard Biener wrote:
> > That said, rather than not transforming the loop as you do I'd
> > say we want to re-inline small copies more forcefully during
> > loop distribution code-gen so we turn a loop that sets
> > 3

Re: [RFC] test builtin ratio for loop distribution

2021-01-27 Thread Alexandre Oliva
On Jan 27, 2021, Richard Biener wrote:
> That said, rather than not transforming the loop as you do I'd
> say we want to re-inline small copies more forcefully during
> loop distribution code-gen so we turn a loop that sets
> 3 'short int' to zero into a 'int' store and a 'short' store for exampl
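To make the quoted suggestion concrete, here is a minimal C sketch of what zeroing three 'short int' with one 'int'-sized store and one 'short'-sized store could look like at the source level. It assumes a 16-bit short; the name `zero3` is hypothetical, and the fixed-size memcpy calls are only a portable way to spell the two stores, not the compiler's actual code-gen path.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical illustration, assuming sizeof (short) == 2.
   Zero three consecutive shorts with one 4-byte store and one 2-byte
   store instead of a loop or a memset call.  */
void zero3 (short *p)
{
  const uint32_t z32 = 0;
  const uint16_t z16 = 0;
  memcpy (p, &z32, sizeof z32);      /* covers p[0] and p[1] */
  memcpy (p + 2, &z16, sizeof z16);  /* covers p[2] */
}
```

Compilers commonly lower small constant-size memcpy calls like these to single store instructions when the target allows it, which is the effect the quoted text asks for.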

Re: [RFC] test builtin ratio for loop distribution

2021-01-27 Thread Richard Biener via Gcc-patches
On Wed, Jan 27, 2021 at 2:18 PM Alexandre Oliva wrote:
> This patch attempts to fix a libgcc codegen regression introduced in
> gcc-10, as -ftree-loop-distribute-patterns was enabled at -O2.
>
> The ldist pass turns even very short loops into memset calls. E.g.,
> the TFmode emulation calls

[RFC] test builtin ratio for loop distribution

2021-01-27 Thread Alexandre Oliva
This patch attempts to fix a libgcc codegen regression introduced in gcc-10, as -ftree-loop-distribute-patterns was enabled at -O2.
The ldist pass turns even very short loops into memset calls. E.g., the TFmode emulation calls end with a loop of up to 3 iterations, to zero out trailing words,
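For readers unfamiliar with the pattern, a loop of the kind described in this preview might look like the sketch below. It is a hypothetical stand-in, not the libgcc source: `clear_tail` and the 4-word buffer are assumptions chosen to give at most 3 iterations when `first` is at least 1.

```c
#include <stddef.h>

/* Hypothetical stand-in for the tail of a soft-float routine.
   Zero the trailing words of a 4-word buffer, starting at index first.
   With first >= 1 this runs at most 3 iterations.  */
void clear_tail (unsigned long *w, size_t first)
{
  for (size_t i = first; i < 4; i++)
    w[i] = 0;
}
```

With GCC 10 or later at -O2, where -ftree-loop-distribute-patterns is enabled, a loop of this shape may be recognized as a memset pattern and replaced by a call; comparing the output of `gcc -O2 -S` with and without `-fno-tree-loop-distribute-patterns` is one way to check on a given target.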