On May 4, 2021, Prathamesh Kulkarni wrote:
> It looks like constfun's prototype had a typo with missing 2nd param for void
> *.
Ugh, the patch for https://gcc.gnu.org/PR90773 added the param the day
after I retested the patch, and I did not give it yet another spin
before checking it in :-(
>
On Tue, 4 May 2021 at 07:30, Alexandre Oliva wrote:
>
> On May 3, 2021, Richard Biener wrote:
>
> > On Fri, Apr 30, 2021 at 4:42 PM Jeff Law wrote:
> >>
> >>
> >> On 4/28/2021 10:26 PM, Alexandre Oliva wrote:
> >> > On Feb 22, 2021, Richard Biener wrote:
> >> >
> >> >> On Fri, Feb 19, 2021 at
On May 3, 2021, Richard Biener wrote:
> On Fri, Apr 30, 2021 at 4:42 PM Jeff Law wrote:
>>
>>
>> On 4/28/2021 10:26 PM, Alexandre Oliva wrote:
>> > On Feb 22, 2021, Richard Biener wrote:
>> >
>> >> On Fri, Feb 19, 2021 at 9:08 AM Alexandre Oliva wrote:
>> >>> Here's an improved version of t
On Fri, Apr 30, 2021 at 4:42 PM Jeff Law wrote:
>
>
> On 4/28/2021 10:26 PM, Alexandre Oliva wrote:
> > On Feb 22, 2021, Richard Biener wrote:
> >
> >> On Fri, Feb 19, 2021 at 9:08 AM Alexandre Oliva wrote:
> >>> Here's an improved version of the patch. Regstrapped on
> >>> x86_64-linux-gnu, wi
On 4/28/2021 10:26 PM, Alexandre Oliva wrote:
On Feb 22, 2021, Richard Biener wrote:
On Fri, Feb 19, 2021 at 9:08 AM Alexandre Oliva wrote:
Here's an improved version of the patch. Regstrapped on
x86_64-linux-gnu, with and without a patchlet that moved multi-pieces
ahead of setmem, and al
On Feb 22, 2021, Richard Biener wrote:
> On Fri, Feb 19, 2021 at 9:08 AM Alexandre Oliva wrote:
>>
>> Here's an improved version of the patch. Regstrapped on
>> x86_64-linux-gnu, with and without a patchlet that moved multi-pieces
>> ahead of setmem, and also tested with riscv32-elf.
>>
>> Is
On Fri, Feb 19, 2021 at 9:08 AM Alexandre Oliva wrote:
>
> Here's an improved version of the patch. Regstrapped on
> x86_64-linux-gnu, with and without a patchlet that moved multi-pieces
> ahead of setmem, and also tested with riscv32-elf.
>
> Is it ok to install? Or should it wait for stage1?
Here's an improved version of the patch. Regstrapped on
x86_64-linux-gnu, with and without a patchlet that moved multi-pieces
ahead of setmem, and also tested with riscv32-elf.
Is it ok to install? Or should it wait for stage1?
[PR94092] introduce try store by multiple pieces
From: Alexandre
On Tue, Feb 16, 2021 at 11:48 AM Alexandre Oliva wrote:
>
> On Feb 16, 2021, Alexandre Oliva wrote:
>
> >> So I wonder whether we should instead re-run CCP after loop opts which
> >> computes nonzero bits as well instead of the above "hack".
>
> That works. It takes care of both the dest alignme
On Feb 16, 2021, Alexandre Oliva wrote:
>> So I wonder whether we should instead re-run CCP after loop opts which
>> computes nonzero bits as well instead of the above "hack".
That works. It takes care of both the dest alignment and the len ctz.
Explicitly masking out the len tz from nonzero b
On Feb 12, 2021, Richard Biener wrote:
>> + if (TREE_CODE (mem) == SSA_NAME)
>> +if (ptr_info_def *pi = get_ptr_info (mem))
>> + {
>> + unsigned al = get_pointer_alignment (builtin->dst_base);
>> + if (al > pi->align || pi->misalign)
> We still might prefer pi->align == 64
On Thu, Feb 11, 2021 at 11:19 AM Alexandre Oliva wrote:
>
> On Feb 4, 2021, Alexandre Oliva wrote:
>
> > On Feb 4, 2021, Richard Biener wrote:
> >>> > b) if expansion would use BY_PIECES then expand to an unrolled loop
> >>>
> >>> Why would that be better than keeping the constant-length mems
On Feb 11, 2021, Alexandre Oliva wrote:
> How does this look?
> for gcc/ChangeLog
> PR tree-optimization/94092
> * builtins.c (try_store_by_multiple_pieces): New.
> (expand_builtin_memset_args): Use it. If target_char_cast
> fails, proceed as for non-constant val. Pa
On Feb 4, 2021, Alexandre Oliva wrote:
> On Feb 4, 2021, Richard Biener wrote:
>>> > b) if expansion would use BY_PIECES then expand to an unrolled loop
>>>
>>> Why would that be better than keeping the constant-length memset call,
>>> that would be turned into an unrolled loop during expand
On Feb 4, 2021, Jim Wilson wrote:
> FYI we have a bug report for this for a coremark regression which sounds
> like the same problem.
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94092
Indeed, thanks!
--
Alexandre Oliva, happy hacker https://FSFLA.org/blogs/lxo/
Free Software Activ
On Thu, Feb 4, 2021 at 11:18 PM Alexandre Oliva wrote:
>
> On Feb 4, 2021, Richard Biener wrote:
>
> >> > b) if expansion would use BY_PIECES then expand to an unrolled loop
> >>
> >> Why would that be better than keeping the constant-length memset call,
> >> that would be turned into an unroll
On Wed, Jan 27, 2021 at 4:40 AM Alexandre Oliva wrote:
> This patch attempts to fix a libgcc codegen regression introduced in
> gcc-10, as -ftree-loop-distribute-patterns was enabled at -O2.
>
> RISC-V doesn't have any setmemM pattern, so the loops above end up
> "optimized" into memset calls, in
On Feb 4, 2021, Richard Biener wrote:
>> > b) if expansion would use BY_PIECES then expand to an unrolled loop
>>
>> Why would that be better than keeping the constant-length memset call,
>> that would be turned into an unrolled loop during expand?
> Well, because of the possibly lost ctz and
On Wed, Feb 3, 2021 at 4:11 PM Alexandre Oliva wrote:
>
> On Feb 3, 2021, Richard Biener wrote:
>
> > So I think we should try to match what __builtin_memcpy/memset
> > expansion would do here, taking advantage of extra alignment
> > and size knowledge. In particular,
>
> > a) if __builtin_mem
On Feb 3, 2021, Richard Biener wrote:
> So I think we should try to match what __builtin_memcpy/memset
> expansion would do here, taking advantage of extra alignment
> and size knowledge. In particular,
> a) if __builtin_memcpy/memset would use setmem/cpymem optabs
> see if we can have v
On Tue, Feb 2, 2021 at 6:14 PM Alexandre Oliva wrote:
>
> On Jan 28, 2021, Richard Biener wrote:
>
> > That would allow turning back the memset into the original loop (but
> > with optimal IVs, etc.).
>
> Is this sort of what you had in mind?
>
> I haven't tested the inline expansion of memset mu
On Jan 28, 2021, Richard Biener wrote:
> That would allow turning back the memset into the original loop (but
> with optimal IVs, etc.).
Is this sort of what you had in mind?
I haven't tested the inline expansion of memset much yet; and that of
memcpy, not at all; this really is mainly to check
On Thu, Jan 28, 2021 at 6:28 AM Alexandre Oliva wrote:
>
> On Jan 27, 2021, Richard Biener wrote:
>
> > That said, rather than not transforming the loop as you do I'd
> > say we want to re-inline small copies more forcefully during
> > loop distribution code-gen so we turn a loop that sets
> > 3
On Jan 27, 2021, Richard Biener wrote:
> That said, rather than not transforming the loop as you do I'd
> say we want to re-inline small copies more forcefully during
> loop distribution code-gen so we turn a loop that sets
> 3 'short int' to zero into a 'int' store and a 'short' store for exampl
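A minimal sketch of the re-inlining idea quoted above, assuming a 6-byte region of three 16-bit values. The function names are hypothetical and this is not code from the patch; it only contrasts the loop with the "one 'int' store plus one 'short' store" shape.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Loop form: three 16-bit stores, the shape that loop distribution
   may recognize as a 6-byte memset.  */
static void
zero3_loop (int16_t *p)
{
  assert (p != NULL);
  for (int i = 0; i < 3; i++)
    p[i] = 0;
}

/* Piecewise form: one 32-bit store followed by one 16-bit store.
   Constant-size memcpy avoids aliasing and alignment assumptions;
   compilers typically lower each call to a plain store.  */
static void
zero3_pieces (int16_t *p)
{
  const uint32_t z32 = 0;
  const uint16_t z16 = 0;

  assert (p != NULL);
  memcpy (p, &z32, sizeof z32);      /* covers p[0] and p[1] */
  memcpy (p + 2, &z16, sizeof z16);  /* covers p[2] */
}
```

Both functions leave all three elements zero; only the number and width of the stores differ.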
On Wed, Jan 27, 2021 at 2:18 PM Alexandre Oliva wrote:
>
>
> This patch attempts to fix a libgcc codegen regression introduced in
> gcc-10, as -ftree-loop-distribute-patterns was enabled at -O2.
>
>
> The ldist pass turns even very short loops into memset calls. E.g.,
> the TFmode emulation calls
This patch attempts to fix a libgcc codegen regression introduced in
gcc-10, as -ftree-loop-distribute-patterns was enabled at -O2.
The ldist pass turns even very short loops into memset calls. E.g.,
the TFmode emulation calls end with a loop of up to 3 iterations, to
zero out trailing words,
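For illustration only, a sketch of the kind of short loop described above and the memset call it is equivalent to. NWORDS and both function names are hypothetical; this is not the libgcc source, just an assumed shape for "a loop of up to 3 iterations" that zeroes trailing words.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NWORDS 4  /* hypothetical: words in the emulated wide value */

/* Loop form: zero the trailing words one at a time.  With used >= 1
   this runs at most 3 iterations.  */
static void
zero_tail_loop (unsigned long *w, size_t used)
{
  assert (used <= NWORDS);
  for (size_t i = used; i < NWORDS; i++)
    w[i] = 0;
}

/* Call form: the same effect as a single memset with a variable
   length, roughly what the loop turns into once it is recognized
   as a memset pattern.  */
static void
zero_tail_memset (unsigned long *w, size_t used)
{
  assert (used <= NWORDS);
  memset (w + used, 0, (NWORDS - used) * sizeof *w);
}
```

On a target without an inline expansion for variable-length memset, the second form costs a library call for what was at most three word stores.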
26 matches