OPYSIGN)
> + /* hypot(copysign(x, y), z) -> hypot(x, z). */
> + (simplify
> + (hypots (copysigns @0 @1) @2)
> + (hypots @0 @2))
> + /* hypot(x, copysign(y, z)) -> hypot(x, y). */
> + (simplify
> + (hypots @0 (copysigns @1 @2))
> + (hypots @0 @1)))
>
> /* copysign(x, CST) -> [-]abs (x). */
> (for copysigns (COPYSIGN_ALL)
>
>
>
>
>
--
Richard Biener
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
arget/aarch64/sve/fneg-abs_3.c
> b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c
> new file mode 100644
> index
> ..1bf34328d8841de8e6b0a5458562a9f00e31c275
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c
> @@ -0,0 +
On Mon, Nov 6, 2023 at 2:02 PM Maxim Blinov wrote:
>
> From: Maxim Blinov
>
> This patch is based on and intended for the
> vendors/riscv/gcc-13-with-riscv-opts branch - please apply if looks OK.
>
> Fixes the following ICEs that I'm seeing:
>
> FAIL: gcc.dg/vect/O3-pr49087.c (internal compiler
The following fixes an oversight in vect_check_scalar_mask when
the mask is external or constant. When doing BB vectorization
we need to provide a group_size, best via an overload accepting
the SLP node as argument.
When fixed we then run into the issue that we have not analyzed
alignment of the
The following simplifies LC-PHI arg population during epilog peeling,
thereby fixing the testcase in this PR.
Bootstrapped and tested on x86_64-unknown-linux-gnu, also built
SPEC CPU 2017 with and without LTO, pushed.
PR tree-optimization/111950
* tree-vect-loop-manip.cc (slpeel_du
On Mon, 6 Nov 2023, Tamar Christina wrote:
> Hi All,
>
> This patch adds initial support for early break vectorization in GCC.
> The support is added for any target that implements a vector cbranch optab,
> this includes both fully masked and non-masked targets.
>
> Depending on the operation, t
The following fixes the mask argument generation for SIMD clone
calls under either loop masking or when the actual call is not
masked but only an inbranch simd clone is available. The issue
was that we tried to directly convert the vector mask to the
call argument type but SIMD clone masks require
On Mon, 6 Nov 2023, Tamar Christina wrote:
> Hi All,
>
> The vectorizer at the moment uses a num_bb check to check for control flow.
> This rejects a number of loops with no reason. Instead this patch changes it
> to check the destination of the exits instead.
>
> This also allows early break t
>
> > gcc/ChangeLog:
> >
> > * config/aarch64/aarch64-simd.md
> > (vec_widen_subl_lo_): Removed.
> > (vec_widen_subl_hi_): Removed.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/aarch64/vect-widen-sub.c: Removed.
>
--
Richard Biener
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
ushed.
2021-01-22 Richard Biener
PR tree-optimization/98786
* tree-ssa-phiopt.c (factor_out_conditional_conversion): Avoid
adding new uses of abnormals. Verify we deal with a conditional
conversion.
* gcc.dg/torture/pr98786.c: New testcase.
---
gcc/test
> + *e = 0;
> + h = n[f.c + 4][0][d];
> + }
> + while (g)
> + return n[0][3][i];
> + while (1)
> + {
> + if (k)
> + {
> + j = 0;
> + if (j)
> + continue;
> + }
> + if (l)
> + break;
> + }
> +}
> + return 0;
> +}
> +
> +int
> +main ()
> +{
> + asm volatile ("" : "+g" (d), "+g" (g), "+g" (f.c));
> + asm volatile ("" : "+g" (e), "+g" (k), "+g" (l));
> + foo ();
> + return 0;
> +}
>
> Jakub
>
>
--
Richard Biener
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> + float a = 0.f;
> + float b = b1 * b2;
> + float c = 2.f;
> + float d = -2.f;
> + if (f1 (a) != -1.f || f1 (b) != -1.f || f1 (c) != 1.f || f1 (d) != -1.f
> + || f2 (a) != 1.f || f2 (b) != 1.f || f2 (c) != 1.f || f2 (d) != -1.f
> + || f3 (a) != -1.f || f3 (b) != -1.f || f3 (c) != -1.f || f3 (d) != 1.f
> + || f4 (a) != 1.f || f4 (b) != 1.f || f4 (c) != -1.f || f4 (d) != 1.f
> + || f5 (a) != 1.f || f5 (b) != 1.f || f5 (c) != -1.f || f5 (d) != 1.f
> + || f6 (a) != -1.f || f6 (b) != -1.f || f6 (c) != -1.f || f6 (d) != 1.f
> + || f7 (a) != 1.f || f7 (b) != 1.f || f7 (c) != 1.f || f7 (d) != -1.f
> + || f8 (a) != -1.f || f8 (b) != -1.f || f8 (c) != 1.f || f8 (d) != -1.f)
> +__builtin_abort ();
> + return 0;
> +}
>
> Jakub
>
>
--
Richard Biener
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
, pushed.
2021-01-22 Richard Biener
PR middle-end/98773
* tree-data-ref.c (initialize_matrix_A): Revert previous
change, retaining failing on HOST_WIDE_INT_MIN CHREC_RIGHT.
* gcc.dg/torture/pr98773.c: New testcase.
---
gcc/testsuite/gcc.dg/torture/pr98773.c | 19
h
> > index e789e4f..773a2b3 100644
> > --- a/gcc/tree-ssa-loop-manip.h
> > +++ b/gcc/tree-ssa-loop-manip.h
> > @@ -55,7 +55,6 @@ extern void tree_transform_and_unroll_loop (class loop *,
> > unsigned,
> > extern void tree_unroll_loop (class loop *, unsigned,
The previous change made AVX512 mask vectors correct but disregarded
the possibility of generic (BLKmode) boolean vectors which are exposed
by the frontends already.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
2021-01-22 Richard Biener
PR middle-end/98793
On January 22, 2021 8:02:28 PM GMT+01:00, Jakub Jelinek
wrote:
>On Mon, Sep 21, 2020 at 10:12:20AM +0200, Richard Biener wrote:
>> On Mon, 21 Sep 2020, Jan Hubicka wrote:
>> > these testcases now fail because they contain an invalid type
>punning
>> > that h
On January 22, 2021 3:49:41 PM GMT+01:00, Jakub Jelinek
wrote:
>Hi!
>
>When GCC is emitting .debug_line or .gnu.debuglto_.debug_line section
>by
>itself (happens either with too old or non-GNU assembler, with
>-gno-as-loc-support or with -flto) on empty translation units, it
>violates
>the DWARF
> case CFN_BUILT_IN_BCMP:
> case CFN_BUILT_IN_MEMCMP:
> - if (!host_size_t_cst_p (arg2, &s2))
> + if (!size_t_cst_p (arg2, &s2))
> return NULL_TREE;
>if (s2 == 0
> && !TREE_SIDE_EFFECTS (arg0)
> @@ -1811,7 +1809,7 @@ fold_const_call (combined_fn fn, tree ty
>return NULL_TREE;
>
> case CFN_BUILT_IN_MEMCHR:
> - if (!host_size_t_cst_p (arg2, &s2))
> + if (!size_t_cst_p (arg2, &s2))
> return NULL_TREE;
>if (s2 == 0
> && !TREE_SIDE_EFFECTS (arg0)
>
> Jakub
>
>
--
Richard Biener
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
On Fri, 22 Jan 2021, Segher Boessenkool wrote:
> On Fri, Jan 22, 2021 at 02:47:06PM +0100, Richard Biener wrote:
> > On Thu, 21 Jan 2021, Segher Boessenkool wrote:
> > > What is holding up this patch still? Ke Wen has pinged it every month
> > > since May, and there
This simplifies vector_element_bits further, avoiding any mode
dependence and instead relying on boolean vector construction
to populate element precision accordingly.
Bootstrapped and tested on x86_64-unknown-linux-gnu (also with
AVX512 with the help of SDE), pushed.
2021-01-25 Richard Biener
$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
>test $ac_status = 0; }; }
> then
> - gcc_cv_as_gdwarf_5_flag=yes
> + if test x$gcc_cv_readelf != x \
> + && $gcc_cv_readelf -wi conftest.o 2>&1 \
> + | grep DW_TAG_compile_un
77298476
> +0100
> @@ -0,0 +1,11 @@
> +/* PR tree-optimization/97260 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> +/* { dg-final { scan-tree-dump "return 0;" "optimized" } } */
> +
> +int
> +foo (void)
On Tue, 26 Jan 2021, Jakub Jelinek wrote:
> On Tue, Jan 26, 2021 at 10:03:16AM +0100, Richard Biener wrote:
> > > In 4.8 and earlier we used to fold the following to 0 during GENERIC
> > > folding,
> > > but we don't do that anymore because ctor_for_folding
On Tue, 26 Jan 2021, Jakub Jelinek wrote:
> On Tue, Jan 26, 2021 at 10:55:35AM +0100, Jan Hubicka wrote:
> > > On Tue, Jan 26, 2021 at 10:03:16AM +0100, Richard Biener wrote:
> > > > > In 4.8 and earlier we used to fold the following to 0 during GENERIC
> > >
On Mon, 25 Jan 2021, Richard Sandiford wrote:
> Richard Biener writes:
> > On Fri, 22 Jan 2021, Segher Boessenkool wrote:
> >
> >> On Fri, Jan 22, 2021 at 02:47:06PM +0100, Richard Biener wrote:
> >> > On Thu, 21 Jan 2021, Segher Boessenkool wrote:
> >
On Tue, 26 Jan 2021, Kewen.Lin wrote:
> Hi Segher/Richard B./Richard S.,
>
> Many thanks for all your help and comments on this!
>
> on 2021/1/25 下午3:56, Richard Biener wrote:
> > On Fri, 22 Jan 2021, Segher Boessenkool wrote:
> >
> >> On Fri, Jan 22, 2021
0 +1,11 @@
> +/* PR tree-optimization/97260 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> +/* { dg-final { scan-tree-dump "return 0;" "optimized" } } */
> +
> +int
> +foo (void)
> +{
> + const char a[] = "1234";
> + return __builtin_memcmp (a, "1234", 4);
> +}
>
>
> Jakub
>
>
--
Richard Biener
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
On Tue, 26 Jan 2021, Jakub Jelinek wrote:
> On Tue, Jan 26, 2021 at 12:16:14PM +0100, Richard Biener wrote:
> > > + /* Unless this is called during FE folding. */
> > > + if (cfun
> > > + && (cfun->curr_properties & (PROP_trees | PRO
On Tue, 26 Jan 2021, Jakub Jelinek wrote:
> On Tue, Jan 26, 2021 at 12:25:16PM +0100, Richard Biener wrote:
> > On Tue, 26 Jan 2021, Jakub Jelinek wrote:
> >
> > > On Tue, Jan 26, 2021 at 12:16:14PM +0100, Richard Biener wrote:
> > > > > + /*
This avoids dumping them as <<< ??? >>>.
Will commit as obvious.
2021-01-26 Richard Biener
* gimple-pretty-print.c (dump_binary_rhs): Handle
VEC_WIDEN_{PLUS,MINUS}_{LO,HI}_EXPR.
---
gcc/gimple-pretty-print.c | 4
1 file changed, 4 insertions(+)
diff
This fixes VECTOR_CST element access with POLY_INT elements and
allows producing dump files of the PR98726 testcase without
ICEing.
Bootstrapped and tested on x86_64-unknown-linux-gnu, OK?
Thanks,
Richard.
2021-01-26 Richard Biener
PR middle-end/98726
* tree.h
On Tue, 26 Jan 2021, Richard Sandiford wrote:
> Richard Biener writes:
> > This fixes VECTOR_CST element access with POLY_INT elements and
> > allows producing dump files of the PR98726 testcase without
> > ICEing.
> >
> > Bootstrapped and tested on x86_64-unkno
the usual places we do
IL verification.
Bootstrapped and tested (with the checker enabled) on
x86_64-unknown-linux-gnu.
OK for trunk?
Thanks,
Richard.
2021-01-27 Richard Biener
* tree-ssa-coalesce.h (verify_ssa_coalescing): Declare.
* tree-ssa-coalesce.c (verify_ssa_coales
testcase, fixing the memory usage regression from old GCC.
Bootstrap and regtest running on x86_64-unknown-linux-gnu, OK?
Thanks,
Richard.
2021-01-27 Richard Biener
PR rtl-optimization/80960
* dse.c (check_mem_read_rtx): Call get_addr on the
offsetted address.
---
gcc
This avoids cases of PHI node vectorization that just causes us
to insert vector CTORs inside loops for values only required
outside of the loop.
Bootstrap and regtest running on x86_64-unknown-linux-gnu.
2021-01-27 Richard Biener
PR tree-optimization/98854
* tree-vect-slp.c
On Wed, 27 Jan 2021, Jakub Jelinek wrote:
> On Wed, Jan 27, 2021 at 03:40:38PM +0100, Richard Biener wrote:
> > The following avoids repeatedly turning VALUE RTXen into
> > sth useful and re-applying a constant offset through get_addr
> > via DSE check_mem_read_rtx. Instead
On Wed, 27 Jan 2021, Jakub Jelinek wrote:
> On Wed, Jan 27, 2021 at 04:16:22PM +0100, Richard Biener wrote:
> > I can check but all immediate first uses of mem_addr are in
> > true_dependence_1 which does x_addr = get_addr (x_addr); as the
> > first thing on it. So the
On Wed, 27 Jan 2021, Jakub Jelinek wrote:
> On Wed, Jan 27, 2021 at 03:40:38PM +0100, Richard Biener wrote:
> > The following avoids repeatedly turning VALUE RTXen into
> > sth useful and re-applying a constant offset through get_addr
> > via DSE check_mem_read_rtx. Instead
al-options "-march=x86-64" { target { i?86-*-* x86_64-*-* }
> } } */
> +
> +void bar (const char *);
> +unsigned long long x;
> +
> +void
> +foo (void)
> +{
> + int a = 1;
> + bar ("foo");
> + int b = 2;
> + __atomic_fetch_add (&x, 1, 0);
> + int c = 3;
> + __builtin_unreachable ();
> +}
>
> Jakub
>
>
--
Richard Biener
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
from PR98144
from 6GB to less than 2GB.
Bootstrapped and tested on x86_64-unknown-linux-gnu.
OK for trunk and branches?
Thanks,
Richard.
2021-01-29 Richard Biener
PR rtl-optimization/98144
* df.h (df_mir_bb_info): Add con_visited member.
* df-problems.c (df_mir_alloc
This fixes overflow of the memory usage estimate in turn failing
to disable itself on WRF with LTO, causing a few GBs worth of
memory peak.
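As a rough illustration of this class of bug (not the actual GCC code; the
function names and the 32-bit estimate type are assumptions made for this
sketch), an estimate computed in a narrow type can wrap around and then
compare as small against the limit, so the guard never fires. Widening
before multiplying avoids that:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical narrow estimate: the product is computed in 32 bits and
   wraps modulo 2^32 for large inputs.  */
uint32_t
estimate_narrow (uint32_t n_blocks, uint32_t bytes_per_block)
{
  return n_blocks * bytes_per_block;
}

/* Same estimate, but widened to 64 bits before multiplying.  */
uint64_t
estimate_wide (uint32_t n_blocks, uint32_t bytes_per_block)
{
  return (uint64_t) n_blocks * bytes_per_block;
}

/* Hypothetical guard: nonzero if the estimate exceeds the limit.  */
int
too_expensive_p (uint64_t estimate, uint64_t limit)
{
  return estimate > limit;
}

/* Small self-check: 100000 * 100000 does not fit in 32 bits.  */
void
self_check (void)
{
  assert (estimate_wide (100000u, 100000u) == 10000000000ULL);
  assert (estimate_narrow (100000u, 100000u) != 10000000000ULL);
}
```

With a limit of 4000000000 bytes the wrapped estimate passes the guard
while the widened one is rejected, which is the kind of silent failure
described above.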
Bootstrap and regtest running on x86_64-unknown-linux-gnu, will apply
as obvious if that succeeds.
Thanks,
Richard.
2021-01-29 Richard Biener
This changes it from bytes to kB since its value is limited to
2147483648.
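To illustrate the arithmetic (a sketch with a hypothetical helper name, not
the code in gcse.c): a parameter capped at roughly 2^31 can describe only
about 2 GiB when it counts bytes, but about 2 TiB when it counts kB,
provided the conversion to bytes is done in a 64-bit type:

```c
#include <assert.h>
#include <stdint.h>

/* Convert a limit given in kB to bytes, widening first so the product
   cannot overflow a 32-bit int.  Hypothetical helper for illustration.  */
uint64_t
limit_kb_to_bytes (int limit_kb)
{
  assert (limit_kb >= 0);
  return (uint64_t) limit_kb * 1024;
}
```

For example, limit_kb_to_bytes (2147483647) is 2199023254528, far beyond
what the same parameter could express in bytes.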
Bootstrap running on x86_64-unknown-linux-gnu, will push to trunk
(but only there).
2021-01-29 Richard Biener
* doc/invoke.texi (--param max-gcse-memory): Document unit
of size.
* gcse.c
This removes adding very expensive DF problems which we do not
use and which somehow cause 5GB of memory to leak.
Bootstrap & regtest running on x86_64-unknown-linux-gnu.
2021-01-29 Richard Biener
PR rtl-optimization/98863
* config/i386/i386-featur
On Fri, 29 Jan 2021, Jan Hubicka wrote:
> > This removes adding very expensive DF problems which we do not
> > use and which somehow cause 5GB of memory to leak.
>
> Impressive :)
> >
> > Bootstrap & regtest running on x86_64-unknown-linux-gnu.
by
one of the DF problems originally removed.
Richard.
> > >
> > > Impressive :)
> > > >
> > > > Bootstrap & regtest running on x86_64-unknown-linux-gnu.
> > > >
> > > > 2021-01-29 Richard Biener
> > > >
> > >
On Fri, 29 Jan 2021, Jakub Jelinek wrote:
> On Fri, Jan 29, 2021 at 04:43:49PM +0100, Richard Biener wrote:
> > 2021-01-29 Richard Biener
> >
> > PR rtl-optimization/98863
> > * config/i386/i386-features.c (remove_partial_avx_dependency):
> >
a16bb6d216ff41d9c6a9da95c19b5c
>> Author: Richard Biener
>> Date: Fri Jan 29 16:02:36 2021 +0100
>>
>> rtl-optimization/98863 - tame i386 specific RPAD pass
>>
>> caused
>>
>> FAIL: gcc.c-torture/compile/20051216-1.c -O1 (internal compiler
>e
On January 30, 2021 10:46:17 AM GMT+01:00, Jakub Jelinek
wrote:
>On Sat, Jan 30, 2021 at 09:17:45AM +0100, Richard Biener wrote:
>> >The following patch fixes it, ok for trunk if it passes
>> >bootstrap/regtest?
>>
>> Hmm, that's odd. Who relies on defer
On January 30, 2021 11:52:20 AM GMT+01:00, Jakub Jelinek
wrote:
>On Sat, Jan 30, 2021 at 11:47:24AM +0100, Richard Biener wrote:
>> OK, so I'd prefer we simply unset the flag after processing deferred
>rescan. I clearly misread the function to do that.
>
>This works too,
This sets DF_RD_PRUNE_DEAD_DEFS like all other uses of the UD/DU
chain problems which makes the RD problem consume a lot less memory.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
2021-02-01 Richard Biener
PR rtl-optimization/98863
* config/i386/i386-features.c
This fixes accounting issues with using auto_vec and auto_bitmap
for -fmem-report.
Bootstrap running on x86_64-unknown-linux-gnu, with and without
--enable-gather-detailed-mem-stats
2021-02-01 Richard Biener
* vec.h (auto_vec::auto_vec): Add memory stat parameters
and pass
On February 1, 2021 8:34:35 PM GMT+01:00, Jeff Law wrote:
>
>
>On 1/28/21 1:09 AM, Richard Biener wrote:
>> On Wed, 27 Jan 2021, Jakub Jelinek wrote:
>>
>>> On Wed, Jan 27, 2021 at 03:40:38PM +0100, Richard Biener wrote:
>>>> The following avoids re
#9 => MEM [(double[3] *)&p2]
> >>>>> # DEBUG p2$0 => D#9
> >>>>> # DEBUG D#8 => MEM [(double[3] *)&p2 + 8B]
> >>>>> # DEBUG p2$1 => D#8
> >>>>> # DEBUG D#7 => MEM [(double[3] *)&p2 + 16B]
> >>>>> # DEBUG p2$2 => D#7
> >>>>> MEM [(double[3] *)&p3] = p3$0_256(D);
> >>>>> MEM [(double[3] *)&p3 + 8B] = p3$1_258(D);
> >>>>> MEM [(double[3] *)&p3 + 16B] = p3$2_260(D);
> >>>>> p3 = .DEFERRED_INIT (p3, 2);
> >>>>> ….
> >>>>> }
> >>>>>
> >>>>> I guess that the above “MEM ….. = …” are the ones that make the
> >>>>> differences. Which phase introduced them?
> >>>>
> >>>> Looks like SRA. But you can just dump all and grep for the first
> >>>> occurrence.
> >>>
> >>> Yes, looks like that SRA is the one:
> >>>
> >>> image.cpp.035t.esra: MEM [(double[3] *)&p1] = p1$0_195(D);
> >>> image.cpp.035t.esra: MEM [(double[3] *)&p1 + 8B] = p1$1_182(D);
> >>> image.cpp.035t.esra: MEM [(double[3] *)&p1 + 16B] = p1$2_185(D);
> >>
> >> I realise no-one was suggesting otherwise, but FWIW: SRA could easily
> >> be extended to handle .DEFERRED_INIT if that's the main source of
> >> excess stack usage. A single .DEFERRED_INIT of an aggregate can
> >> be split into .DEFERRED_INITs of individual components.
> >
> > Thanks a lot for the suggestion,
> > I will study the code of SRA to see how to do this and then see whether
> > this can resolve the issue.
>
>
--
Richard Biener
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
On Mon, 1 Feb 2021, Jakub Jelinek wrote:
> On Mon, Feb 01, 2021 at 12:54:50PM -0700, Jeff Law wrote:
> > >>> So I see no difference for stage2-gcc/*.o dse1/dse2 with/without the
> > >>> patch but counts are _extremely_ small. Statistics:
> > >>>
> > >>> 70148 dse: local deletions = 0, global de
@@ int b(int n, unsigned char *a)
>return d;
> }
>
> -/* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" { xfail *-*-* } }
> } */
> +/* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" { target {
> vect_unpack && { ! vect_no_bitwise } } } } } */
>
> Jakub
>
>
--
Richard Biener
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
um_ops (res_op->code))
> {
> + if (VECTOR_TYPE_P (res_op->type)
> + && cfun
> + && (cfun->curr_properties & PROP_gimple_lvec) != 0
> + && check_gimple_lvec (res_op))
> + return false;
>
On Tue, 2 Feb 2021, Przemyslaw Wirkus wrote:
> > On 2021-01-18 7:50 a.m., Richard Biener wrote:
> > > On Mon, 18 Jan 2021, Przemyslaw Wirkus wrote:
> > >
> > >> Hi all,
> > >>
> > >> Can we backport PR97969 patch to GCC 10 and (maybe)
On Tue, 2 Feb 2021, Jakub Jelinek wrote:
> On Tue, Feb 02, 2021 at 11:06:33AM +0100, Richard Biener wrote:
> > So I fear this only covers parts of the paths simplifications can
> > end up used. Now one question is whether we want to allow
> > "invalid" intermedi
On Tue, 2 Feb 2021, Richard Sandiford wrote:
> Richard Biener writes:
> > On January 30, 2021 11:52:20 AM GMT+01:00, Jakub Jelinek
> > wrote:
> >>On Sat, Jan 30, 2021 at 11:47:24AM +0100, Richard Biener wrote:
> >>> OK, so I'd prefer we simply
This fixes various vec<> memory leaks as discovered compiling 521.wrf_r.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
2021-02-02 Richard Biener
* gimple-loop-interchange.cc (prepare_data_references):
Release vectors.
* gimple-loop
>/* True if this is only suitable for SLP vectorization. */
>bool slp_vect_only_p;
> +
> + /* True if this is a pattern that can only be handled by SLP
> + vectorization. */
> + bool slp_vect_pattern_only_p;
> };
>
> /* Information about a gather/scatter
This fixes more memory leaks as discovered by building 521.wrf_r.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
2021-02-03 Richard Biener
* lto-streamer.c (lto_get_section_name): Free temporary
buffer.
* tree-loop-distribution.c
d and tested on x86_64-unknown-linux-gnu, pushed.
2021-02-04 Richard Biener
PR tree-optimization/98855
* tree-vect-loop.c (vectorizable_phi): Do not cost
single-argument PHIs.
* tree-vect-slp.c (vect_bb_slp_scalar_cost): Likewise.
* tree-vect-st
oop header. Likewise stmts only
reachable from a loop exit can be treated this way.
Bootstrapped and tested on x86_64-unknown-linux-gnu and it fixes
the regression reported in the PR.
Does this look sensible and good enough for GCC 11?
Thanks,
Richard.
2021-02-05 Richard Biener
On Fri, 5 Feb 2021, Richard Sandiford wrote:
> Richard Biener writes:
> > The following attempts to account for the fact that BB vectorization
> > regions now can span multiple loop levels and that an unprofitable
> > inner loop vectorization shouldn't be offset by a
On Fri, 5 Feb 2021, Richard Sandiford wrote:
> Richard Biener writes:
> > On Fri, 5 Feb 2021, Richard Sandiford wrote:
> >> Richard Biener writes:
> >> > + /* First produce cost vectors sorted by loop index. */
> >> > + auto_vec >
>
On Fri, 5 Feb 2021, Richard Biener wrote:
> On Fri, 5 Feb 2021, Richard Sandiford wrote:
>
> > Richard Biener writes:
> > > On Fri, 5 Feb 2021, Richard Sandiford wrote:
> > >> Richard Biener writes:
> > >> > + /* First produce cost vector
On Fri, 5 Feb 2021, Kyrylo Tkachov wrote:
> Hi Richard,
>
> > -----Original Message-----
> > From: Gcc-patches On Behalf Of
> > Richard Biener
> > Sent: 01 October 2020 14:15
> > To: gcc-patches@gcc.gnu.org
> > Subject: [PATCH] tree-optimization/
On Mon, 8 Feb 2021, Kyrylo Tkachov wrote:
>
>
> > -----Original Message-----
> > From: Richard Biener
> > Sent: 05 February 2021 13:51
> > To: Kyrylo Tkachov
> > Cc: gcc-patches@gcc.gnu.org
> > Subject: RE: [PATCH] tree-optimization/97236 - fix bad
in progress,
will push after a non-LTO bootstrap since I used --disable-werror.
2021-02-08 Richard Biener
PR lto/96591
* tree.c (walk_tree_1): Walk VECTOR_CST elements.
* g++.dg/lto/pr96591_0.C: New testcase.
---
gcc/testsuite/g++.dg/lto/pr96591_0.C | 45
11?
The issue is that the preprocessed source does not reproduce the issue and
a mingw development environment is not easily accessible (to me at least).
So unless you can reproduce this in a standard Linux environment and can
provide a testcase I don't see a way to move this bug forward.
Richar
This works around a SLP graph partitioning or cost collecting issue
by being more forgiving in vect_bb_vectorization_profitable_p.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
2021-02-09 Richard Biener
PR tree-optimization/99017
* tree-vect-slp.c
zrng[1]);
> gcc_checking_assert (strlen (s0) + strlen (s1)
> < sizeof sizstr - 4);
> sprintf (sizstr, "[%s, %s]", s0, s1);
> + free (s1);
> }
> + free (s0);
> }
>el
ement type to unsigned int.
This was changed (by accident?) in 0263463dd114 and the following
just reverts that bit.
Bootstrap / regtest pending on x86_64-unknown-linux-gnu, OK?
Thanks,
Richard.
2021-02-09 Richard Biener
* sparseset.h (SPARSESET_ELT_BITS): R
"vpsraq\[ \t]\+\\\$3, %xmm\[0-9]\+,
> %xmm\[0-9]\+" 0 } } */
> /* { dg-final { scan-assembler-times "vpsraw\[ \t]\+\\\$3, %xmm\[0-9]\+,
> %xmm\[0-9]\+" 1 } } */
> /* { dg-final { scan-assembler-times "vpsrld\[ \t]\+\\\$5, %xmm\[0-9]\+,
> %xmm\[0-9]\+"
x86_64-unknown-linux-gnu, will push soon.
2021-02-09 Richard Biener
PR tree-optimization/98863
* tree-ssa-sccvn.h (vn_avail::next_undo): Add.
* tree-ssa-sccvn.c (last_pushed_avail): New global.
(rpo_elim::eliminate_push_avail): Chain pushed avails
On Tue, 9 Feb 2021, Jakub Jelinek wrote:
> On Tue, Feb 09, 2021 at 12:52:55PM +0100, Richard Biener wrote:
> > Yeah, it does look useful in the end. Note that you might want
> > to adjust ix86_add_stmt_cost (or ix86_shift_rotate_cost, that is)
> > to reflect the complex expa
mingw/windows
environment?
Richard.
>
> Qing
>
> > On Feb 9, 2021, at 2:18 AM, Richard Biener wrote:
> >
> > On Mon, 8 Feb 2021, Qing Zhao wrote:
> >
> >> Hi,
> >>
> >> The bug https://gcc.gnu.org/bugzilla/show_bug.cgi?id=9639
and tested on x86_64-unknown-linux-gnu, pushed.
2021-02-10 Richard Biener
PR tree-optimization/99024
* tree-vect-loop.c (_loop_vec_info::~_loop_vec_info): Only
clear loop->aux if it is associated with the destroyed loop_vinfo.
---
gcc/tree-vect-loop.c | 6 +-
1
This makes sure to release the vec<> of callees.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
2021-02-10 Richard Biener
PR ipa/99029
* ipa-pure-const.c (propagate_malloc): Use an auto_vec<>
for callees.
---
gcc/ipa-pure-const.c | 2 +-
1
The optimize pragma/attribute parsing calls decode_cmdline_options_to_array
but doesn't free the array. The following fixes that.
Bootstrapped and tested on x86_64-unknown-linux-gnu, OK?
Thanks,
Richard.
2021-02-10 Richard Biener
gcc/c-family/
* c-common.c (parse_optimize_op
This fixes a leak of the vector returned by find_partition_fixes
by turning it into an auto_vec.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
2021-02-10 Richard Biener
PR rtl-optimization/99054
* cfgrtl.c (find_partition_fixes): Return an auto_vec
On Wed, 10 Feb 2021, Jakub Jelinek wrote:
> On Wed, Feb 10, 2021 at 11:30:42AM +0100, Richard Biener wrote:
> > The optimize pragma/attribute parsing calls decode_cmdline_options_to_array
> > but doesn't free the array. The following fixes that.
> >
> > Bootstrap
?
Thanks,
Richard.
2021-02-10 Richard Biener
PR ipa/97346
* ipa-reference.c (propagate): Always free
reference_vars_to_consider.
(ipa_reference_write_optimization_summary): Free
reference_vars_to_consider before re-allocating it
xt/weak6.C 2021-02-10 13:26:46.439276213 +0100
> @@ -0,0 +1,8 @@
> +// PR c++/99035
> +// { dg-do compile }
> +// { dg-require-weak "" }
> +// { dg-options "-fsyntax-only" }
> +
> +extern void * foo (void);
> +void * foo (void) { return (void *)foo; }
>
This is a regression on trunk and the GCC 10 branch btw.
Bootstrapped and tested on x86_64-unknown-linux-gnu.
Any opinions?
Thanks,
Richard.
2021-02-11 Richard Biener
PR tree-optimization/38474
* params.opt (-param=max-store-chains-to-track=): New param.
(-param=max-store
own alias queries.
Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
2021-02-12 Richard Biener
PR middle-end/38474
* ipa-fnsummary.c (unmodified_parm_1): Only walk when
fbi->aa_walk_budget is bigger than zero. Update
fbi->aa_walk_
On February 13, 2021 9:58:58 AM GMT+01:00, Jakub Jelinek
wrote:
>Hi!
>
>As mentioned in the PR, we have 5 split passes (+ splitting during
>final).
>split1 is before RA and is unconditional,
>split2 is after RA and is gated on optimize > 0,
>split3 is before sched2 and is gated on
>defined(INSN_S
On February 13, 2021 4:07:03 PM GMT+01:00, Jakub Jelinek
wrote:
>On Sat, Feb 13, 2021 at 02:54:38PM +0100, Richard Biener wrote:
>> Ok. But if required splitting is an IL property maybe we can see sth
>like
>> RTL_split_insns, clear it from passes like selsched and
71LL % (1 << b);
> +}
> +
> +/* { dg-final { scan-tree-dump-not " % " "optimized" } } */
> --- gcc/testsuite/gcc.c-torture/execute/pr99079.c.jj 2021-02-12
> 19:29:25.021196283 +0100
> +++ gcc/testsuite/gcc.c-torture/execute/pr99079.c 2021-02-12
> 19:16:31.761892858 +0100
> @@ -0,0 +1,18 @@
> +/* PR tree-optimization/99079 */
> +
> +__attribute__((noipa)) unsigned long long
> +foo (int x)
> +{
> + unsigned long long s = 1 << x;
> + return 4897637220ULL % s;
> +}
> +
> +int
> +main ()
> +{
> + if (__SIZEOF_INT__ * __CHAR_BIT__ != 32)
> +return 0;
> + if (foo (31) != 4897637220ULL)
> +__builtin_abort ();
> + return 0;
> +}
>
> Jakub
>
>
--
Richard Biener
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
on them. I guess no backend can rely on anything else than
df-scan which should be up-to-date or df_live/lr, and without extra
dances it would use df_get_live_out to choose from the two.
Just to get an idea whether it's worth doing the extra df_analyze.
Since we have possibly 5 split passes it
-time for the full testcase in
PR38474 (which then still takes 965s to compile at -O2).
Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
2021-02-16 Richard Biener
PR tree-optimization/38474
* tree-ssa-structalias.c (variable_info::address_taken): New.
// { dg-do compile }
> +// { dg-options "-O2 -Warray-bounds" }
> +
> +typedef int A __attribute__((aligned (64)));
> +void foo (int *);
> +
> +void
> +bar (void)
> +{
> + A b; // { dg-message "while referencing" }
> + int *p = &b;
> + int *x = (p - 1); // { dg-warning "outside array bounds" }
> + foo (x);
> +}
>
>
> Jakub
>
>
--
Richard Biener
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
GERLAKE = PTA_ICELAKE_CLIENT | PTA_MOVDIRI
>| PTA_MOVDIR64B | PTA_CLWB | PTA_AVX512VP2INTERSECT | PTA_KL | PTA_WIDEKL;
> -const wide_int_bitmask PTA_SAPPHIRERAPIDS = PTA_COOPERLAKE | PTA_MOVDIRI
> +constexpr wide_int_bitmask PTA_SAPPHIRERAPIDS = PTA_COOPERLAKE | PTA_MOVDIRI
>| PTA_MOVDIR64B | PTA_AVX512VP2INTERSECT | PTA_ENQCMD | PTA_CLDEMOTE
>| PTA_PTWRITE | PTA_WAITPKG | PTA_SERIALIZE | PTA_TSXLDTRK | PTA_AMX_TILE
>| PTA_AMX_INT8 | PTA_AMX_BF16 | PTA_UINTR | PTA_AVXVNNI;
> -const wide_int_bitmask PTA_ALDERLAKE = PTA_SKYLAKE | PTA_CLDEMOTE |
> PTA_PTWRITE
> - | PTA_WAITPKG | PTA_SERIALIZE | PTA_HRESET | PTA_KL | PTA_WIDEKL |
> PTA_AVXVNNI;
> -const wide_int_bitmask PTA_KNL = PTA_BROADWELL | PTA_AVX512PF | PTA_AVX512ER
> - | PTA_AVX512F | PTA_AVX512CD | PTA_PREFETCHWT1;
> -const wide_int_bitmask PTA_BONNELL = PTA_CORE2 | PTA_MOVBE;
> -const wide_int_bitmask PTA_SILVERMONT = PTA_WESTMERE | PTA_MOVBE | PTA_RDRND
> - | PTA_PRFCHW;
> -const wide_int_bitmask PTA_GOLDMONT = PTA_SILVERMONT | PTA_AES | PTA_SHA |
> PTA_XSAVE
> - | PTA_RDSEED | PTA_XSAVEC | PTA_XSAVES | PTA_CLFLUSHOPT | PTA_XSAVEOPT
> - | PTA_FSGSBASE;
> -const wide_int_bitmask PTA_GOLDMONT_PLUS = PTA_GOLDMONT | PTA_RDPID
> +constexpr wide_int_bitmask PTA_ALDERLAKE = PTA_SKYLAKE | PTA_CLDEMOTE
> + | PTA_PTWRITE | PTA_WAITPKG | PTA_SERIALIZE | PTA_HRESET | PTA_KL
> + | PTA_WIDEKL | PTA_AVXVNNI;
> +constexpr wide_int_bitmask PTA_KNL = PTA_BROADWELL | PTA_AVX512PF
> + | PTA_AVX512ER | PTA_AVX512F | PTA_AVX512CD | PTA_PREFETCHWT1;
> +constexpr wide_int_bitmask PTA_BONNELL = PTA_CORE2 | PTA_MOVBE;
> +constexpr wide_int_bitmask PTA_SILVERMONT = PTA_WESTMERE | PTA_MOVBE
> + | PTA_RDRND | PTA_PRFCHW;
> +constexpr wide_int_bitmask PTA_GOLDMONT = PTA_SILVERMONT | PTA_AES | PTA_SHA
> + | PTA_XSAVE | PTA_RDSEED | PTA_XSAVEC | PTA_XSAVES | PTA_CLFLUSHOPT
> + | PTA_XSAVEOPT | PTA_FSGSBASE;
> +constexpr wide_int_bitmask PTA_GOLDMONT_PLUS = PTA_GOLDMONT | PTA_RDPID
>| PTA_SGX | PTA_PTWRITE;
> -const wide_int_bitmask PTA_TREMONT = PTA_GOLDMONT_PLUS | PTA_CLWB
> +constexpr wide_int_bitmask PTA_TREMONT = PTA_GOLDMONT_PLUS | PTA_CLWB
>| PTA_GFNI | PTA_MOVDIRI | PTA_MOVDIR64B | PTA_CLDEMOTE | PTA_WAITPKG;
> -const wide_int_bitmask PTA_KNM = PTA_KNL | PTA_AVX5124VNNIW
> +constexpr wide_int_bitmask PTA_KNM = PTA_KNL | PTA_AVX5124VNNIW
>| PTA_AVX5124FMAPS | PTA_AVX512VPOPCNTDQ;
>
> #ifndef GENERATOR_FILE
>
> Jakub
>
>
--
Richard Biener
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
tested on x86_64-unknown-linux-gnu, does it
look sane?
Thanks,
Richard.
2021-02-18 Richard Biener
PR middle-end/99122
* ipa-fnsummary.c (analyze_function_body): Set
CIF_FUNCTION_NOT_INLINABLE for VLA parameter calls.
* tree-inline.c (insert_init_debug_bind): Pass
On Thu, 18 Feb 2021, Jakub Jelinek wrote:
> On Thu, Feb 18, 2021 at 01:37:29PM +0100, Richard Biener wrote:
> > The following instructs IPA not to inline calls with VLA parameters
> > and adjusts inlining not to create invalid view-converted VLA
> > parameters on mismatch and
id);
> +struct S { ~S (); };
> +
> +static inline void
> +__attribute__((always_inline))
> +bar (int d)
> +{
> + S s;
> + while (d)
> +foo ();
> +}
> +
> +void
> +baz (void)
> +{
> + bar (2);
> + __builtin_setjmp (b);
> +}
>
> Jakub
>
>
--
Richard Biener
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
This avoids declaring a function with VLA arguments or return values
as inlineable. IPA CP still ICEs, so the testcase has that disabled.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
2021-02-19 Richard Biener
PR middle-end/99122
* tree-inline.c
> ea8a97b01c6371791ac66de3e1dabfedee69cb67..65c2ff867ab41ea70367087dc26fb6eea1375ffb
> 100644
> --- a/gcc/tree-vect-slp.c
> +++ b/gcc/tree-vect-slp.c
> @@ -146,6 +146,16 @@ vect_free_slp_tree (slp_tree node)
> if (child)
>vect_free_slp_tree (child);
>
> + /* If the node defines any SLP only patterns then those patterns are no
> + longer valid and should be removed. */
> + stmt_vec_info rep_stmt_info = SLP_TREE_REPRESENTATIVE (node);
> + if (rep_stmt_info && STMT_VINFO_SLP_VECT_ONLY_PATTERN (rep_stmt_info))
> +{
> + stmt_vec_info stmt_info = vect_orig_stmt (rep_stmt_info);
> + //STMT_VINFO_IN_PATTERN_P (stmt_info) = false;
> + //STMT_SLP_TYPE (stmt_info) = STMT_SLP_TYPE (rep_stmt_info);
> +}
> +
>delete node;
> }
>
> diff --git a/gcc/tree-vectorizer.c b/gcc/tree-vectorizer.c
> index
> 5b45df3a4e00266b7530eb4da6985f0d940cb05b..63ba594f2276850a00fc372072d98326891f19e6
> 100644
> --- a/gcc/tree-vectorizer.c
> +++ b/gcc/tree-vectorizer.c
> @@ -695,6 +695,7 @@ vec_info::new_stmt_vec_info (gimple *stmt)
>STMT_VINFO_REDUC_FN (res) = IFN_LAST;
>STMT_VINFO_REDUC_IDX (res) = -1;
>STMT_VINFO_SLP_VECT_ONLY (res) = false;
> + STMT_VINFO_SLP_VECT_ONLY_PATTERN (res) = false;
>STMT_VINFO_VEC_STMTS (res) = vNULL;
>
>if (is_a (this)
>
>
>
--
Richard Biener
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
This adds a missing accumulation to ret.
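A minimal sketch of the pattern (hypothetical names, not the actual
pass_store_merging code): when a function reports whether anything changed
across several steps, each step's result has to be OR-ed into the return
value rather than dropped or overwritten.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-item step: clamps negative values to zero and
   returns true if it changed the item.  */
bool
process_one (int *item)
{
  if (*item < 0)
    {
      *item = 0;
      return true;
    }
  return false;
}

/* Returns true if any item was changed.  Each step's result is
   accumulated into ret so that an early change is not lost.  */
bool
process_all (int *items, size_t n)
{
  assert (items != NULL || n == 0);
  bool ret = false;
  for (size_t i = 0; i < n; i++)
    ret |= process_one (&items[i]);
  return ret;
}
```

Writing `ret = process_one (...)` instead would report only the last
step, so a change made by an earlier item would go unnoticed by the caller.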
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
2021-02-22 Richard Biener
PR tree-optimization/99165
* gimple-ssa-store-merging.c (pass_store_merging::process_store):
Accumulate changed to ret.
* g++.dg