> Do you know which of the three changes (preferring reps/stosb,
> CLEAR_RATIO and algorithm choice changes) caused the two speedups
> on EEMBC?
An extracted testcase from nnet_test in https://godbolt.org/z/c8KdsohTP
This loop is transformed to builtin_memcpy and builtin_memset with size 280.
Curre
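For readers without the link open, here is a minimal sketch of the kind of loop that GCC's loop-distribution pattern matching may turn into memcpy and memset calls. It is not the testcase from the godbolt link. The function name, the element type and the count of 35 doubles (280 bytes where sizeof(double) is 8) are assumptions chosen only to match the size quoted above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical element count: 35 doubles is 280 bytes when sizeof(double) == 8.
constexpr std::size_t kCount = 35;

// A loop with one copy and one clear per iteration. With optimization
// enabled, a compiler may split it into a block copy and a block clear;
// whether that happens depends on flags and on alias analysis.
void copy_and_clear(double* dst, double* scratch, const double* src) {
    for (std::size_t i = 0; i < kCount; ++i) {
        dst[i] = src[i];    // candidate for a memcpy of 280 bytes
        scratch[i] = 0.0;   // candidate for a memset of 280 bytes
    }
}

// Small runtime check so the sketch can be exercised.
inline void copy_and_clear_check() {
    double src[kCount], dst[kCount], scratch[kCount];
    for (std::size_t i = 0; i < kCount; ++i) {
        src[i] = static_cast<double>(i) + 0.5;
        dst[i] = -1.0;
        scratch[i] = -1.0;
    }
    copy_and_clear(dst, scratch, src);
    for (std::size_t i = 0; i < kCount; ++i) {
        assert(dst[i] == src[i]);
        assert(scratch[i] == 0.0);
    }
}
```

Compiling this with -O2 and inspecting the assembly is one way to see which expansion strategy a given tuning picks for the fixed 280-byte size.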
On Sat, Apr 03, 2021 at 01:53:16AM -0400, Jason Merrill via Gcc-patches wrote:
> We were copying attributes from the template to the instantiation without
> considering that they might be dependent. To make sure that the new parms
> have the appropriate properties for the code pattern, let's just
> > Do you know which of the three changes (preferring reps/stosb,
> > CLEAR_RATIO and algorithm choice changes) caused the two speedups
> > on EEMBC?
>
> An extracted testcase from nnet_test in https://godbolt.org/z/c8KdsohTP
>
> This loop is transformed to builtin_memcpy and builtin_memset with si
As noted in the PR, we were no longer using ST3 for the testcase and
instead stored each lane individually. This is because we'd split
the store group during SLP and couldn't recover when SLP failed.
However, we seem to get better code with ST3 and ST4 even if
SLP would have succeeded, such as fo
Many of the gcc.target/sve/slp-perm*.c tests started failing
after the introduction of separate SLP permute nodes.
This patch adds variable-length support using a similar
technique to vect_transform_slp_perm_load.
As there, the idea is to detect when every permute mask vector
is the same and can b
Since SLP graph partitioning works on scalar stmts (because it's done
for costing) we have to make sure to visit permute nodes multiple
times since they will not pull partitions together.
Bootstrapped / tested on x86_64-unknown-linux-gnu, pushed.
2021-04-06 Richard Biener
PR tree-opti
On Thu, Apr 01, 2021 at 02:16:55PM +0100, Alex Coplan via Gcc-patches wrote:
> FYI, I'm seeing the new test failing on aarch64:
>
> PASS: gcc.dg/pr96573.c (test for excess errors)
> FAIL: gcc.dg/pr96573.c scan-tree-dump optimized "__builtin_bswap"
The vectorizer in the aarch64 case manages to emi
On Tue, 6 Apr 2021, Jakub Jelinek wrote:
> On Thu, Apr 01, 2021 at 02:16:55PM +0100, Alex Coplan via Gcc-patches wrote:
> > FYI, I'm seeing the new test failing on aarch64:
> >
> > PASS: gcc.dg/pr96573.c (test for excess errors)
> > FAIL: gcc.dg/pr96573.c scan-tree-dump optimized "__builtin_bswap
On Tue, Apr 6, 2021 at 12:03 PM Richard Sandiford via Gcc-patches
wrote:
>
> As noted in the PR, we were no longer using ST3 for the testcase and
> instead stored each lane individually. This is because we'd split
> the store group during SLP and couldn't recover when SLP failed.
>
> However, we
On Tue, Apr 6, 2021 at 12:05 PM Richard Sandiford via Gcc-patches
wrote:
>
> Many of the gcc.target/sve/slp-perm*.c tests started failing
> after the introduction of separate SLP permute nodes.
> This patch adds variable-length support using a similar
> technique to vect_transform_slp_perm_load.
>
ping?
On Mon, 29 Mar 2021 at 11:01, Christophe Lyon
wrote:
>
> The previous change to this testcase missed the fact that the data may
> be accessed via an anchor, depending on the optimization level,
> leading to false failures.
>
> This patch restricts matching to upper16:lower16 followed by
> n
The va_arg scans are just too brittle. Let's not be that picky. We have
other tested builtins that are less brittle now anyway.
gcc/testsuite/
* g++.dg/modules/builtin-3_a.C: Remove dump scans.
* g++.dg/modules/builtin-3_b.C: Remove dump scans.
--
Nathan Sidwell
diff
This adds a relevancy check before trying to set the vector def of
a backedge in an unvectorized PHI.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
2021-04-06 Richard Biener
PR tree-optimization/99880
* tree-vect-loop.c (maybe_set_vectorized_backedge_value): Onl
Some attributes like function_return, nocf_check and others are listed as
options for the target attribute. That's not correct, and the following patch
fixes it.
Ready to be installed?
Thanks,
Martin
gcc/ChangeLog:
* doc/extend.texi: Move non-target attributes on the top level.
---
gcc
gcc/ChangeLog:
* doc/invoke.texi: Document minimum and maximum value of the
argument for both supported compression algorithms.
---
gcc/doc/invoke.texi | 10 ++
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index f4
On Tue, Apr 6, 2021 at 2:51 AM Jan Hubicka wrote:
>
> > > Do you know which of the three changes (preferring reps/stosb,
> > > CLEAR_RATIO and algorithm choice changes) caused the two speedups
> > > on EEMBC?
> >
> > An extracted testcase from nnet_test in https://godbolt.org/z/c8KdsohTP
> >
> > This
Eric Botcazou writes:
>> It looks like the latter - I've seen no attempt by the original authors to
>> make the feature work on more targets than they cared for.
>
> On the other hand, if you hide the failures, there is essentially zero chance
> that architecture maintainers pick up the pieces (I
Fix the following warning:
insn-automata.c: In function ‘int maximal_insn_latency(rtx_insn*)’:
insn-automata.c:679:37: warning: array subscript -1 is below array bounds of
‘const unsigned char [19]’ [-Warray-bounds]
679 | return default_latencies[insn_code];
| ~~
A change in Doxygen 1.8.16 means that "// @}" is no longer recognized by
Doxygen, so doesn't close a @{ group. A "///" comment needs to be used.
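As an illustration of the comment form described above, here is a minimal sketch of a Doxygen member group that is opened and closed with "///" comments. The group name and the two helper functions are hypothetical and are not taken from the libstdc++ headers in the ChangeLog.

```cpp
#include <cassert>

/// @defgroup sketch_arith Arithmetic helpers (hypothetical group)
/// @{

/// Return the sum of two integers.
inline int sketch_add(int a, int b) { return a + b; }

/// Return the product of two integers.
inline int sketch_mul(int a, int b) { return a * b; }

// The closing marker below is written as a Doxygen comment ("///").
// Per the message above, a plain "// @}" is no longer recognized
// from Doxygen 1.8.16 onwards, which would leave the group open.
/// @}

// Small runtime check so the sketch can be exercised.
inline void sketch_arith_check() {
    assert(sketch_add(2, 3) == 5);
    assert(sketch_mul(2, 3) == 6);
}
```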
libstdc++-v3/ChangeLog:
* include/bits/atomic_base.h: Fix doxygen group close.
* include/bits/basic_ios.h: Likewise.
* include/b
libstdc++-v3/ChangeLog:
* include/bits/alloc_traits.h: Use markdown for code font.
* include/bits/basic_string.h: Fix @param names.
* include/bits/max_size_type.h: Remove period after @file.
* include/bits/regex.h: Fix duplicate @retval names, and rename.
*
libstdc++-v3/ChangeLog:
* include/bits/move.h (forward): Change static_assert message
to be unambiguous about what must be true.
* testsuite/20_util/forward/c_neg.cc: Adjust dg-error.
* testsuite/20_util/forward/f_neg.cc: Likewise.
Tested powerpc64le-linux. Committ
Add [[nodiscard]] to functions that are effectively just a static_cast,
as per P2351. Also add it to std::addressof.
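A minimal sketch of what the attribute looks like on cast-like helpers, assuming C++17 and using the standard [[nodiscard]] spelling instead of the _GLIBCXX_NODISCARD macro. The namespace "sketch" and the simplified signatures are illustrative and are not the libstdc++ definitions.

```cpp
#include <cassert>
#include <type_traits>

namespace sketch {

// Effectively a static_cast to T&&; discarding the result is almost
// certainly a mistake, so the function is marked [[nodiscard]].
template <typename T>
[[nodiscard]] constexpr T&& forward(std::remove_reference_t<T>& t) noexcept {
    return static_cast<T&&>(t);
}

// Effectively a static_cast to an rvalue reference.
template <typename T>
[[nodiscard]] constexpr std::remove_reference_t<T>&& move(T&& t) noexcept {
    return static_cast<std::remove_reference_t<T>&&>(t);
}

}  // namespace sketch

// Small check so the sketch can be exercised.
inline void nodiscard_check() {
    int x = 1;
    static_assert(std::is_same<decltype(sketch::move(x)), int&&>::value,
                  "move yields an rvalue reference");
    int& ref = sketch::forward<int&>(x);  // result is used, so no warning
    assert(&ref == &x);
}
```

With this in place, a statement such as `sketch::move(x);` on its own would normally draw a compiler warning, since the call has no effect unless its result is used.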
libstdc++-v3/ChangeLog:
* include/bits/move.h (forward, move, move_if_noexcept)
(addressof): Add _GLIBCXX_NODISCARD.
* include/bits/ranges_cmp.h (identity::
On 06/04/21 16:54 +0100, Jonathan Wakely wrote:
https://godbolt.org/z/hTsT96
A change in Doxygen 1.8.16 means that "// @}" is no longer recognized by
Doxygen, so doesn't close a @{ group. A "///" comment needs to be used.
libstdc++-v3/ChangeLog:
* include/bits/atomic_bas
Hi Tobias,
I believe that the attached fixes the problems that you found with
gfc_find_and_cut_at_last_class_ref.
I will test:
type1%type%array_class2 → NULL is returned (why?)
class1%type%array_class2 → ts = class1 but array2_class is used later on (oops!)
class1%...%scalar_class2 → ts
Pinging this..
> -----Original Message-----
> From: Richard Wai
> Sent: March 16, 2021 2:19 PM
> To: 'gcc-patches@gcc.gnu.org'
> Cc: 'Arnaud Charlet' ; 'Bob Duff'
>
> Subject: RE: [PATCH] Ada: hashed container Cursor type predefined equality
> non-conformance
>
> Just a note that I do not have
Hi,
This patch fixes a missing call to va_end in getMatchError in the
front-end, merged from upstream dmd d16195406.
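For context, here is a minimal sketch of the va_start/va_end pairing that such a fix restores. The function format_message is hypothetical and is not the front-end's getMatchError. The buffer size is also an assumption.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>

// Format a message from a printf-style format string.
// Every va_start must be matched by a va_end in the same function
// before it returns; omitting va_end is undefined behavior.
std::string format_message(const char* fmt, ...) {
    char buf[128];  // hypothetical fixed-size buffer; longer output is truncated
    va_list ap;
    va_start(ap, fmt);
    std::vsnprintf(buf, sizeof buf, fmt, ap);
    va_end(ap);  // the call that is easy to forget
    return std::string(buf);
}

// Small runtime check so the sketch can be exercised.
inline void format_message_check() {
    assert(format_message("%d errors in %s", 2, "f") == "2 errors in f");
}
```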
Bootstrapped and regression tested on x86_64-linux-gnu/-m32/-mx32 and
committed to mainline.
Regards,
Iain.
---
gcc/d/ChangeLog:
PR d/99917
* dmd/MERGE: Merge up
Hi,
This patch increments gaggedWarnings count if a warning or deprecation
message was suppressed. Used by the front-end to catch potential errors
in code that is being compiled in a speculative context.
Bootstrapped and regression tested on x86_64-linux-gnu/-m32/-mx32 and
committed to mainline.
Hi,
This patch refactors some code in the code generator to use the
Array::find method to get the index of an element, instead of looping
over the array ourselves.
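The same idea in C++ terms, as a hedged sketch: replace a hand-written index loop with a library search. This uses std::find on a std::vector and is not the D front-end's Array::find. The helper index_of and the not_found sentinel are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sentinel returned when the value is absent.
constexpr std::size_t not_found = static_cast<std::size_t>(-1);

// Return the index of the first element equal to value, or not_found.
std::size_t index_of(const std::vector<int>& v, int value) {
    auto it = std::find(v.begin(), v.end(), value);
    return it == v.end() ? not_found
                         : static_cast<std::size_t>(it - v.begin());
}

// Small runtime check so the sketch can be exercised.
inline void index_of_check() {
    const std::vector<int> v{10, 20, 30};
    assert(index_of(v, 20) == 1);
    assert(index_of(v, 99) == not_found);
}
```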
Bootstrapped and regression tested on x86_64-linux-gnu/-m32/-mx32 and
committed to mainline.
Regards,
Iain.
---
gcc/d/ChangeLog:
Hi,
This patch merges the D front-end implementation with upstream dmd
5cc71ff83, and the Phobos standard library with druntime 1134b710.
D front-end changes:
- Fix ICEs that occurred when using opaque enums.
- Update `pragma(printf)' checking code to work on 16-bit targets.
Phobos change:
Hi,
This patch adds support for demangling function literals as template
value parameters, as well as adding the new bottom type `typeof(*null)'.
Null types were incorrectly being demangled as `none', this has been
fixed to be `typeof(null)'.
Bootstrapped and regression tested on x86_64-linux-gnu
C++17 makes constexpr static data members implicitly inline variables. In
C++14, a subsequent out-of-class declaration is the definition. We want to
continue emitting a symbol for such a declaration in C++17 mode, for ABI
compatibility with C++14 code that wants to refer to it.
Normally I'd dist
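A minimal sketch of the language situation described above, with hypothetical names. The in-class declaration is already a definition in C++17, while the out-of-class line is the definition in C++14 and a redundant (deprecated but valid) redeclaration in C++17.

```cpp
#include <cassert>

struct Limits {
    // C++17: implicitly an inline variable, so this is a definition.
    // C++14: only a declaration with an initializer.
    static constexpr int max_depth = 8;
};

// C++14: this is the definition, needed if max_depth is odr-used.
// C++17: a redundant redeclaration, deprecated but still accepted.
constexpr int Limits::max_depth;

// Taking the address odr-uses the member, so a definition must exist.
inline const int* address_of_max_depth() { return &Limits::max_depth; }

// Small runtime check so the sketch can be exercised.
inline void static_member_check() {
    assert(*address_of_max_depth() == 8);
}
```

C++14 code that odr-uses the member from another translation unit expects the symbol to come from the unit holding the out-of-class line, which is why emitting it still matters for ABI compatibility.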
We were deferring access checks while parsing B{}, didn't adjust that
when we went to instantiate the default member initializer for B::c,
deferred access checking for C::C, and then checked it after parsing
B{}, back in the main() context which has no access. We need to do the
access checks in th
As Jens says in the PR, we handle this correctly.
Tested x86_64-pc-linux-gnu, applying to trunk.
gcc/testsuite/ChangeLog:
PR c++/52202
* g++.dg/cpp0x/rv-life.C: New test.
---
gcc/testsuite/g++.dg/cpp0x/rv-life.C | 12
1 file changed, 12 insertions(+)
create mode 10
print_rtl will dump the rtx_insn from current until LAST. But it is only
useful to see the particular insn that is called by print_rtx_insn_vec.
Let's call print_rtl_single to display that insn in the gcse and store-motion
pass dump.
2021-04-07 Xionghu Luo
gcc/ChangeLog:
* fold-const.c
On Wed, Apr 7, 2021 at 7:42 AM Xionghu Luo wrote:
>
> print_rtl will dump the rtx_insn from current until LAST. But it is only
> useful to see the particular insn that is called by print_rtx_insn_vec.
> Let's call print_rtl_single to display that insn in the gcse and store-motion
> pass dump.
Can y