While tidying the prototype patch I've done for the reduced testcase
in PR111591, and in that process trying to produce a testcase that
is miscompiled by stack slot coalescing and the TBAA info that
remains unaltered, I've realized we do not need to adjust TBAA info.
The following documents this in
On Tue, 12 Dec 2023, Peter Bergner wrote:
> On 12/12/23 8:36 PM, Jason Merrill wrote:
> > This test is failing for me below C++17, I think you need
> >
> > // { dg-do compile { target c++17 } }
> > or
> > // { dg-require-effective-target c++17 }
>
> Sorry about that. Should we do the above or s
vpbroadcastd/vpbroadcastq is available under TARGET_AVX2, but
vec_dup{v4di,v8si} pattern is available under AVX with memory operand.
And it will cause LRA/Reload to generate spill and reload if we put
constant in register.
Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ready to push to trunk
On Wed, 2023-12-13 at 14:32 +0800, Jiahao Xu wrote:
>
> On 2023/12/13 at 2:21 PM, Xi Ruoyao wrote:
> > On Wed, 2023-12-13 at 14:17 +0800, Jiahao Xu wrote:
> > > This test was extracted from the hot functions of 526.blender_r. Setting
> > > LOGICAL_OP_NON_SHORT_CIRCUIT to 0 resulted in a 26% decrease in dy
These two tests depend on -mabi.
Other toolchain configs would report:
fatal error: gnu/stubs-ilp32.h: No such file or directory
gcc/testsuite/ChangeLog:
* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-7.c: Fix abi issue
* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-4.c: Dit
On 2023/12/13 at 2:21 PM, Xi Ruoyao wrote:
On Wed, 2023-12-13 at 14:17 +0800, Jiahao Xu wrote:
This test was extracted from the hot functions of 526.blender_r. Setting
LOGICAL_OP_NON_SHORT_CIRCUIT to 0 resulted in a 26% decrease in dynamic
instruction count and a 13.4% performance improvement. After a
On Wed, 2023-12-13 at 14:17 +0800, Jiahao Xu wrote:
> This test was extracted from the hot functions of 526.blender_r. Setting
> LOGICAL_OP_NON_SHORT_CIRCUIT to 0 resulted in a 26% decrease in dynamic
> instruction count and a 13.4% performance improvement. After applying
> the patch mentioned a
On Fri, 8 Dec 2023, Haochen Jiang wrote:
> +++ b/htdocs/gcc-13/changes.html
> +Based on ISA extensions enabled on Alder Lake, the switch further enables
> +the AVX-IFMA, AVX-VNNI-INT8, AVX-NE-CONVERT, CMPccXADD, ENQCMD and UINTR
> +ISA extensions.
Personally I would alphabetically sor
On 2023/12/13 at 2:27 AM, Xi Ruoyao wrote:
On Tue, 2023-12-12 at 20:39 +0800, Xi Ruoyao wrote:
On Tue, 2023-12-12 at 19:59 +0800, Jiahao Xu wrote:
I guess here the problem is floating-point compare instruction is much
more costly than other instructions but the fact is not correctly
modeled yet. Cou
Fix VSETVL BUG that AVL is polluted
.L15:
li a3,9
lui a4,%hi(s)
sw a3,%lo(j)(t2)
sh a5,%lo(s)(a4) <-- a4 holds the address of s
beq t0,zero,.L42
sw t5,8(t4)
vsetvli zero,a4,e8,m8,ta,ma <<--- a4 as avl
Actually,
On Mon, 27 Nov 2023, Jiang, Haochen wrote:
>> How about changing this to use "and", as in
>> "The switch enables the AMX-FP16, PREFETCHI ISA extensions."
>> ?
> Ok for me.
Done and pushed thusly.
Gerald
commit 617a25d7d89a9cce121e85b693eed1ee3f94354b
Author: Gerald Pfeifer
Date: Wed Dec 13
On 12/12/23 8:36 PM, Jason Merrill wrote:
> This test is failing for me below C++17, I think you need
>
> // { dg-do compile { target c++17 } }
> or
> // { dg-require-effective-target c++17 }
Sorry about that. Should we do the above or should we just add
-std=c++17 to dg-options? ...or do we ne
On 12/12/23 12:50, Jason Merrill wrote:
On 12/12/23 10:24, Jason Merrill wrote:
On 12/12/23 06:15, Jakub Jelinek wrote:
On Tue, Dec 12, 2023 at 02:13:43PM +0300, Alexander Monakov wrote:
On Tue, 12 Dec 2023, Jakub Jelinek wrote:
On Mon, Dec 11, 2023 at 05:00:50PM -0500, Jason Merrill wrote
On Tue, 12 Dec 2023 19:24:51 PST (-0800), zengx...@eswincomputing.com wrote:
This patch adds a new sub-extension (aka Zvfbfmin) to the
-march= option. It introduces a new data type BF16.
Depending on different usage scenarios, the Zvfbfmin extension may
depend on 'V' or 'Zve32f'. This
I can't actually find anything in the ISA manual that makes Ztso imply
A. In theory the memory ordering is just a different thing than the set
of available instructions (ie, Ztso without A would still imply TSO for
loads and stores). It also seems like a configuration that could be
sane to build
Maciej W. Rozycki wrote:
Add support for the `test_timeout_factor' global variable letting a test
case scale the wait timeout used for code execution. This is useful for
particularly slow test cases for which increasing the wait timeout
globally would be excessive.
* baseboards/qemu.
This patch adds a new sub-extension (aka Zvfbfmin) to the
-march= option. It introduces a new data type BF16.
Depending on different usage scenarios, the Zvfbfmin extension may
depend on 'V' or 'Zve32f'. This patch only implements dependencies
in scenario of Embedded Processor. In scena
[sorry that the previous, unfinished post got through]
On Dec 12, 2023, Richard Biener wrote:
> On Tue, Dec 12, 2023 at 3:03 AM Alexandre Oliva wrote:
>> DECL_NOT_GIMPLE_REG_P (arg) = 0;
> I wonder why you clear this at all?
That code seems to be inherited from expand_thunk.
ISTR that flag w
On Dec 12, 2023, Richard Biener wrote:
> On Tue, Dec 12, 2023 at 3:03 AM Alexandre Oliva wrote:
>> DECL_NOT_GIMPLE_REG_P (arg) = 0;
> I wonder why you clear this at all?
That code seems to be inherited from expand_thunk.
ISTR that flag was not negated when I started the strub implementation,
Hi Jakub & Andrew,
on 2023/12/12 22:42, Jakub Jelinek wrote:
> On Tue, Dec 12, 2023 at 09:33:38AM -0500, Andrew MacLeod wrote:
>> I leave this for the release managers, but I am not opposed to it for this
>> release... It would be nice to remove it for the next release
>
> I can live with it for
On 2023/12/13 at 2:27 AM, Xi Ruoyao wrote:
On Tue, 2023-12-12 at 20:39 +0800, Xi Ruoyao wrote:
fld.s $f1,$r4,0
fld.s $f0,$r4,4
fld.s $f3,$r4,8
fld.s $f2,$r4,12
fcmp.slt.s $fcc1,$f0,$f3
fcmp.sgt.s $fcc0,$f1,$f2
movcf2gr$r
On 12/12/23 17:50, Peter Bergner wrote:
On 12/12/23 1:26 PM, Richard Biener wrote:
On 12.12.2023 at 19:51, Peter Bergner wrote:
On 12/12/23 12:45 PM, Peter Bergner wrote:
+/* PR target/112822 */
Oops, this should be:
/* PR tree-optimization/112822 */
It's fixed on my end.
Ok
Pushed
Hi all,
This patch fixes the testcase failure introduced previously.
Approved by another thread:
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640288.html
Pushed to trunk.
Thx,
Haochen
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr110790-2.c: Change scan-assembler from shrq
> > On the other hand, a new EVEX-capable level might bring earlier adoption
> > of EVEX capabilities to AMD CPUs, which still should be an improvement
> > over AVX2. This could benefit AMD as well. So I would really like to
> > see some AMD feedback here.
> >
> > There's also the matter that tim
On the LoongArch architecture, regression testing with the latest GCC 14
shows FAIL entries for the vector test cases in the vector directory,
caused by mismatched pointer types. In order to solve this kind of
problem, the type of the variable in the check result is modified with
the parameter type
On 2023/12/13 at 2:27 AM, Xi Ruoyao wrote:
fld.s $f1,$r4,0
fld.s $f0,$r4,4
fld.s $f3,$r4,8
fld.s $f2,$r4,12
fcmp.slt.s $fcc1,$f0,$f3
fcmp.sgt.s $fcc0,$f1,$f2
movcf2gr $r13,$fcc1
movcf2gr $r12,$fcc0
o
On Tue, Dec 12, 2023 at 12:22 AM Andrew Pinski wrote:
>
> Ccmp is not used if the result of the and/ior is used by both
> a GIMPLE_COND and a GIMPLE_ASSIGN. This improves the code generation
> here by using ccmp in this case.
> Two changes are required, first we need to allow the outer statement's
Hi,
"Kewen.Lin" writes:
> Hi Jeff,
>
> on 2023/12/11 11:26, Jiufu Guo wrote:
>> Hi,
>>
>> Trunk gcc supports more constants to be built via two instructions:
>> e.g. "li/lis; xori/xoris/rldicl/rldicr/rldic".
>> And then num_insns_constant should also be updated.
>>
>> Function "rs6000_emit_s
Hi,
"Kewen.Lin" writes:
> Hi,
>
> on 2023/12/11 11:26, Jiufu Guo wrote:
>> Hi,
>>
>> For constant building e.g. r120=0x, which does not fit 'li or lis',
>> 'pli' is used to build this constant via 'emit_move_insn'.
>>
>> While for a complicated constant, e.g. 0xULL, w
On 12/12/23 14:29, Jason Xu wrote:
Support was recently added for class-level warmth attributes that are
propagated to member functions. The current implementation ignores
member function templates and this patch fixes that.
Thanks! I'm applying this variant of the patch:
From c762599f112aa3b3
On Tue, Dec 12, 2023 at 10:38 PM Jan Hubicka wrote:
>
> Hi,
> this patch disables use of FMA in matrix multiplication loop for generic (for
> x86-64-v3) and zen4. I tested this on zen4 and Xeon Gold 6212U.
>
> For Intel this is neutral both on the matrix multiplication microbenchmark
> (att
On Tue, 12 Dec 2023, Patrick Palka wrote:
> Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK
> for trunk?
>
> -- >8 --
>
> When unifying constants we need to generally treat constants of
> different types but same value as different, in light of auto template
> parameters. T
Hello-
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110558
This is a small fix for the libcpp issue noted in the PR. Bootstrap +
regtest all languages on x86-64 Linux. Is it ok for trunk please?
Also, it's not a regression, having never worked since __has_include was
introduced in GCC 5, but FWI
On 12/12/23 07:04, Maciej W. Rozycki wrote:
Add support for the `dg-test-timeout-factor' keyword letting a test
case scale the wait timeout used for code execution, analogously to
`dg-timeout-factor' used for code compilation. This is useful for
particularly slow test cases for which increasi
On 12/12/23 07:04, Maciej W. Rozycki wrote:
Add support for the `test_timeout_factor' global variable letting a test
case scale the wait timeout used for code execution. This is useful for
particularly slow test cases for which increasing the wait timeout
globally would be excessive.
On 12/12/23 1:26 PM, Richard Biener wrote:
>> On 12.12.2023 at 19:51, Peter Bergner wrote:
>>
>> On 12/12/23 12:45 PM, Peter Bergner wrote:
>>> +/* PR target/112822 */
>>
>> Oops, this should be:
>>
>> /* PR tree-optimization/112822 */
>>
>> It's fixed on my end.
>
> Ok
Pushed now that Martin
On Fri, Dec 08, 2023 at 11:09:15PM -0500, Jason Merrill wrote:
> On 12/8/23 16:15, Marek Polacek wrote:
> > On Fri, Dec 08, 2023 at 12:09:18PM -0500, Jason Merrill wrote:
> > > On 12/5/23 15:31, Marek Polacek wrote:
> > > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> > > >
> > >
Tested x86_64-linux. Pushed to trunk.
-- >8--
When I added a fast path for std::format("{}", x) in
r14-5587-g41a5ea4cab2c59 I forgot to handle char separately from other
integral types. That caused std::format("{}", 'c') to return "99"
instead of "c".
libstdc++-v3/ChangeLog:
* include/s
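The failure mode described above, a char being printed as its numeric value, can be sketched in isolation. This is only an illustration and not the libstdc++ code: `to_text` and `to_text_buggy` are hypothetical helpers, and the "99" result assumes an ASCII execution character set.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical stand-in for a "{}" fast path over integral types.
template <typename T>
std::string to_text(T value)
{
    static_assert(std::is_integral_v<T>, "sketch handles integral types only");
    if constexpr (std::is_same_v<T, char>)
        return std::string(1, value);   // a char is text, not a number
    else
        return std::to_string(value);   // other integral types print as numbers
}

// What happens when char takes the generic integral path instead.
inline std::string to_text_buggy(char value)
{
    return std::to_string(static_cast<int>(value));
}
```

With these helpers, `to_text('c')` gives "c" while `to_text_buggy('c')` gives the character code as digits, which matches the symptom in the commit message.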
Tested x86_64-linux. Pushed to trunk.
-- >8--
During discussion of LWG 4022 I noticed that we do not correctly
implement floored division for the century. We were just truncating
towards zero, rather than applying the floor function. For negative
values that rounds the wrong way.
libstdc++-v3/Ch
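The difference between truncating and floored division mentioned above can be sketched as follows. This is an illustration only, not the libstdc++ change: `floor_div` is a hypothetical helper and the year values are arbitrary.

```cpp
#include <cassert>

// Hypothetical helper: integer division rounding toward negative infinity.
constexpr int floor_div(int n, int d)
{
    int q = n / d;   // built-in integer division truncates toward zero
    int r = n % d;
    // Step down by one when there is a remainder and the signs differ.
    if (r != 0 && ((r < 0) != (d < 0)))
        --q;
    return q;
}

// For non-negative years both approaches agree.
static_assert(1999 / 100 == 19 && floor_div(1999, 100) == 19, "same result");
// For negative years truncation rounds the wrong way.
static_assert(-1 / 100 == 0, "truncation toward zero");
static_assert(floor_div(-1, 100) == -1, "floor");
```

A year of -1 therefore lands in century -1 under floored division, whereas plain truncation would put it in century 0.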
Tested x86_64-linux. Pushed to trunk.
-- >8--
In r14-4060-gc4baeaecbbf7d0 I moved some files from src/c++98 to
src/c++11 but I didn't remove the redundant -std=gnu++11 flags for those
files. The flags aren't needed now, because AM_CXXFLAGS for that
directory already uses -std=gnu++11. This remove
The BTF specification does not formally define a representation for
forward-declared enum types such as:
enum Foo;
Forward-declarations for struct and union types are represented by
BTF_KIND_FWD, which has a 1-bit flag distinguishing the two.
The de-facto standard format used by other tools li
Given that it's almost verbatim aarch64's implementation and the
general approach appears sensible, LGTM.
Regards
Robin
Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK
for trunk?
-- >8 --
When unifying constants we need to generally treat constants of
different types but same value as different, in light of auto template
parameters. This patch fixes this in a minimal way; it seems we could
ge
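Why constants of the same value but different types have to be kept apart once auto template parameters are involved can be seen in a small C++17 sketch. The names `constant` and `holds_long` are hypothetical and are not taken from the patch or its testcase.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical class template with an `auto` non-type parameter (C++17).
template <auto V>
struct constant
{
    using value_type = decltype(V);
    static constexpr value_type value = V;
};

// Same value, different types: the two specializations are distinct types.
static_assert(!std::is_same_v<constant<1>, constant<1L>>);
static_assert(std::is_same_v<constant<1>::value_type, int>);
static_assert(std::is_same_v<constant<1L>::value_type, long>);

// Deducing V has to preserve the type of the constant as well as its value.
template <auto V>
constexpr bool holds_long(constant<V>)
{
    return std::is_same_v<decltype(V), long>;
}
```

If 1 and 1L were treated as the same constant, `constant<1>` and `constant<1L>` could not be told apart during deduction.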
On Tue, Dec 12, 2023 at 07:29:40PM +, Jason Xu wrote:
> Support was recently added for class-level warmth attributes that are
> propagated to member functions. The current implementation ignores
> member function templates and this patch fixes that.
Thanks for the patch. Is there a bug in the
Spec:
github.com/openhwgroup/core-v-sw/blob/master/specifications/corev-builtin-spec.md
Contributors:
Mary Bennett
Nandni Jamnadas
Pietra Ferreira
Charlie Keaney
Jessica Mills
Craig Blackmore
Simon Cook
Jeremy Bennett
Helene Chelin
gcc/ChangeLog:
* common/config/
On 12/12/23 13:40, Patrick Palka wrote:
Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look
OK for trunk?
OK.
I considered removing the is_overloaded_fn test now as
well, but it could in theory be hit (and not subsumed by the
type_unknown_p test) for e.g. OVERLOAD of a single FU
gcc/ChangeLog:
* config/riscv/constraints.md: CVP2 -> CV_alu_pow2.
* config/riscv/corev.md: Likewise.
---
gcc/config/riscv/constraints.md | 15 ---
gcc/config/riscv/corev.md | 4 ++--
2 files changed, 10 insertions(+), 9 deletions(-)
diff --git a/gcc/config/risc
Thank you for reviewing my patches!
v1 -> v2:
* Bring the MEM into the operand for cv.elw. The new predicate is
move_operand.
* Add comment to riscv.md detailing why corev.md must appear before
the generic riscv instructions.
v2 -> v3:
* Merge patterns for CORE-V branch immediate an
Spec:
github.com/openhwgroup/core-v-sw/blob/master/specifications/corev-builtin-spec.md
Contributors:
Mary Bennett
Nandni Jamnadas
Pietra Ferreira
Charlie Keaney
Jessica Mills
Craig Blackmore
Simon Cook
Jeremy Bennett
Helene Chelin
gcc/ChangeLog:
* common/config/
Support was recently added for class-level warmth attributes that are
propagated to member functions. The current implementation ignores
member function templates and this patch fixes that.
gcc/cp/ChangeLog:
* class.cc (propagate_class_warmth_attribute): fix warmth
propagation f
> On 12.12.2023 at 19:51, Peter Bergner wrote:
>
> On 12/12/23 12:45 PM, Peter Bergner wrote:
>> +/* PR target/112822 */
>
> Oops, this should be:
>
> /* PR tree-optimization/112822 */
>
> It's fixed on my end.
Ok
Richard
> Peter
>
>
>
>
On 12/12/23 12:45 PM, Peter Bergner wrote:
> +/* PR target/112822 */
Oops, this should be:
/* PR tree-optimization/112822 */
It's fixed on my end.
Peter
On 12/12/23 10:50 AM, Martin Jambor wrote:
> The testcase has reasonable size but it is specific to ppc64le and its
> altivec vectors. My plan is to ask the bug reporter to massage it into
> a target specific testcase in bugzilla. Alternatively I can try to
> craft a testcase from scratch but tha
After r14-6455 this no longer fails.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/vect-ftint-no-inexact.c (xfail): Remove.
---
Tested on loongarch64-linux-gnu. Pushed as obvious.
gcc/testsuite/gcc.target/loongarch/vect-ftint-no-inexact.c | 3 +--
1 file changed, 1 insertion(+), 2 d
Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look
OK for trunk? I considered removing the is_overloaded_fn test now as
well, but it could in theory be hit (and not subsumed by the
type_unknown_p test) for e.g. OVERLOAD of a single FUNCTION_DECL. I
wonder if that's something we'd s
On Tue, 2023-12-12 at 20:39 +0800, Xi Ruoyao wrote:
> On Tue, 2023-12-12 at 19:59 +0800, Jiahao Xu wrote:
> > > I guess here the problem is floating-point compare instruction is much
> > > more costly than other instructions but the fact is not correctly
> > > modeled yet. Could you try
> > > http
On 30/11/2023 12:55, Stamatis Markianos-Wright wrote:
Hi Andre,
Thanks for the comments, see latest revision attached.
On 27/11/2023 12:47, Andre Vieira (lists) wrote:
Hi Stam,
Just some comments.
+/* Recursively scan through the DF chain backwards within the basic
block and
+ determin
On 12/12/23 10:24, Jason Merrill wrote:
On 12/12/23 06:15, Jakub Jelinek wrote:
On Tue, Dec 12, 2023 at 02:13:43PM +0300, Alexander Monakov wrote:
On Tue, 12 Dec 2023, Jakub Jelinek wrote:
On Mon, Dec 11, 2023 at 05:00:50PM -0500, Jason Merrill wrote:
In discussion of PR71093 it came up tha
Tested x86_64-pc-linux-gnu, applying to trunk.
-- 8< --
This testcase uses variable templates, a C++14 feature.
gcc/testsuite/ChangeLog:
* g++.dg/ext/is_nothrow_constructible8.C: Require C++14.
---
gcc/testsuite/g++.dg/ext/is_nothrow_constructible8.C | 2 +-
1 file changed, 1 insertion
On 11/29/23 21:10, Joern Rennecke wrote:
I originally computed mmask in carry_backpropagate from XEXP (x, 0),
but abandoned that when I realized we also get called for RTX_OBJ
things. I forgot to adjust the SIGN_EXTEND code, though. Fixed
in the attached revised patch. Also made sure to n
> On 12.12.2023 at 17:50, Martin Jambor wrote:
>
> Hi,
>
> PR 112822 revealed a corner case in load_assign_lhs_subreplacements
> where it creates invalid gimple: an assignment where on the LHS there
> is a complex variable which however is not a gimple register because
> it has partial defs
On Tue, 12 Dec 2023, Richard Biener wrote:
> On Tue, Dec 12, 2023 at 3:38 PM Jan Hubicka wrote:
> >
> > Hi,
> > this patch disables use of FMA in matrix multiplication loop for generic
> > (for
> > x86-64-v3) and zen4. I tested this on zen4 and Xeon Gold 6212U.
> >
> > For Intel this is
Hi,
PR 112822 revealed a corner case in load_assign_lhs_subreplacements
where it creates invalid gimple: an assignment where on the LHS there
is a complex variable which however is not a gimple register because
it has partial defs and on the right hand side there is a
VIEW_CONVERT_EXPR. This patc
>
> This came up in a separate thread as well, but when doing reassoc of a
> chain with
> multiple dependent FMAs.
>
> I can't understand how this uarch detail can affect performance when
> as in the testcase
> the longest input latency is on the multiplication from a memory load.
> Do we actuall
On Mon, 11 Dec 2023, Richard Sandiford wrote:
> > It all seems a bit hackish. I don't think ports have had much success
> > using 'p' through the decades. I think I generally ended up having to
> > go with distinct constraints rather than relying on 'p'.
> >
> > OK for the trunk, but ewww.
>
On 12/12/2023 10:05, Tobias Burnus wrote:
Hi Andrew,
On 11.12.23 18:04, Andrew Stubbs wrote:
This creates a new predefined allocator as a shortcut for using pinned
memory with OpenMP. The name uses the OpenMP extension space and is
intended to be consistent with other OpenMP implementations cu
Alex Coplan writes:
> Hi,
>
> This is a v3 patch which is rebased on top of the SME changes.
> Otherwise it is the same as v2, posted here:
>
> https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639367.html
>
> Bootstrapped/regtested as a series on aarch64-linux-gnu, OK for trunk?
>
> Thanks,
Alex Coplan writes:
> Hi,
>
> This is a v2 version which addresses feedback from Richard's review
> here:
>
> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637648.html
>
> I'll reply inline to address specific comments.
>
> Bootstrapped/regtested on aarch64-linux-gnu, OK for trunk?
>
> T
Robin Dapp writes:
>> - Change the second mode to vec_extract_optab. This is only a name
>> lookup, and it seems more natural to continue using the real element mode.
>
> Am I understanding correctly that this implies we should provide
> a vec_extractbi expander? (with the innermode being BImo
Hi Arthur, Pierre-Emmanuel!
On 2023-12-12T10:39:50+0100, I wrote:
> On 2023-11-27T16:46:08+0100, I wrote:
>> On 2023-11-21T16:20:22+0100, Arthur Cohen wrote:
>>> On 11/20/23 15:55, Thomas Schwinge wrote:
Arthur and Pierre-Emmanuel have prepared a GCC/Rust libgrust-v2/to-submit
branch: <
On 12/12/23 06:15, Jakub Jelinek wrote:
On Tue, Dec 12, 2023 at 02:13:43PM +0300, Alexander Monakov wrote:
On Tue, 12 Dec 2023, Jakub Jelinek wrote:
On Mon, Dec 11, 2023 at 05:00:50PM -0500, Jason Merrill wrote:
In discussion of PR71093 it came up that more clobber_kind options would be
use
The branch-protection types are target specific, not the same on arm
and aarch64. This currently affects pac-ret+b-key, but there will be
a new type on aarch64 that is not relevant for arm.
After the move, change aarch_ identifiers to aarch64_ or arm_ as
appropriate.
gcc/ChangeLog:
* co
> - Change the second mode to vec_extract_optab. This is only a name
> lookup, and it seems more natural to continue using the real element mode.
Am I understanding correctly that this implies we should provide
a vec_extractbi expander? (with the innermode being BImode
here).
Regards
Robin
On Tue, Dec 12, 2023 at 3:38 PM Jan Hubicka wrote:
>
> Hi,
> this patch disables use of FMA in matrix multiplication loop for generic (for
> x86-64-v3) and zen4. I tested this on zen4 and Xeon Gold 6212U.
>
> For Intel this is neutral both on the matrix multiplication microbenchmark
> (atta
On Mon, 11 Dec 2023, Alexandre Oliva wrote:
> On Dec 11, 2023, Joseph Myers wrote:
>
> > On Fri, 8 Dec 2023, Alexandre Oliva wrote:
> >> @@ -20589,7 +20589,7 @@ allocation before or after interprocedural
> >> optimization.
> >> This option enables multilib-aware @code{TFLAGS} to be used to buil
Robin Dapp writes:
> What also works is something like:
>
> scalar_mode extract_mode = innermode;
> if (GET_MODE_CLASS (outermode) == MODE_VECTOR_BOOL)
> extract_mode = smallest_int_mode_for_size
> (GET_MODE_PRECISION (innermode));
>
> however
>
>> So yes,
On Tue, Dec 12, 2023 at 09:33:38AM -0500, Andrew MacLeod wrote:
> I leave this for the release managers, but I am not opposed to it for this
> release... It would be nice to remove it for the next release
I can live with it for GCC 14, so ok, but it is very ugly.
We should fix it in a better way
Hi,
this patch disables use of FMA in matrix multiplication loop for generic (for
x86-64-v3) and zen4. I tested this on zen4 and Xeon Gold 6212U.
For Intel this is neutral both on the matrix multiplication microbenchmark
(attached) and spec2k17 where the difference was within noise for Core
I leave this for the release managers, but I am not opposed to it for
this release... It would be nice to remove it for the next release
Andrew
On 12/12/23 01:07, Kewen.Lin wrote:
Hi,
Gentle ping this:
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639140.html
BR,
Kewen
on 2023/
This patch applies the VLA vs. VLS mode heuristic, which fixes the following FAILs:
FAIL: gcc.target/riscv/rvv/autovec/pr111751.c -O3 -ftree-vectorize
scan-assembler-not vset
FAIL: gcc.target/riscv/rvv/autovec/pr111751.c -O3 -ftree-vectorize
scan-assembler-times li\\s+[a-x0-9]+,0\\s+ret 2
The root ca
> On Dec 7, 2023, Alexandre Oliva wrote:
>
> > Thanks for raising the issue. Maybe there should be at least a comment
> > there, and perhaps some asserts to check that pointer and reference
> > types don't make it to indirect_parms.
>
> Document why attribute access doesn't need the same treatmen
> The following adds no_icf handling for variables where the attribute
> was rejected. It also fixes the check for no_icf by checking both
> the source and the targets decl.
>
> Bootstrap / regtest running on x86_64-unknown-linux-gnu.
>
> This would solve the AVR issue with merging of "progmem"
The following makes sure to also process the (empty) latch when
performing CSE on the if-converted loop body. That's important
to get all uses of copies propagated out on the backedge as well.
To avoid CSE on the PHI nodes themselves, which is prohibitive
(see PR90402) this temporarily adds a fake entr
Add support for the `test_timeout_factor' global variable letting a test
case scale the wait timeout used for code execution. This is useful for
particularly slow test cases for which increasing the wait timeout
globally would be excessive.
* baseboards/qemu.exp (qemu_load): Handle `te
Add support for the `dg-test-timeout-factor' keyword letting a test
case scale the wait timeout used for code execution, analogously to
`dg-timeout-factor' used for code compilation. This is useful for
particularly slow test cases for which increasing the wait timeout
globally would be excessive.
Hi,
This patch quasi-series makes it possible for individual test cases
identified as being slow to request more time via the GCC test harness by
providing a test execution timeout factor, applied to the tool execution
timeout set globally for all the test cases. This is to avoid excessive
t
On Tue, Dec 12, 2023 at 1:08 PM Jiawei wrote:
>
> Supports RISC-V profiles[1] in -march option.
>
> By default, the profile is expected to come before other formal extensions in the input.
>
> V2: Fixes some format errors and adds code comments for parse function
> Thanks for Jeff Law's review and comments.
>
> [1]https:
Yes, no harm in doing that. LGTM.
Regards
Robin
On Tue, 12 Dec 2023, Richard Sandiford wrote:
> Richard Biener writes:
> > The following avoids over/under-read of storage when vectorizing
> > a non-grouped load with SLP. Instead of forcing peeling for gaps
> > use a smaller load for the last vector which might access excess
> > elements. Thi
Richard Biener writes:
> The following avoids over/under-read of storage when vectorizing
> a non-grouped load with SLP. Instead of forcing peeling for gaps
> use a smaller load for the last vector which might access excess
> elements. This builds upon the existing optimization avoiding
> peelin
On Tue, 2023-12-12 at 19:59 +0800, Jiahao Xu wrote:
> > I guess here the problem is floating-point compare instruction is much
> > more costly than other instructions but the fact is not correctly
> > modeled yet. Could you try
> > https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640012.htm
On Tue, Dec 12, 2023 at 10:05 AM Florian Weimer wrote:
>
> * Richard Biener:
>
> > If it were possible I'd axe x86_64-v4. Maybe we should add a x86_64-v3.5
> > that sits in between v3 and v4, offering AVX512 but restricted to 256bit
> > (and obviously not requiring more of the AVX512 features that
> -Original Message-
> From: "Jeff Law"
> Sent: 2023-12-12 00:15:44 (Tuesday)
> To: Jiawei , gcc-patches@gcc.gnu.org
> Cc: kito.ch...@sifive.com, pal...@dabbelt.com, christoph.muell...@vrull.eu
> Subject: Re: [RFC] RISC-V: Support RISC-V Profiles in -march option.
>
>
>
> On 11/20/23 12:14, Jiawei wrote:
>
Hi!
I've noticed
+ERROR: gcc.dg/gomp/pr87887-1.c: syntax error in target selector ".-4" for "
dg-warning 13 "unsupported return type ‘struct S’ for ‘simd’ functions" {
target aarch64*-*-* } .-4 "
+ERROR: gcc.dg/gomp/pr87887-1.c: syntax error in target selector ".-4" for "
dg-warning 13 "unsuppo
Supports RISC-V profiles[1] in -march option.
By default, the profile is expected to come before other formal extensions in the input.
V2: Fixes some format errors and adds code comments for parse function
Thanks for Jeff Law's review and comments.
[1]https://github.com/riscv/riscv-profiles/blob/main/profiles.adoc
gcc
Hi Sandra,
On 07.12.23 16:52, Sandra Loosemore wrote:
This patch introduces enumerators to represent trait-set names and
trait names, which makes it easier to use tables to control other
behavior and for switch statements to dispatch on the tags. The tags
are stored in the same place in the TRE
On Tue, Dec 12, 2023 at 7:12 AM liuhongt wrote:
>
> x86 doesn't support horizontal reduction instructions, reduc_op_scal_m
> is emulated with vec_extract_half + op(half vector length)
> Take that into account when calculating cost for vectorization.
>
> Bootstrapped and regtested on x86_64-pc-linu
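The emulation described above, extracting the upper half and combining it with the lower half until one element remains, can be modelled in scalar code. This is a sketch for counting steps under the assumption of a power-of-two lane count; `reduce_by_halving` is a hypothetical name and the code is not the cost-model change itself.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Scalar model of a halving reduction over N lanes (N a power of two).
// Each outer iteration stands for one "extract half + op" step.
template <std::size_t N>
int reduce_by_halving(std::array<int, N> lanes, int& steps)
{
    static_assert(N != 0 && (N & (N - 1)) == 0, "N must be a power of two");
    steps = 0;
    for (std::size_t width = N / 2; width != 0; width /= 2) {
        // Add the upper half of the live lanes onto the lower half.
        for (std::size_t i = 0; i < width; ++i)
            lanes[i] += lanes[i + width];
        ++steps;
    }
    return lanes[0];
}
```

For N lanes the loop runs log2(N) times, which is the kind of per-step work a cost calculation for the emulated reduction would need to account for.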
On 2023/12/12 at 7:26 PM, Xi Ruoyao wrote:
On Tue, 2023-12-12 at 19:14 +0800, Jiahao Xu wrote:
Define LOGICAL_OP_NON_SHORT_CIRCUIT as 0 so that a short-circuit branch uses the
short-circuit operation instead of the non-short-circuit operation.
This gives a 1.8% improvement in SPECCPU 2017 fprate on 3A60
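The two evaluation strategies being contrasted can be illustrated at the source level. This is a sketch of the idea only, not of the compiler internals or the LoongArch change: the helper names are hypothetical, and the counter exists purely to show how many comparisons run.

```cpp
#include <cassert>

// Counts how many comparisons were actually evaluated.
struct counter { int evaluated = 0; };

inline bool is_less(float a, float b, counter& c)
{
    ++c.evaluated;
    return a < b;
}

// Short-circuit form: the second comparison is skipped when the first fails.
inline bool both_short_circuit(float a, float b, float x, float y, counter& c)
{
    return is_less(a, b, c) && is_less(x, y, c);
}

// Non-short-circuit form: both comparisons always run, then get combined.
inline bool both_eager(float a, float b, float x, float y, counter& c)
{
    bool first = is_less(a, b, c);
    bool second = is_less(x, y, c);
    return first & second;
}
```

Both functions return the same value. They differ only in how many comparisons execute, which matters when the comparison is expensive relative to a branch.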
On Tue, Dec 12, 2023 at 3:03 AM Alexandre Oliva wrote:
>
>
> When generating code for an internal strub wrapper, don't clear the
> DECL_NOT_GIMPLE_REG_P flag of volatile args, and gimplify them both
> before and after any conversion.
>
> While at that, move variable TMP into narrower scopes so tha
This is a mostly straight port from the gcov-19.c tests from the C test
suite. The only notable differences from C to D are that D flips the
true/false outcomes for loop headers, and the D front end ties loop and
ternary conditions to slightly different locus.
The test for >64 conditions warning i