From: Naveen H S
This patch adds support for the scalar_storage_order attribute to the C++ front-end.
It treats opposite-order fields the same way packed fields are
treated, so that they will not bind to references.
For arrays, the attribute applies to the inner type rather than the array
type simila
On Wed, May 24, 2023 at 8:36 PM Alexander Monakov wrote:
>
>
> On Wed, 24 May 2023, Richard Biener via Gcc-patches wrote:
>
> > I’d have to check the ISAs what they actually do here - it of course depends
> > on RTL semantics as well but as you say those are not strictly defined here
> > either.
>
From: Juzhe-Zhong
Currently mode switching generates incorrect code for the following case:
void fn (void);
void f (void * in, void *out, int32_t x, int n, int m)
{
for (int i = 0; i < n; i++) {
vint32m1_t v = __riscv_vle32_v_i32m1 (in + i, 4);
vint32m1_t v2 = __riscv_vle32_v_i32m1_tu (v, in +
On Thu, 25 May 2023 at 01:28, Richard Sandiford
wrote:
>
> Prathamesh Kulkarni writes:
> > On Wed, 24 May 2023 at 15:40, Richard Sandiford
> > wrote:
> >>
> >> Prathamesh Kulkarni writes:
> >> > On Mon, 22 May 2023 at 14:18, Richard Sandiford
> >> > wrote:
> >> >>
> >> >> Prathamesh Kulkarni
On Wed, May 24, 2023 at 5:44 PM Georg-Johann Lay wrote:
>
>
>
> On 24.05.23 at 11:38, Richard Biener wrote:
> > On Tue, May 23, 2023 at 2:56 PM Georg-Johann Lay wrote:
> >>
> >> PR target/104327 not only affects s390 but also avr:
> >> The avr backend pre-sets some options depending on optimizat
On Wed, May 24, 2023 at 4:41 PM Eric Botcazou wrote:
>
> > But nobody is going to understand why the INTEGER_CST case goes the
> > other way.
>
> I can add a fat comment to that effect of course. :-)
>
> > As you say we don't have a good way to say we're doing
> > this to avoid undefined behavior,
Committed, thanks Kito.
Pan
-----Original Message-----
From: Gcc-patches On Behalf
Of Kito Cheng via Gcc-patches
Sent: Thursday, May 25, 2023 12:02 PM
To: juzhe.zh...@rivai.ai
Cc: gcc-patches@gcc.gnu.org; pal...@rivosinc.com; rdapp@gmail.com;
jeffreya...@gmail.com
Subject: Re: [PATCH] RISC
Hi Alexandre,
on 2023/5/24 13:51, Alexandre Oliva wrote:
>
> Codegen changes caused add instruction count mismatches on
> ppc-*-linux-gnu and other 32-bit ppc targets. At some point the
> expected counts were adjusted for lp64, but ilp32 differences
> remained, and published test results confirm
On 24 May 2023 16:09:21 CEST, Qing Zhao wrote:
>Bernhard,
>
>Thanks a lot for your comments.
>
>> On May 19, 2023, at 7:11 PM, Bernhard Reutner-Fischer
>> wrote:
>>
>> On Fri, 19 May 2023 20:49:47 +
>> Qing Zhao via Gcc-patches wrote:
>>
>>> GCC extension accepts the case when a struct wi
NANs don't have bounds, so there's no need to stream them out.
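The idea can be sketched in plain C: a range that is known to be a NAN is fully described by its flags, so the writer emits no bounds for it. This is only an illustration under assumed names (`toy_frange`, `toy_write`, `toy_read`) and an assumed byte layout; it is not GCC's frange class and not the format used by data-streamer-out.cc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a floating-point range; not GCC's frange. */
struct toy_frange {
    bool known_nan;  /* the value is definitely a NAN */
    bool pos_nan;    /* a +NAN is possible */
    bool neg_nan;    /* a -NAN is possible */
    double lo, hi;   /* bounds, meaningful only when !known_nan */
};

/* Write R into BUF (at least 1 + 2 * sizeof (double) bytes).
   Returns the number of bytes written. */
static size_t toy_write(const struct toy_frange *r, unsigned char *buf)
{
    assert(r != NULL && buf != NULL);
    size_t n = 0;
    buf[n++] = (unsigned char)((r->known_nan ? 1 : 0)
                               | (r->pos_nan ? 2 : 0)
                               | (r->neg_nan ? 4 : 0));
    if (!r->known_nan) {
        /* Only non-NAN ranges carry bounds. */
        memcpy(buf + n, &r->lo, sizeof r->lo);
        n += sizeof r->lo;
        memcpy(buf + n, &r->hi, sizeof r->hi);
        n += sizeof r->hi;
    }
    return n;
}

/* Read a range written by toy_write.  Returns the number of bytes read. */
static size_t toy_read(struct toy_frange *r, const unsigned char *buf)
{
    assert(r != NULL && buf != NULL);
    size_t n = 0;
    unsigned char flags = buf[n++];
    r->known_nan = (flags & 1) != 0;
    r->pos_nan = (flags & 2) != 0;
    r->neg_nan = (flags & 4) != 0;
    r->lo = r->hi = 0.0;
    if (!r->known_nan) {
        memcpy(&r->lo, buf + n, sizeof r->lo);
        n += sizeof r->lo;
        memcpy(&r->hi, buf + n, sizeof r->hi);
        n += sizeof r->hi;
    }
    return n;
}
```

In this sketch the reader has to make the same decision as the writer (skip the bounds when the NAN flag is set), which is why both the -in and -out sides change together in the ChangeLog below.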
gcc/ChangeLog:
* data-streamer-in.cc (streamer_read_value_range): Handle NANs.
* data-streamer-out.cc (streamer_write_vrange): Same.
* value-range.h (class vrange): Make streamer_write_vrange a friend.
---
gcc
frange::set() is confusing in that we can set a NAN by specifying a
bound of +-NAN, even though we technically disallow NANs in the setter
because the kind can never be VR_NAN. This is a wart for
get_tree_range(), which builds a range out of a tree from the source,
to work correctly. It's ugly, an
We're ICEing when trying to hash a known NAN. This is unnoticeable
because the only user would be IPA, and even so, it currently doesn't
handle floats. However, handling floats is a flip of a switch, so
it's best to handle them already.
gcc/ChangeLog:
* value-range.cc (add_vrange): Hand
Generalize frange::set_nan() to take a nan_state and make current
set_nan() methods syntactic sugar.
This is in preparation for better streaming of NANs for LTO/IPA.
gcc/ChangeLog:
* value-range.h (frange::set_nan): New.
---
gcc/value-range.h | 32 +---
1 fil
on 2023/5/24 23:20, Carl Love wrote:
> On Wed, 2023-05-24 at 13:32 +0800, Kewen.Lin wrote:
>> on 2023/5/24 06:30, Peter Bergner wrote:
>>> On 5/23/23 12:24 AM, Kewen.Lin wrote:
on 2023/5/23 01:31, Carl Love wrote:
> The builtins were requested for use in GLibC. As of version
> 2.31 th
>> It's highly unlikely we'll switch from the mechanisms we're using.
>>They're pretty deeply embedded into how all the ports are developed and
>>work.
We just took a look at the build file. It seems that define_insn generates
a great many functions.
Is there a chance to optimize this?
I be
Yeah, JoJo is still working on toolchain stuff, just not active on upstream GCC
cc. jojo
On Thu, May 25, 2023 at 12:06 PM Jeff Law wrote:
>
>
>
> On 5/24/23 21:53, Kito Cheng wrote:
> > Jojo has a patch to try to split those things that should help this,
> > but it seems not to have landed.
> >
> > https:
On 5/24/23 21:54, juzhe.zh...@rivai.ai wrote:
>> IIRC LLVM uses a table-driven mechanism, so it has less impact on
>> compilation time as the number of instructions grows.
Oh, I see. Could you share more details?
Maybe we can support this in GCC.
It's highly unlikely we'll swit
On 5/24/23 21:53, Kito Cheng wrote:
Jojo has a patch to try to split those things that should help this,
but it seems not to have landed.
https://patchwork.ozlabs.org/project/gcc/patch/20201104015315.81416-1-jiejie_r...@c-sky.com/
Is JoJo still active? I haven't heard from JoJo in many months, perhaps
LGTM, thanks :)
On Wed, May 24, 2023 at 7:26 PM wrote:
>
> From: Juzhe-Zhong
>
> According to RVV ISA:
> The conversions use the dynamic rounding mode in frm, except for the rtz
> variants, which round towards zero.
>
> So rtz conversion patterns should not have FRM dependency.
>
> We can't sup
>> IIRC LLVM uses a table-driven mechanism, so it has less impact on
>> compilation time as the number of instructions grows.
Oh, I see. Could you share more details?
Maybe we can support this in GCC.
juzhe.zh...@rivai.ai
From: Kito Cheng
Date: 2023-05-25 11:53
To: juzhe.zh...@
Jojo has a patch to try to split those things that should help this,
but it seems not to have landed.
https://patchwork.ozlabs.org/project/gcc/patch/20201104015315.81416-1-jiejie_r...@c-sky.com/
> How about LLVM? Can kito help with this issue?
> LLVM has already supported full intrinsics for a long time an
Besides, we don't have compilation issues when cross-compiling (with segment
intrinsics).
But I do agree we need to address this issue.
As far as I know, GCC compiles insn-emit on a single thread on a single core.
Can we compile it with multiple threads on multiple cores to speed up the compilation?
Thanks.
j
There is a really huge number of segment intrinsics.
Even though I have tried to optimize them, we still have these issues.
How about LLVM? Can Kito help with this issue?
LLVM has supported the full set of intrinsics for a long time without issues.
Thanks.
juzhe.zh...@rivai.ai
From: Jeff Law
Date: 202
On 5/24/23 17:13, Palmer Dabbelt wrote:
On Wed, 24 May 2023 16:12:20 PDT (-0700), Vineet Gupta wrote:
[ ... big snip ... ]
Never mind. Looks like I found the issue - with just trial and error and
no idea of how this stuff works.
The torture-{init,finish} needs to be in riscv.exp not rvv.e
On 2023/5/25 at 10:52 AM, WANG Xuerui wrote:
On 2023/5/25 10:46, Lulu Cheng wrote:
On 2023/5/25 at 4:15 AM, Jason Merrill wrote:
On Wed, May 24, 2023 at 5:00 AM Jonathan Wakely via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
On Wed, 24 May 2023 at 09:41, Xi Ruoyao wrote:
> Wang Lei rais
On Thu, May 25, 2023 at 10:55 AM Hu, Lin1 via Gcc-patches
wrote:
>
> Hi all,
>
> This patch aims to fix incorrect intrinsic signature for
> _mm{512|256|}_s{lli|rai|rli}_epi*. And it has been tested on
> x86_64-pc-linux-gnu. OK for trunk?
>
> BRs,
> Lin
>
> gcc/ChangeLog:
>
> PR target/10
Oops, I forgot to remove it in the previous version; I will wait a while and update
them together.
Pan
From: juzhe.zh...@rivai.ai
Sent: Thursday, May 25, 2023 11:14 AM
To: Li, Pan2 ; gcc-patches
Cc: Kito.cheng ; Li, Pan2 ; Wang,
Yanzhang
Subject: Re: [PATCH v6] RISC-V: Using merge approach to optimi
* machmode.h (VECTOR_BOOL_MODE_P): New macro.
--- a/gcc/machmode.h
+++ b/gcc/machmode.h
@@ -134,6 +134,10 @@ extern const unsigned char mode_class[NUM_MACHINE_MODES];
|| GET_MODE_CLASS (MODE) == MODE_VECTOR_ACCUM \
|| GET_MODE_CLASS (MODE) == MODE_VECTOR_UACCUM)
+/* Nonzero if MODE
Hi Kito,
Updated to PATCH v6 with the refactored framework as below; thanks for the comments.
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/619536.html
Pan
-----Original Message-----
From: Gcc-patches On Behalf
Of Kito Cheng via Gcc-patches
Sent: Wednesday, May 17, 2023 11:52 AM
To: juzhe.zh...@
From: Pan Li
This patch optimizes VLS vector initialization for repeating
sequences, switching from vslide1down to vmerge under a simple
cost model in which every instruction has a cost of 1.
Given code with -march=rv64gcv_zvl256b --param
riscv-autovec-preference=fixed-vlmax
typedef i
Hi, Richard.
After several tries with your testcases (which I have already added to the V15 patch),
I think "using a new IV" would be better than "multiplication".
Now:
loop_len_34 = MIN_EXPR ;
_74 = MIN_EXPR ; --> the multiplication approach will change this
into _74 = loop_len_34 * 2;
loop_len_48 = MIN
From: Ju-Zhe Zhong
This patch supports decrement IV by following the flow designed by Richard:
(1) In vect_set_loop_condition_partial_vectors, for the first iteration of:
call vect_set_loop_controls_directly.
(2) vect_set_loop_controls_directly calculates "step" as in your patch.
If rg
Hi all,
This patch aims to fix incorrect intrinsic signature for
_mm{512|256|}_s{lli|rai|rli}_epi*. And it has been tested on
x86_64-pc-linux-gnu. OK for trunk?
BRs,
Lin
gcc/ChangeLog:
PR target/109173
PR target/109174
* config/i386/avx512bwintrin.h (_mm512_srli_epi16)
On 2023/5/25 10:46, Lulu Cheng wrote:
On 2023/5/25 at 4:15 AM, Jason Merrill wrote:
On Wed, May 24, 2023 at 5:00 AM Jonathan Wakely via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
On Wed, 24 May 2023 at 09:41, Xi Ruoyao wrote:
> Wang Lei raised some concerns about Itanium C++ ABI,
On 2023/5/25 at 4:15 AM, Jason Merrill wrote:
On Wed, May 24, 2023 at 5:00 AM Jonathan Wakely via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
On Wed, 24 May 2023 at 09:41, Xi Ruoyao wrote:
> Wang Lei raised some concerns about Itanium C++ ABI, so let's
ask a C++
> expert her
Hi,
This is the 8th version of the patch, rebased on the latest trunk.
This is an important patch needed by the Linux kernel security project.
Compared to the 7th version, the major changes are:
1. update the documentation wording based on Joseph's suggestions.
2. change the name of the new ma
on a structure with a C99 flexible array member being nested in
another structure.
"The GCC extension accepts a structure containing an ISO C99 "flexible array
member", or a union containing such a structure (possibly recursively)
to be a member of a structure.
There are two situations:
* A
GCC extension accepts the case when a struct with a C99 flexible array member
is embedded into another struct or union (possibly recursively) as the last
field.
__builtin_object_size should treat such struct as flexible size.
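The layout being discussed can be sketched as follows. This is a minimal illustration with hypothetical type and function names (`packet`, `envelope`, `envelope_alloc`); it relies on the GCC extension described above and is not taken from the patch or its testsuite.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* A struct with a C99 flexible array member. */
struct packet {
    size_t len;
    char data[];
};

/* GNU extension: the struct above embedded as the last field of another
   struct.  The storage for p.data lives past the end of struct envelope. */
struct envelope {
    int id;
    struct packet p;
};

/* Allocate an envelope with room for LEN bytes of payload.
   Returns NULL if the allocation fails. */
static struct envelope *envelope_alloc(int id, size_t len)
{
    struct envelope *e = malloc(sizeof *e + len);
    if (e == NULL)
        return NULL;
    e->id = id;
    e->p.len = len;
    return e;
}
```

Because the payload extends past `sizeof (struct envelope)`, an object-size query on such an object should not be bounded by the static size of the outer struct, which is the "flexible size" treatment the description asks for.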
gcc/c/ChangeLog:
PR tree-optimization/101832
* c-decl.c
> > +rewrite_expr_tree_parallel (gassign *stmt, int width, bool has_fma,
> > +const vec
> > +&ops)
> > {
> >enum tree_code opcode = gimple_assign_rhs_code (stmt);
> >int op_num = ops.length ();
> > @@ -5483,10 +5494,11 @@ rewrite_expr_tree_parallel (
From: Lili Cui
Make some changes in the reassoc pass to make it more friendly to the later fma pass.
Using FMA instead of mult + add reduces register pressure and the number of
instructions retired.
There are mainly two changes:
1. Put no-mult ops and mult ops alternately at the end of the queue, which is
conducive to g
On 5/24/23 17:12, Vineet Gupta wrote:
On 5/24/23 15:13, Vineet Gupta wrote:
PASS: gcc.target/riscv/zmmul-2.c -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects (test for excess errors)
PASS: gcc.target/riscv/zmmul-2.c -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects scan-assembl
This patch tries to prevent generating unnecessary sign extension
after *w instructions like "addiw" or "divw".
The main idea of it is to add SUBREG_PROMOTED fields during expanding.
I have tested it on SPEC2017 and there is no regression.
Only gcc.dg/pr30957-1.c test failed.
To solve that I did some c
On Wed, 24 May 2023 16:12:20 PDT (-0700), Vineet Gupta wrote:
On 5/24/23 15:13, Vineet Gupta wrote:
PASS: gcc.target/riscv/zmmul-2.c -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects (test for excess errors)
PASS: gcc.target/riscv/zmmul-2.c -O2 -flto -fuse-linker-plugin
-fno-fat-lto-obj
On 5/24/23 15:13, Vineet Gupta wrote:
PASS: gcc.target/riscv/zmmul-2.c -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects (test for excess errors)
PASS: gcc.target/riscv/zmmul-2.c -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects scan-assembler-times mul\t 1
PASS: gcc.target/riscv/z
On 5/24/23 13:34, Thomas Schwinge wrote:
Yeah, at this point I'm not sure whether my recent changes really are
related/relevant here.
Apparently in addition to Kito's patch below, if I comment out the
additional torture options, failures go down drastically.
Meaning that *all* those ERRORs dis
By having an ssa_cache inherit from a range_query, and then providing a
range_of_expr routine which returns the current global value, we open up
the possibility of folding statements and doing other interesting things
with an ssa-cache.
In particular, you can now call fold_range() with an ssa
This patch provides the framework for a gimple-range phi analyzer.
Currently, the primary purpose is to give better initial values for
members of a "phi group".
A PHI group is defined as a group of PHI nodes whose arguments are all
either members of the same PHI group, or one of 2 other valu
This tweaks some of the fold_stmt routines and helpers, in particular
the ones to which you provide a vector of ranges to satisfy any ssa-names.
Previously, once the vector was depleted, any remaining values were
picked up from the default get_global_range_query() query. It is useful
to be abl
I originally implemented the lazy ssa cache by inheriting from an
ssa_cache in protected mode and providing the required routines. This
makes it a little awkward to do various things, and they also become not
quite as interchangeable as I'd like. Making the routines virtual and
using proper i
On Tue, 2023-05-23 at 09:34 +, Christophe Lyon wrote:
> The gcc.dg/analyzer/data-model-4.c and
> gcc.dg/analyzer/torture/conftest-1.c fail with recent glibc headers
> and succeed with older headers.
>
> The new error message is:
> warning: use of possibly-NULL 'f' where non-null expected [CWE-
Hi!
On 2023-05-24T11:18:35-0700, Vineet Gupta wrote:
> On 5/22/23 20:52, Vineet Gupta wrote:
>> On 5/22/23 02:17, Kito Cheng wrote:
>>> Ooops, seems still some issue around here,
>>
>> Yep still 5000 fails :-(
>>
>>> but I found something might
>>> related this issue:
>>>
>>> https://github.com
On Wed, May 24, 2023 at 5:00 AM Jonathan Wakely via Gcc-patches <
gcc-patches@gcc.gnu.org> wrote:
> On Wed, 24 May 2023 at 09:41, Xi Ruoyao wrote:
>
> > Wang Lei raised some concerns about Itanium C++ ABI, so let's ask a C++
> > expert here...
> >
> > Jonathan: AFAIK the standard and the Itanium
I'll look at the samples tomorrow, but just to address one thing:
钟居哲 writes:
>>> What gives the best code in these cases? Is emitting a multiplication
>>> better? Or is using a new IV better?
> Could you give me more detail information about "new refresh IV" approach.
> I'd like to try that.
Prathamesh Kulkarni writes:
> On Wed, 24 May 2023 at 15:40, Richard Sandiford
> wrote:
>>
>> Prathamesh Kulkarni writes:
>> > On Mon, 22 May 2023 at 14:18, Richard Sandiford
>> > wrote:
>> >>
>> >> Prathamesh Kulkarni writes:
>> >> > Hi Richard,
>> >> > Thanks for the suggestions. Does the att
On Wed, 24 May 2023 at 15:40, Richard Sandiford
wrote:
>
> Prathamesh Kulkarni writes:
> > On Mon, 22 May 2023 at 14:18, Richard Sandiford
> > wrote:
> >>
> >> Prathamesh Kulkarni writes:
> >> > Hi Richard,
> >> > Thanks for the suggestions. Does the attached patch look OK ?
> >> > Boostrap+tes
On 21/05/2023 at 22:48, Harald Anlauf via Fortran wrote:
Dear all,
checking and simplification of the RESHAPE intrinsic could fail in
various ways for sufficiently complicated arguments, like array
constructors. Debugging revealed that in these cases we determined
that the array arguments wer
On 24/05/2023 at 21:16, Harald Anlauf via Fortran wrote:
Dear all,
the attached almost obvious patch fixes an ICE on invalid that may
occur when we attempt to simplify an initialization expression with
SIZE for an out-of-range DIM argument. Returning gfc_bad_expr
allows for a more graceful er
Hi Lipeng,
May I know any comment or concern on this patch, thanks for your time 😄
Thanks for your patience in getting this reviewed.
A few remarks / questions.
Which strategy is used in this implementation, read-preferring or
write-preferring? And if read-preferring is used, is there
a dan
Dear all,
the attached almost obvious patch fixes an ICE on invalid that may
occur when we attempt to simplify an initialization expression with
SIZE for an out-of-range DIM argument. Returning gfc_bad_expr
allows for a more graceful error recovery.
Regtested on x86_64-pc-linux-gnu. OK for main
Middle-end folks: any thoughts about how best to make the change described in
the last paragraph below?
Library folks: any thoughts on the changes to __cxa_call_terminate?
-- 8< --
[except.handle]/7 says that when we enter std::terminate due to a throw,
that is considered an active handler. We
gcc/ChangeLog:
* value-range.h (vrange::kind): Remove.
---
gcc/value-range.h | 3 ---
1 file changed, 3 deletions(-)
diff --git a/gcc/value-range.h b/gcc/value-range.h
index 936eb175062..b8cc2a0e76a 100644
--- a/gcc/value-range.h
+++ b/gcc/value-range.h
@@ -100,9 +100,6 @@ public:
boo
On Wed, 24 May 2023, Richard Biener via Gcc-patches wrote:
> I’d have to check the ISAs what they actually do here - it of course depends
> on RTL semantics as well but as you say those are not strictly defined here
> either.
Plus, we can add the following executable test to the testsuite:
#in
+CC Thomas and Maciej
On 5/22/23 20:52, Vineet Gupta wrote:
On 5/22/23 02:17, Kito Cheng wrote:
Ooops, seems still some issue around here,
Yep still 5000 fails :-(
but I found something might
related this issue:
https://github.com/gcc-mirror/gcc/commit/d6654a4be3ba44c0d57be7c8a51d76d9721
My understanding is that GCC's preferred null value for rtx is NULL_RTX
(and for tree is NULL_TREE), and by being typed allows strict type checking,
and use with function polymorphism and template instantiation.
C++'s nullptr is preferred over NULL and 0 for pointer types that don't
have a defined
Hello All:
This patch improves the code sinking pass to sink statements before calls to reduce
register pressure.
Review comments are incorporated.
For example :
void bar();
int j;
void foo(int a, int b, int c, int d, int e, int f)
{
int l;
l = a + b + c + d +e + f;
if (a != 5)
{
bar(
Hi, Joseph,
I modified gcc/doc/extend.texi per your suggestion as follows:
Let me know if you have further comments or suggestions on this patch.
I will send out the V8 of the patch after some testing.
Thanks.
Qing.
diff --git a/gcc/doc/exten
On Wed, May 24, 2023 at 2:03 AM Richard Biener via Gcc-patches
wrote:
>
> On Wed, May 24, 2023 at 1:16 AM Andrew Pinski via Gcc-patches
> wrote:
> >
> > While trying to understand how to use the ! operand for match
> > patterns, I noticed that the debug dumps would print out applying
> > a patter
Hi, Richard. After fixing the code, I think the IR is now correct:
loop_len_34 = MIN_EXPR ;
_74 = loop_len_34 * 2;
loop_len_48 = MIN_EXPR <_74, 4>;
_75 = _74 - loop_len_48;
loop_len_49 = MIN_EXPR <_75, 4>;
_76 = _75 - loop_len_49;
loop_len_50 = MIN_EXPR <_76, 4>;
loop_len_51 = _76 - loop_len
On 5/24/23 10:20 AM, Carl Love wrote:
> Extending the builtin to pre Power 9 is straight forward and I agree
> would make good sense to do.
>
> I am a bit concerned on how to extend __builtin_set_fpscr_rn to add the
> new functionality. Peter suggests overloading the builtin to either
> return vo
> On 24.05.2023 at 16:21, Alexander Monakov wrote:
>
>
>> On Wed, 24 May 2023, Richard Biener wrote:
>>> On Wed, May 24, 2023 at 2:54 PM Alexander Monakov via Gcc-patches
>>> wrote:
>>> Explicitly say that bitwise shifts for narrow types work similar to
>>> element-wise C shifts with integ
On Wed, 24 May 2023 at 16:06, Matthias Kretz via Libstdc++ <
libstd...@gcc.gnu.org> wrote:
> OK for master and backports? (also a long-standing bug that didn't surface
> until the new constexpr test was added)
>
OK for all
>
> tested on powerpc64le-linux-gnu
>
> - 8< ---
Hi, for the first piece of code, I tried:
unsigned int nitems_per_iter
= dest_rgm->max_nscalars_per_iter * dest_rgm->factor;
step = gimple_build (seq, MULT_EXPR, iv_type, step,
build_int_cst (iv_type, nitems_per_iter));
Then optimized IR:
loop_len_34 = MIN_EXPR ;
_
钟居哲 writes:
> Oh, I see. Thank you so much for pointing this out.
> Could you tell me what I should do in the code?
> It seems that I should adjust it in
> vect_adjust_loop_lens_control
>
> Multiply by some factor? Is it correct to multiply by max_nscalars_per_iter
> ?
max_nscalars_per_iter * factor
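As a rough model of what that scaling means (plain C arithmetic, not vectorizer code): the number of scalar iterations handled in one vector iteration is multiplied by the items per iteration, and the product is then handed out to the length controls one vector at a time. The function name and parameters below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

static size_t min_size(size_t a, size_t b)
{
    return a < b ? a : b;
}

/* Distribute SCALAR_ITERS * NITEMS_PER_ITER items over NCONTROLS length
   controls, each covering at most LANES items.  NITEMS_PER_ITER plays the
   role of max_nscalars_per_iter * factor from the discussion. */
static void split_lengths(size_t scalar_iters, size_t nitems_per_iter,
                          size_t lanes, size_t ncontrols, size_t *lens)
{
    assert(lens != NULL);
    size_t remaining = scalar_iters * nitems_per_iter;
    for (size_t i = 0; i < ncontrols; i++) {
        lens[i] = min_size(remaining, lanes);  /* like MIN_EXPR <remaining, lanes> */
        remaining -= lens[i];
    }
}
```

For example, 5 scalar iterations with 2 items each, spread over four 4-lane controls, give the lengths 4, 4, 2 and 0.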
Oh, I see. Thank you so much for pointing this out.
Could you tell me what I should do in the code?
It seems that I should adjust it in
vect_adjust_loop_lens_control
Multiply by some factor? Is it correct to multiply by max_nscalars_per_iter
?
Thanks.
juzhe.zh...@rivai.ai
From: Richard Sandiford
钟居哲 writes:
> Hi, Richard. I still don't understand it. Sorry about that.
>
>>> loop_len_48 = MIN_EXPR ;
> >> _74 = loop_len_34 * 2 - loop_len_48;
>
> I have already run the tests.
> We have a MIN_EXPR to calculate the total elements:
> loop_len_34 = MIN_EXPR ;
> I think "8" is already mul
On 24.05.23 at 11:38, Richard Biener wrote:
On Tue, May 23, 2023 at 2:56 PM Georg-Johann Lay wrote:
PR target/104327 not only affects s390 but also avr:
The avr backend pre-sets some options depending on optimization level.
The inliner then thinks that always_inline functions are not eligi
Hi, Richard. I still don't understand it. Sorry about that.
>> loop_len_48 = MIN_EXPR ;
>> _74 = loop_len_34 * 2 - loop_len_48;
I have already run the tests.
We have a MIN_EXPR to calculate the total elements:
loop_len_34 = MIN_EXPR ;
I think "8" is already multiplied by 2?
Why do we n
Hi, Richard.
After analyzing it, I think it can work.
Let's take a look at the code:
void f() {
  for (int i = 0, j = 0; i < 100; i += 2, j += 4) {
    x[i + 0] += 1;
    x[i + 1] += 2;
    y[j + 0] += 1;
    y[j + 1] += 2;
    y[j + 2] += 3;
    y[j + 3] += 4;
  }
}
For "x", each scalar iteration
钟居哲 writes:
> Hi, the .optimized dump is like this:
>
>[local count: 21045336]:
> ivtmp.26_36 = (unsigned long) &x;
> ivtmp.27_3 = (unsigned long) &y;
> ivtmp.30_6 = (unsigned long) &MEM [(void *)&y + 16B];
> ivtmp.31_10 = (unsigned long) &MEM [(void *)&y + 32B];
> ivtmp.32_14 = (u
On Wed, 2023-05-24 at 13:32 +0800, Kewen.Lin wrote:
> on 2023/5/24 06:30, Peter Bergner wrote:
> > On 5/23/23 12:24 AM, Kewen.Lin wrote:
> > > on 2023/5/23 01:31, Carl Love wrote:
> > > > The builtins were requested for use in GLibC. As of version
> > > > 2.31 they
> > > > were added as inline asm
Hi, the .optimized dump is like this:
[local count: 21045336]:
ivtmp.26_36 = (unsigned long) &x;
ivtmp.27_3 = (unsigned long) &y;
ivtmp.30_6 = (unsigned long) &MEM [(void *)&y + 16B];
ivtmp.31_10 = (unsigned long) &MEM [(void *)&y + 32B];
ivtmp.32_14 = (unsigned long) &MEM [(void *
Thanks for trying it. I'm still surprised that no multiplication
is needed though. Does the patch work for:
short x[100];
int y[200];
void f() {
for (int i = 0, j = 0; i < 100; i += 2, j += 4) {
x[i + 0] += 1;
x[i + 1] += 2;
y[j + 0] += 1;
y[j + 1] += 2;
y[j + 2] += 3;
OK for master and backports? (also a long-standing bug that didn't surface
until the new constexpr test was added)
tested on powerpc64le-linux-gnu
- 8< -
Signed-off-by: Matthias Kretz
libstdc++-v3/ChangeLog:
PR libstdc++/109949
* include/experiment
Yeah. Thanks. I have sent V14:
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/619478.html
in which I found there is no distinction between SLP and non-SLP.
Could you review it? I think it's more reasonable now.
Thanks.
juzhe.zh...@rivai.ai
From: Richard Sandiford
Date: 2023-05-24 22:57
To:
钟居哲 writes:
>>> Both approaches are fine. I'm not against one or the other.
>
>>> What I didn't understand was why your patch only reuses existing IVs
>>> for max_nscalars_per_iter == 1. Was it to avoid having to do a
>>> multiplication (well, really a shift left) when moving from one
>>> rgroup
On Wed, 2023-05-24 at 18:07 +0800, Lulu Cheng wrote:
>
> On 2023/5/24 at 5:25 PM, Xi Ruoyao wrote:
> > On Wed, 2023-05-24 at 16:47 +0800, Lulu Cheng wrote:
> > > On 2023/5/24 at 2:45 PM, Xi Ruoyao wrote:
> > > > On Wed, 2023-05-24 at 14:04 +0800, Lulu Cheng wrote:
> > > > > An empty struct type that is not non-tr
Forget about V13. Please review V14 directly.
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/619478.html
Thanks.
juzhe.zh...@rivai.ai
From: juzhe.zhong
Date: 2023-05-24 22:29
To: gcc-patches
CC: richard.sandiford; rguenther; Ju-Zhe Zhong
Subject: [PATCH V13] VECT: Add decrement IV iterat
From: Ju-Zhe Zhong
This patch supports decrement IV by following the flow designed by Richard:
(1) In vect_set_loop_condition_partial_vectors, for the first iteration of:
call vect_set_loop_controls_directly.
(2) vect_set_loop_controls_directly calculates "step" as in your patch.
If rg
> But nobody is going to understand why the INTEGER_CST case goes the
> other way.
I can add a fat comment to that effect of course. :-)
> As you say we don't have a good way to say we're doing
> this to avoid undefined behavior, but then a view-convert back would
> be a good way to indicate that
Hi, Richard. I have sent V13:
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/619475.html
It looks more reasonable now.
Could you continue reviewing it?
Thanks.
juzhe.zh...@rivai.ai
From: Richard Sandiford
Date: 2023-05-24 22:01
To: 钟居哲
CC: gcc-patches; rguenther
Subject: Re: [PATCH V12
From: Ju-Zhe Zhong
This patch supports decrement IV by following the flow designed by Richard:
(1) In vect_set_loop_condition_partial_vectors, for the first iteration of:
call vect_set_loop_controls_directly.
(2) vect_set_loop_controls_directly calculates "step" as in your patch.
If rg
Oh, I just realized the flow you designed works well for vec_pack_trunk too.
I will send the V13 patch soon.
Thanks.
juzhe.zh...@rivai.ai
From: 钟居哲
Date: 2023-05-24 22:10
To: richard.sandiford
CC: gcc-patches; rguenther
Subject: Re: Re: [PATCH V12] VECT: Add decrement IV iteration loop control
Bernhard,
Thanks a lot for your comments.
> On May 19, 2023, at 7:11 PM, Bernhard Reutner-Fischer
> wrote:
>
> On Fri, 19 May 2023 20:49:47 +
> Qing Zhao via Gcc-patches wrote:
>
>> GCC extension accepts the case when a struct with a flexible array member
>> is embedded into another stru
On Wed, 24 May 2023, Richard Biener wrote:
> On Wed, May 24, 2023 at 2:54 PM Alexander Monakov via Gcc-patches
> wrote:
> >
> > Explicitly say that bitwise shifts for narrow types work similar to
> > element-wise C shifts with integer promotions, which coincides with
> > OpenCL semantics.
>
>
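The scalar behaviour that the quoted text compares against can be sketched in plain C. This only illustrates C's integer promotions applied element by element to 8-bit values; the helper name is hypothetical, and the sketch says nothing beyond the quoted description about how GCC's vector extension is actually documented or implemented.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Shift each 8-bit element left by COUNT the way scalar C does it:
   the element is promoted to int, shifted, and truncated back to 8 bits.
   A count of 8 or more is therefore well defined and yields 0. */
static void shl_u8_elementwise(uint8_t *elts, size_t n, unsigned count)
{
    /* Keep the promoted result representable in int. */
    assert(count < sizeof(int) * CHAR_BIT - 8);
    for (size_t i = 0; i < n; i++)
        elts[i] = (uint8_t)(elts[i] << count);
}
```

With elements 0x81 and 0x01, a shift by 1 gives 0x02 for both (the top bit of 0x81 is dropped on truncation), and a further shift by 9 gives 0 rather than being undefined, because the shift happens at the width of int.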
Also, move vv8qi3 expander to a better place and enable
it with TARGET_MMX_WITH_SSE. Remove handling of V8QImode from
ix86_expand_vecop_qihi2 since all partial QI->HI vector modes expand
via ix86_expand_vecop_qihi_partial.
gcc/ChangeLog:
* config/i386/i386-expand.cc (ix86_expand_vecop_qihi2)
>> Actually, I just want to handle multiple-rgroup for non-SLP here, I am trying
>> to avoid multiplication and I think
>> scalar multiplication (not cost too much) is fine in modern CPU.
Sorry for the typo. I wasn't trying to avoid multiplication, and I think
multiplication is fine.
juzhe.zh.
Hello,
On Wed, May 17 2023, Aldy Hernandez wrote:
> This patch encapsulates the ipa_vr internals into an API. It also
> makes it type agnostic, in preparation for upcoming changes to IPA.
>
> Interestingly, there's a 0.44% improvement to IPA-cp, which I'm sure
> we'll soak up with future changes
>> Both approaches are fine. I'm not against one or the other.
>> What I didn't understand was why your patch only reuses existing IVs
>> for max_nscalars_per_iter == 1. Was it to avoid having to do a
>> multiplication (well, really a shift left) when moving from one
>> rgroup to another? E.g.
钟居哲 writes:
>>> In other words, why is this different from what
>>>vect_set_loop_controls_directly would do?
> Oh, I see. You are wondering why I do not do the multiple-rgroup vec_trunk
> handling inside "vect_set_loop_controls_directly".
>
> Well. Frankly, I just replicate the handling of ARM
Hi all,
Continuing the series of straightforward annotations, this one handles the
normal (not widening or narrowing) vector shifts.
Tests included.
Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.
Pushing to trunk.
Thanks,
Kyrill
gcc/ChangeLog:
PR target/9919