On Wed, 28 Jun 2023, haochen.jiang wrote:
> On Linux/x86_64,
>
> dd86a5a69cbda40cf76388a65d3317c91cb2b501 is the first bad commit
> commit dd86a5a69cbda40cf76388a65d3317c91cb2b501
> Author: Richard Biener
> Date: Thu Jun 22 11:40:46 2023 +0200
>
> tree-optimization/96208 - SLP of non-grou
On Wed, Jun 28, 2023 at 3:56 AM Hongyu Wang wrote:
>
> > I don't think this is desirable. If we inline something with different
> > ISAs, we get some strange mix of ISAs when the function is inlined.
> > OTOH - we already inline with mismatched tune flags if the function is
> > marked with always_
On Tue, Jun 27, 2023 at 11:31 PM Eugene Rozenfeld via Gcc-patches
wrote:
>
> cc1, cc1plus, and lto built during STAGEautoprofile need to be built with
> debug info since they are used to build target libs. -gtoggle was
> turning off debug info for this stage.
>
> create_gcov should be passed prev-
The difference between v1 and v2 is the compact mask generation:
v1 :
+rtx
+rvv_builder::compact_mask () const
+{
+ /* Use the container mode with SEW = 8 and LMUL = 1. */
+ unsigned container_size
+= MAX (CEIL (npatterns (), 8), BYTES_PER_RISCV_VECTOR.to_constant () / 8);
+ machine_mode
I mean the difference between v1 and v2 patch
On Wed, Jun 28, 2023 at 12:09 PM Jeff Law wrote:
>
>
>
> On 6/27/23 21:16, Kito Cheng wrote:
> > Do you mind giving some comments about what the difference between the
> > two versions?
> And I'd like a before/after assembly code with the example in t
Tested x86_64-pc-linux-gnu, applying to trunk.
-- 8< --
Inherited constructors are like constructor clones; they don't exist from
the language perspective, so they should copy the attributes in the same
way. But it doesn't make sense to copy alias or ifunc attributes in either
case. Unlike hand
Consider the following complicated case:
#define TEST_TYPE(TYPE1, TYPE2)\
__attribute__ ((noipa)) void vwadd_##TYPE1_##TYPE2 ( \
TYPE1 *__restrict dst, TYPE1 *__restrict dst2, TYPE1 *__restrict dst3, \
TYPE1 *__res
On 6/27/23 21:16, Kito Cheng wrote:
Do you mind giving some comments about what the difference between the
two versions is?
And I'd like a before/after assembly code with the example in the commit
message. I didn't see the same behavior when I tried it earlier today
and ran out of time to dig
Tested x86_64-pc-linux-gnu, applying to trunk.
-- 8< --
P2768 allows static_cast from void* to ob* in constant evaluation if the
pointer does in fact point to an object of the appropriate type.
cxx_fold_indirect_ref already does the work of finding such an object if it
happens to be a subobject r
Tested x86_64-pc-linux-gnu, applying to trunk.
-- 8< --
As with c++23, we want to run { target c++26 } tests even though it isn't
part of the default std_list.
C++17 with Concepts TS is no longer an interesting target configuration.
And bump the impcx target to use C++26 mode instead of 23.
gc
On Wed, Jun 28, 2023 at 3:32 AM Roger Sayle wrote:
>
>
> Doh! Wrong patch...
> Roger
> --
>
> From: Roger Sayle
> Sent: 27 June 2023 20:28
> To: 'gcc-patches@gcc.gnu.org'
> Cc: 'Uros Bizjak' ; 'Hongtao Liu'
> Subject: [x86 PATCH] Tweak ix86_expand_int_compare to use PTEST for vector
> equality.
I have commented in commit log:
before this patch:
The mask is:
.LC1:
.byte 68 -> 0b01000100
However, this is incorrect for RVV since RVV always uses 1-bit compact mask,
now after this patch:
.LC1:
.byte 10 -> 0b1010
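A minimal sketch of how the two byte values can arise. This is not the code from the patch: the helper name pack_mask, the 4-element mask {0, 1, 0, 1}, and the guess that the old constant gave each element a 2-bit field are all assumptions made only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: pack a 0/1 mask into bytes, giving each element a
   field of bits_per_elt bits and setting only the lowest bit of the field
   for active elements.  Bits that do not fit in out_len bytes are dropped.  */
static void
pack_mask (const uint8_t *mask, size_t n, unsigned bits_per_elt,
           uint8_t *out, size_t out_len)
{
  for (size_t i = 0; i < out_len; ++i)
    out[i] = 0;

  for (size_t i = 0; i < n; ++i)
    if (mask[i])
      {
        size_t bit = i * bits_per_elt;
        if (bit / 8 < out_len)
          out[bit / 8] |= (uint8_t) (1u << (bit % 8));
      }
}
```

Under those assumptions, a 2-bit field per element yields 0b01000100 (68), while the 1-bit compact layout yields 0b1010 (10).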
juzhe.zh...@rivai.ai
From: Kito Cheng
Date: 2023-
Do you mind giving some comments about what the difference between the
two versions is?
On Wed, Jun 28, 2023 at 11:14 AM juzhe.zh...@rivai.ai
wrote:
>
> This patch is the critical patch for following patches since it is a bug
> which I already address in rvv-next.
>
> __
LGTM with a minor comment.
> Currently, vfwadd.wv is the pattern with (set (reg) (float_extend:(reg))
> which makes
it's minor so you can just go commit after the fix: this should be
(set (plus (reg) (float_extend:(reg)))
> combine pass fail to combine.
>
> change RTL format of vfwadd.wv -
This patch is critical for the following patches since it fixes a bug which
I have already addressed in rvv-next.
juzhe.zh...@rivai.ai
From: Juzhe-Zhong
Date: 2023-06-28 09:59
To: gcc-patches
CC: kito.cheng; kito.cheng; palmer; palmer; jeffreyalaw; rdapp.gcc; Juzhe-Zhong
Subject: [PATCH V2] RISC-
Currently, vfwadd.wv is the pattern with (set (reg) (float_extend:(reg)) which
makes the combine pass fail to combine.
Change the RTL format of vfwadd.wv --> (set (float_extend:(reg) (reg)) so that
the combine pass can combine.
gcc/ChangeLog:
* config/riscv/riscv-vector-builtins-bases.cc: Ada
It seems this is because of the canonical form of RTL, right?
LGTM, but plz add some more comments about the reason into the commit log.
On Wed, Jun 28, 2023 at 11:00 AM Juzhe-Zhong wrote:
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-vector-builtins-bases.cc: Adapt expand.
> * config/riscv/ve
gcc/ChangeLog:
* config/riscv/riscv-vector-builtins-bases.cc: Adapt expand.
* config/riscv/vector.md (@pred_single_widen_):
Remove.
(@pred_single_widen_add): New pattern.
(@pred_single_widen_sub): New pattern.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/r
The testcase fails with a --with-arch=native build on cascadelake; here
is the patch to adjust it.
gcc/testsuite/ChangeLog:
* gcc.target/i386/mvc17.c: Add -march=x86-64 to dg-options.
---
gcc/testsuite/gcc.target/i386/mvc17.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/gcc/t
This bug blocks the following patches.
GCC doesn't know RVV is using compact mask model.
Consider the following case:
#define N 16
int
main ()
{
int8_t mask[N] = {0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1};
int8_t out[N] = {0};
for (int8_t i = 0; i < N; ++i)
if (mask[i])
ou
> I don't think this is desirable. If we inline something with different
> ISAs, we get some strange mix of ISAs when the function is inlined.
> OTOH - we already inline with mismatched tune flags if the function is
> marked with always_inline.
Previously ix86_can_inline_p has
if (((caller_opts->
On Tue, Jun 27, 2023 at 8:56 AM Robin Dapp via Gcc-patches
wrote:
>
> > You can put it into the original one.
>
> Bootstrap and testsuite run were successful.
> I'm going to push the attached, thanks.
I am reducing a bug report which I think will be fixed by this change
(PR 110444). I will double
On Linux/x86_64,
dd86a5a69cbda40cf76388a65d3317c91cb2b501 is the first bad commit
commit dd86a5a69cbda40cf76388a65d3317c91cb2b501
Author: Richard Biener
Date: Thu Jun 22 11:40:46 2023 +0200
tree-optimization/96208 - SLP of non-grouped loads
caused
FAIL: gcc.dg/vect/slp-46.c -flto -ffat-l
Hi, Alexandre,
Thanks a lot for the work. I think that this will be a valuable feature to be
added for GCC’s security functionality.
I have several questions on this patch:
1. The implementation of register scrubbing, -fzero-call-used-regs, is to
insert the register zeroing sequence in th
cc1, cc1plus, and lto built during STAGEautoprofile need to be built with
debug info since they are used to build target libs. -gtoggle was
turning off debug info for this stage.
create_gcov should be passed prev-gcc/cc1, prev-gcc/cc1plus, and prev-gcc/lto
instead of stage1-gcc/cc1, stage1-gcc/cc1
On Tue, Jun 27, 2023 at 7:22 PM Roger Sayle wrote:
>
>
> This patch fixes some very odd (unanticipated) code generation by
> compare_by_pieces with -m32 -mavx, since the recent addition of the
> cbranchoi4 pattern. The issue is that cbranchoi4 is available with
> TARGET_AVX, but cbranchti4 is cur
On Tue, Jun 27, 2023 at 8:40 PM Roger Sayle wrote:
>
>
> This patch fixes the FAIL of gcc.target/i386/pr78794.c on ia32, which
> is caused by minor STV rtx_cost differences with -march=silvermont.
> It turns out that generic tuning results in pandn, but the lack of
> accurate parameterization for
Doh! Wrong patch...
Roger
--
From: Roger Sayle
Sent: 27 June 2023 20:28
To: 'gcc-patches@gcc.gnu.org'
Cc: 'Uros Bizjak' ; 'Hongtao Liu'
Subject: [x86 PATCH] Tweak ix86_expand_int_compare to use PTEST for vector
equality.
Hi Uros,
Hopefully Hongtao will approve my patch to support SUBREG co
Hi Uros,
Hopefully Hongtao will approve my patch to support SUBREG conversions
in STV https://gcc.gnu.org/pipermail/gcc-patches/2023-June/622706.html
but for some of the examples described in the above post (and its test
case), I've also come up with an alternate/complementary/supplementa
Hi Paul,
this is much better now.
I have only a minor comment left: in the calculation of the
size of a character string you are using an intermediate
gfc_array_index_type, whereas I have learned to use
gfc_charlen_type_node now, which seems like the natural
type here.
OK for trunk, and thanks
On Wed, 28 Jun 2023 at 00:05, Richard Sandiford
wrote:
>
> Prathamesh Kulkarni writes:
> > Hi Richard,
> > Sorry I forgot to commit this patch, which you had approved in:
> > https://gcc.gnu.org/pipermail/gcc-patches/2023-April/615308.html
> >
> > Just for context for the following test:
> > svin
On 6/27/23 1:52 PM, Pat Haugen via Gcc-patches wrote:
Updated from prior version to address review comments (update
rs6000_rtx_cost, update scan strings of mod-1.c/mod-2.c).
Disable generation of scalar modulo instructions.
It was recently discovered that the scalar modulo instructions can su
Updated from prior version to address review comments (update
rs6000_rtx_cost, update scan strings of mod-1.c/mod-2.c).
Disable generation of scalar modulo instructions.
It was recently discovered that the scalar modulo instructions can suffer
noticeable performance issues for certain input va
This patch fixes the FAIL of gcc.target/i386/pr78794.c on ia32, which
is caused by minor STV rtx_cost differences with -march=silvermont.
It turns out that generic tuning results in pandn, but the lack of
accurate parameterization for COMPARE in compute_convert_gain combined
with small differences
Prathamesh Kulkarni writes:
> Hi Richard,
> Sorry I forgot to commit this patch, which you had approved in:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-April/615308.html
>
> Just for context for the following test:
> svint32_t f_s32(int32x4_t x)
> {
> return svdupq_s32 (x[0], x[1], x[2], x[
On Jun 22, 2023, at 10:35 PM, Alexandre Oliva wrote:
>
> This patch documents a glitch in gcc.misc-tests/outputs.exp: it checks
> whether the linker is GNU ld, and uses that to decide whether to
> expect collect2 to create .ld1_args files under -save-temps, but
> collect2 bases that decision on w
On 6/27/23 12:24, Jan Hubicka wrote:
On 6/27/23 09:19, Jan Hubicka wrote:
Hi,
as shown in the testcase (which would eventually be useful for
optimizing std::vector's push_back), ipa-prop can use context dependent ranger
queries for better value range info.
Bootstrapped/regtested x86_64-linux,
On Tue, Jun 27, 2023 at 12:14 AM Richard Biener via Gcc-patches
wrote:
>
> On Tue, Jun 27, 2023 at 5:26 AM Andrew Pinski via Gcc-patches
> wrote:
> >
> > The manual references asm goto as being implicitly volatile already
> > and that was done when asm goto could not have outputs. When outputs
>
This patch fixes some very odd (unanticipated) code generation by
compare_by_pieces with -m32 -mavx, since the recent addition of the
cbranchoi4 pattern. The issue is that cbranchoi4 is available with
TARGET_AVX, but cbranchti4 is currently conditional on TARGET_64BIT
which results in the odd beh
> Arg, once again, I'm sorry. I don't know how this happened. It would
> be trivial to fix it but since
>
> commit 4a48a38fa99f067b8f3a3d1a5dc7a1e602db351f
> Author: Eric Botcazou
> Date: Wed Jun 21 18:19:36 2023 +0200
>
> ada: Fix build of GNAT tools
>
> the build with Ada included fai
>
> On 6/27/23 09:19, Jan Hubicka wrote:
> > Hi,
> > as shown in the testcase (which would eventually be useful for
> > optimizing std::vector's push_back), ipa-prop can use context dependent
> > ranger
> > queries for better value range info.
> >
> > Bootstrapped/regtested x86_64-linux, OK?
>
GCC's documentation mentions that all basic asm blocks are always volatile,
yet the parser fails to account for this by only ever setting volatile_p to
true if the volatile qualifier is found. This patch fixes this by adding a
special case check for extended_p before finish_asm_stmt is called.
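For readers unfamiliar with the distinction, here is a small sketch of the three forms involved. The function name bump is hypothetical and the asm bodies are deliberately empty; it only illustrates the syntax, not the parser change itself.

```c
/* Returns x + 1, passing through three asm statements on the way.  */
static int
bump (int x)
{
  /* Basic asm: no operand sections, documented as implicitly volatile.  */
  __asm__ ("");

  /* Extended asm with an output operand and no qualifier: not volatile,
     so the compiler may delete it if the output is unused.  */
  __asm__ ("" : "+r" (x));

  /* Extended asm made volatile explicitly.  */
  __asm__ volatile ("" : "+r" (x));

  return x + 1;
}
```

The first statement is the case the message describes: no volatile keyword appears, yet the statement is supposed to be treated as volatile.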
>F
> You can put it into the original one.
Bootstrap and testsuite run were successful.
I'm going to push the attached, thanks.
Regards
Robin
diff --git a/gcc/match.pd b/gcc/match.pd
index 33ccda3e7b6..83bcefa914b 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -7454,10 +7454,12 @@ DEFINE_INT_AND_
> On 27 Jun 2023, at 16:31, Marek Polacek via Gcc-patches
> wrote:
>
> On Tue, Jun 27, 2023 at 01:39:16PM +0200, Martin Jambor wrote:
>> Hello,
>>
>> On Tue, May 16 2023, Marek Polacek via Gcc-patches wrote:
>>> As promised in the --enable-host-pie patch, this patch adds another
>>> configur
Hi,
Based on the discussion so far and further consideration, the following is my
plan for this new attribute:
1. The syntax of the new attribute will be:
__attribute__((counted_by (count_field_id)));
In the above, count_field_id is the identifier for the field that carries the
number
of el
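A rough sketch of how the proposed syntax might be used on a flexible array member. The struct int_buf, the helper int_buf_new, and the COUNTED_BY wrapper macro are hypothetical names; whether a given compiler accepts the attribute is an assumption, so the macro expands to nothing when __has_attribute does not report it.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical wrapper: use the attribute only where the compiler
   reports support for it, otherwise expand to nothing.  */
#if defined (__has_attribute)
# if __has_attribute (counted_by)
#  define COUNTED_BY(f) __attribute__ ((counted_by (f)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(f)
#endif

struct int_buf
{
  size_t count;                  /* Number of elements in data.  */
  int data[] COUNTED_BY (count); /* Flexible array member.  */
};

/* Allocate a zero-filled buffer of n elements, setting count before
   any element of data is touched.  Returns NULL on allocation failure.  */
static struct int_buf *
int_buf_new (size_t n)
{
  struct int_buf *b = malloc (sizeof *b + n * sizeof b->data[0]);
  if (b)
    {
      b->count = n;
      for (size_t i = 0; i < n; ++i)
        b->data[i] = 0;
    }
  return b;
}
```

Here count plays the role of count_field_id from the syntax above.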
On Tue, Jun 27, 2023 at 01:39:16PM +0200, Martin Jambor wrote:
> Hello,
>
> On Tue, May 16 2023, Marek Polacek via Gcc-patches wrote:
> > As promised in the --enable-host-pie patch, this patch adds another
> > configure option, --enable-host-bind-now, which adds -z now when linking
> > the compile
On Tue, Jun 27 2023, Jan Hubicka wrote:
> Hi,
> as shown in the testcase (which would eventually be useful for
> optimizing std::vector's push_back), ipa-prop can use context dependent ranger
> queries for better value range info.
>
> Bootstrapped/regtested x86_64-linux, OK?
>
> Honza
>
> gcc/Chang
GCC doesn't know RVV is using the compact mask model.
Consider the following case:
#define N 16
int
main ()
{
int8_t mask[N] = {0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1};
int8_t out[N] = {0};
for (int8_t i = 0; i < N; ++i)
if (mask[i])
out[i] = i;
for (int8_t i = 0; i < N; +
On 6/27/23 09:19, Jan Hubicka wrote:
Hi,
as shown in the testcase (which would eventually be useful for
optimizing std::vector's push_back), ipa-prop can use context dependent ranger
queries for better value range info.
Bootstrapped/regtested x86_64-linux, OK?
Quick question.
When you call
Hi,
as shown in the testcase (which would eventually be useful for
optimizing std::vector's push_back), ipa-prop can use context dependent ranger
queries for better value range info.
Bootstrapped/regtested x86_64-linux, OK?
Honza
gcc/ChangeLog:
PR middle-end/110377
* ipa-prop.cc
From: Eric Botcazou
This may cause the type of the RESULT_DECL of a function which returns by
invisible reference to be turned into a reference type twice.
gcc/ada/
* gcc-interface/trans.cc (Subprogram_Body_to_gnu): Add guard to the
code turning the type of the RESULT_DECL into
From: Eric Botcazou
gcc/ada/
* gcc-interface/trans.cc (Case_Statement_to_gnu): Rename boolean
constant and use From_Conditional_Expression flag for its value.
Tested on x86_64-pc-linux-gnu, committed on master.
---
gcc/ada/gcc-interface/trans.cc | 8 +++-
1 file changed, 3
From: Eric Botcazou
Unlike for loop parameter specifications where it references an index, the
defining identifier references an element in them.
gcc/ada/
* sem_ch12.adb (Check_Generic_Actuals): Check the component type
of constants and variables of an array type.
(Copy_
From: Eric Botcazou
This streamlines the expansion of case expressions by not wrapping them in
an Expression_With_Actions node when the type is not by copy, which avoids
the creation of a temporary and the associated finalization issues.
That's the same strategy as the one used for the expansion
From: Eric Botcazou
Sem_Ch5 contains an entire machinery to deal with finalization actions and
secondary stack releases around iterator loops, so this removes a recent
fix that was made in a narrower case and instead refines the condition under
which this machinery is triggered.
As a side effect
From: Viljar Indus
All N_Aggregate nodes were printed with parentheses "()". However
the new container aggregates (homogeneous N_Aggregate nodes) should
be printed with brackets "[]".
gcc/ada/
* sprint.adb (Print_Node_Actual): Print homogeneous N_Aggregate
nodes with brackets.
From: Claire Dross
Item might not be entirely initialized after a call to Get_Line.
gcc/ada/
* libgnat/a-textio.ads (Get_Line): Use Relaxed_Initialization on
the Item parameter of Get_Line.
Tested on x86_64-pc-linux-gnu, committed on master.
---
gcc/ada/libgnat/a-textio.ads |
From: Eric Botcazou
This deals with nested instantiations in package bodies.
gcc/ada/
* sem_ch12.adb (Scope_Within_Body_Or_Same): New predicate.
(Check_Actual_Type): Take into account packages nested in bodies
to compute the enclosing scope by means of
Scope_With
From: Eric Botcazou
This deals with discriminants of types declared in package bodies.
gcc/ada/
* sem_ch12.adb (Check_Private_View): Also check the type of
visible discriminants in record and concurrent types.
Tested on x86_64-pc-linux-gnu, committed on master.
---
gcc/ada/se
From: Viljar Indus
Ensure that container aggregate expressions are expanded as
such and not as records even if the type of the expression is a
record.
gcc/ada/
* exp_aggr.adb (Expand_N_Aggregate): Ensure that container
aggregate expressions do not get expanded as records bu
Hi Richard,
Sorry I forgot to commit this patch, which you had approved in:
https://gcc.gnu.org/pipermail/gcc-patches/2023-April/615308.html
Just for context for the following test:
svint32_t f_s32(int32x4_t x)
{
return svdupq_s32 (x[0], x[1], x[2], x[3]);
}
-O3 -mcpu=generic+sve generates foll
Hello,
On Tue, May 16 2023, Marek Polacek via Gcc-patches wrote:
> As promised in the --enable-host-pie patch, this patch adds another
> configure option, --enable-host-bind-now, which adds -z now when linking
> the compiler executables in order to extend hardening. BIND_NOW with RELRO
> allows t
Hi Harald,
Let's try again :-)
OK for trunk?
Regards
Paul
Fortran: Enable class expressions in structure constructors [PR49213]
2023-06-27 Paul Thomas
gcc/fortran
PR fortran/49213
* expr.cc (gfc_is_ptr_fcn): Remove reference to class_pointer.
* resolve.cc (resolve_assoc_var): Call gfc_is_
On Tue, 27 Jun 2023, Robin Dapp wrote:
> > so I suggest to do a similar VECTOR_MODE_P check and your original test.
> > So
> >
> > && (!VECTOR_MODE_P (TYPE_MODE (newtype))
> > || target_supports_op_p (newtype, op, optab_default))
> >
> > OK with that change.
>
> Separate patch o
> so I suggest to do a similar VECTOR_MODE_P check and your original test.
> So
>
> && (!VECTOR_MODE_P (TYPE_MODE (newtype))
> || target_supports_op_p (newtype, op, optab_default))
>
> OK with that change.
Separate patch or into the original one? We needed element_mode because
T
On Tue, 27 Jun 2023, Robin Dapp wrote:
> > Yeah, the optab should already have the fallback of WIDENing here?
> > So why does that fail?
>
> We reach
> if (CLASS_HAS_WIDER_MODES_P (mclass))
> which returns false because mclass == MODE_VECTOR_FLOAT.
> CLASS_HAS_WIDER_MODES_P only handles non-vect
On Tue, Jun 27, 2023 at 11:45:33AM +0200, Richard Biener wrote:
> The following makes sure that using TYPE_PRECISION on VECTOR_TYPE
> ICEs when tree checking is enabled. This should avoid wrong-code
> in cases like PR110182 and instead ICE.
>
> It also introduces a TYPE_PRECISION_RAW accessor and
The following makes sure that using TYPE_PRECISION on VECTOR_TYPE
ICEs when tree checking is enabled. This should avoid wrong-code
in cases like PR110182 and instead ICE.
It also introduces a TYPE_PRECISION_RAW accessor and adjusts
places I found that are eligible to use that.
Bootstrapped and t
> Yeah, the optab should already have the fallback of WIDENing here?
> So why does that fail?
We reach
if (CLASS_HAS_WIDER_MODES_P (mclass))
which returns false because mclass == MODE_VECTOR_FLOAT.
CLASS_HAS_WIDER_MODES_P only handles non-vector classes?
Same for FOR_EACH_WIDER_MODE that follows.
On Mon, Jun 26, 2023 at 4:36 AM Hongyu Wang wrote:
>
> Hi,
>
> For functions with different target attributes, the current logic refuses to
> inline the callee when any arch or tune is mismatched. Relax the
> condition to honor just prefer_vector_width_type and other flags that
> may cause safety issue
Thanks so much! Richi, I am going to open a bug report so that I won't forget this issue.
And I am going to go ahead with LEN_MASK_GATHER_LOAD/LEN_MASK_SCATTER_STORE:
This is the first patch:
https://gcc.gnu.org/pipermail/gcc-patches/2023-June/622824.html
which is only adding optabs && internal_fn and documents (N
On Tue, 27 Jun 2023, juzhe.zh...@rivai.ai wrote:
> Hi, Richi. After reading your emails.
>
> Is that correct that I put supporting LEN_MASK_STORE into SCCVN on hold for
> now ?
>
> Go ahead to the next RVV auto-vectorization support patterns in
> middle-end (for example I sent to add optabs a
Hi,
I am sending a revised patch, now with different tests for N64/N32 and O32 ABIs.
For the O32 ABI, I've skipped the -O0 and -Os pipelines, considering there is a
difference between exact offsets for store instructions (the registers used
remain
the same).
Skipping -flto isn't really necessary,
Hi, Richi. After reading your emails:
Is it correct that I should put LEN_MASK_STORE support in SCCVN on hold for
now?
Then go ahead with the next RVV auto-vectorization support patterns in the
middle-end (for example, the patch I sent to add optabs and internal fn for
LEN_MASK_GATHER_LOAD).
Then, after I finish a
On Tue, 27 Jun 2023, Robin Dapp wrote:
> > Why does the expander not have a fallback here? If we put up
> > restrictions like this like we do for vector operations (after
> > vector lowering!), we need to document this. Your check covers
> > more than just FP16 types as well which I think is und
On Tue, 27 Jun 2023, juzhe.zh...@rivai.ai wrote:
> Hi, Richi.
> After several tries, I found a case that is "close to" have CSE opportunity
> in RVV for VL vectors:
>
> void __attribute__((noinline,noclone))
> foo (uint16_t *out, uint16_t *res)
> {
> int mask[] = { 0, 1, 0, 1, 0, 1, 0, 1 };
>
On Tue, 27 Jun 2023, juzhe.zh...@rivai.ai wrote:
> Hi, Richi.
>
> >> Does RVV allow MASK_STORE from
> >> intrinsics?
> No, RVV didn't use any internal_fn in intrinsics.
>
> >>with disabling complete unrolling -fdisable-tree-cunrolli both
> >>loops should get vectorized, hopefully not iterating(?
Hello All:
This patch improves the code sinking pass to sink statements before calls to
reduce register pressure.
Review comments are incorporated.
For example :
void bar();
int j;
void foo(int a, int b, int c, int d, int e, int f)
{
int l;
l = a + b + c + d +e + f;
if (a != 5)
{
bar(
Hi, Richi.
After several tries, I found a case that is "close to" having a CSE opportunity
in RVV for VL vectors:
void __attribute__((noinline,noclone))
foo (uint16_t *out, uint16_t *res)
{
int mask[] = { 0, 1, 0, 1, 0, 1, 0, 1 };
int i;
for (i = 0; i < 8; ++i)
{
if (mask[i])
Hi, Richi.
>> Does RVV allow MASK_STORE from
>> intrinsics?
No, RVV didn't use any internal_fn in intrinsics.
>>with disabling complete unrolling -fdisable-tree-cunrolli both
>>loops should get vectorized, hopefully not iterating(?) Then
>>we end up with the overlapping CSE opportunity, no? May
On Tue, 27 Jun 2023, juzhe.zh...@rivai.ai wrote:
> Hi, Richi.
>
> When I try vector_cst_encoded_nelts (mask), the testcase I append is 2 but
> actual nunits is 8 in that case,
> then I failed to walk all elements analysis.
>
> >> The most simple thing would be to make this all conditiona
Richard Sandiford writes:
>> - VTYPE x, y, out;
>> + VTYPE x, y;
>> + WTYPE out;
>> type diff;
>> loop i in range:
>> S1 diff = x[i] - y[i]
>> S2 out[i] = ABS_EXPR <diff>;
>>
>> - where 'type' is a integer and 'VTYPE' is a vector of integers
>> - the same size as
On Tue, Jun 27, 2023 at 3:28 PM Hongtao Liu wrote:
>
> On Tue, Jun 27, 2023 at 3:20 PM Richard Biener via Gcc-patches
> wrote:
> >
> > On Tue, Jun 27, 2023 at 7:38 AM liuhongt wrote:
> > >
> > > At the rtl level, we cannot guarantee that the maskstore is not optimized
> > > to other full-memory
> Why does the expander not have a fallback here? If we put up
> restrictions like this like we do for vector operations (after
> vector lowering!), we need to document this. Your check covers
> more than just FP16 types as well which I think is undesirable.
I'm not sure I follow. What would we
Hi, Richi.
When I try vector_cst_encoded_nelts (mask), the result for the testcase I
appended is 2, but the actual nunits is 8 in that case,
so I failed to walk all elements in the analysis.
>> The most simple thing would be to make this all conditional
>> to constant TYPE_VECTOR_SUBPARTS. Which is also why I was
On Jun 26 2023, Andrew Pinski via Gcc-patches wrote:
> diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
> index 0e24b915b8f..dc6a00e8bd9 100644
> --- a/gcc/gimplify.cc
> +++ b/gcc/gimplify.cc
> @@ -6935,7 +6935,12 @@ gimplify_asm_expr (tree *expr_p, gimple_seq *pre_p,
> gimple_seq *post_p)
>
On Tue, 27 Jun 2023, juzhe.zh...@rivai.ai wrote:
> From: Ju-Zhe Zhong
>
> Hi, Richi.
>
> I tried to understand your last email and to refactor the do-while loop using
> VECTOR_CST_NELTS.
>
> This patch works fine for LEN_MASK_STORE and compiler can CSE redundant store.
> I have appended testc
On Tue, Jun 27, 2023 at 3:20 PM Richard Biener via Gcc-patches
wrote:
>
> On Tue, Jun 27, 2023 at 7:38 AM liuhongt wrote:
> >
> > At the rtl level, we cannot guarantee that the maskstore is not optimized
> > to other full-memory accesses, as the current implementations are equivalent
> > in terms
On Tue, Jun 27, 2023 at 8:30 AM Tejas Belagod wrote:
>
>
>
>
>
> From: Richard Biener
> Date: Monday, June 26, 2023 at 2:23 PM
> To: Tejas Belagod
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [RFC] GNU Vector Extension -- Packed Boolean Vectors
>
> On Mon, Jun 26, 2023 at 8:24 AM Tejas Belagod
On Tue, Jun 27, 2023 at 7:38 AM Richard Sandiford via Gcc-patches
wrote:
>
> I have a patch that adds braced initializers to a GTY structure.
> gengtype didn't accept that, because it parsed the "{ ... }" in
> " = { ... };" as the end of a statement (as "{ ... }" would be in
> a function definitio
On Tue, Jun 27, 2023 at 7:38 AM liuhongt wrote:
>
> At the rtl level, we cannot guarantee that the maskstore is not optimized
> to other full-memory accesses, as the current implementations are equivalent
> in terms of pattern, to solve this potential problem, this patch refines
> the pattern of t
On Tue, Jun 27, 2023 at 5:26 AM Andrew Pinski via Gcc-patches
wrote:
>
> The manual references asm goto as being implicitly volatile already
> and that was done when asm goto could not have outputs. When outputs
> were added to `asm goto`, only asm goto without outputs were still being
> marked as
On Tue, 27 Jun 2023, Robin Dapp wrote:
> > Can you push the element_mode change separately please?
>
> OK.
>
> > I'd like to hear more reasoning of why target_supports_op_p is wanted
> > here. Doesn't target_supports_op_p return false if this is for example
> > a soft-fp target? So if at all,
Ack, thanks Juzhe.
Pan
From: juzhe.zh...@rivai.ai
Sent: Tuesday, June 27, 2023 3:00 PM
To: Li, Pan2 ; gcc-patches
Cc: Kito.cheng ; Li, Pan2 ; Wang,
Yanzhang ; jeffreyalaw
Subject: Re: [PATCH v1] RISC-V: Allow rounding mode control for RVV
floating-point add
LGTM.
You can go ahead to impleme