For VMAT_CONTIGUOUS_REVERSE, the transform code in function
vectorizable_store generates a VEC_PERM_EXPR stmt before
storing, but it's never considered in costing.
This patch makes the costing take vec_perm into account. It also
adjusts the order of the transform code a bit to make it easy
to early return fo
This patch finally gets rid of vect_model_store_cost.
It adjusts the costing for the remaining memory access types
VMAT_CONTIGUOUS{, _DOWN, _REVERSE} by moving costing close
to the transform code. Note that in vect_model_store_cost,
there is one special handling for vectorizing a store int
This patch adjusts the cost handling on VMAT_GATHER_SCATTER
in function vectorizable_store (all three cases), so that we
no longer depend on vect_model_store_cost for its costing.
This patch shouldn't have any functional changes.
gcc/ChangeLog:
* tree-vect-stmts.cc (vect_model_store_co
This patch adjusts the cost handling on VMAT_CONTIGUOUS_PERMUTE
in function vectorizable_store. We don't call function
vect_model_store_cost for it any more. It's the case of
interleaving stores, so it skips all stmts except
first_stmt_info and considers the whole group when costing
first_stmt
This costing adjustment patch series exposes an issue in the
aarch64-specific costing adjustment for STP sequences. It
causes the test cases below to fail:
- gcc/testsuite/gcc.target/aarch64/ldp_stp_15.c
- gcc/testsuite/gcc.target/aarch64/ldp_stp_16.c
- gcc/testsuite/gcc.target/aarch64/ldp_stp_
When making/testing patches to move costing next to the
transform code for vectorizable_store, some ICEs got
exposed when I further refined the cost handling for
VMAT_ELEMENTWISE. The apparent cause is an assertion
triggering in the rs6000-specific costing function
rs6000_builtin_vectorization
This patch adjusts the cost handling on VMAT_ELEMENTWISE
and VMAT_STRIDED_SLP in function vectorizable_store. We
don't call function vect_model_store_cost for them any more.
As with the PR82255 improvement on the load side, this change
helps us get rid of unnecessary vec_to_scalar costing
for so
This patch adjusts the cost handling on VMAT_LOAD_STORE_LANES
in function vectorizable_store. We don't call function
vect_model_store_cost for it any more. It's the case of
interleaving stores, so it skips all stmts except
first_stmt_info and considers the whole group when costing
first_stmt_i
This patch simplifies the costing for the
vectorizable_scan_store case so that it no longer calls function
vect_model_store_cost.
I considered whether moving the costing into function
vectorizable_scan_store is a good idea; to do
that, we would have to pass several variables down which
are only used for
This patch series is a follow-up to the previous patch
series for vector load [1]. Some of the associated test cases
show the benefits of this kind of structuring. Like the
one for vector load, this patch series makes costing no longer call
function vect_model_store_cost and instead places it next to the
transform.
This is an initial patch to move costing next to the
transform. It still adopts vect_model_store_cost for costing
but moves and duplicates it down according to the handling
of the different vect_memory_access_types or some special
handling needs; hopefully it makes the subsequent patches easy
to re
This patch series follows Richi's suggestion at the link [1],
which suggests structuring vectorizable_load to make costing
next to the transform, in order to make it easier to keep
costing and the transform in sync. For now, it's a known
issue that what we cost can be inconsistent with what we
tran
This patch adjusts the cost handling on
VMAT_CONTIGUOUS_REVERSE in function vectorizable_load. We
don't call function vect_model_load_cost for it any more.
This change keeps us from miscounting some required vector
permutations, as the associated test case shows.
gcc/ChangeLog:
* tree-vect-st
This patch adjusts the cost handling on VMAT_GATHER_SCATTER
in function vectorizable_load. We don't call function
vect_model_load_cost for it any more.
It's mainly for gather loads with IFN or emulated gather
loads, and it follows the handling in function
vect_model_load_cost. This patch shouldn't
This patch adjusts the cost handling on VMAT_ELEMENTWISE
and VMAT_STRIDED_SLP in function vectorizable_load. We
don't call function vect_model_load_cost for them any more.
As PR82255 shows, we don't always need a vector construction
there; moving costing next to the transform lets us only
cos
This patch adjusts the cost handling on
VMAT_CONTIGUOUS_PERMUTE in function vectorizable_load. We
don't call function vect_model_load_cost for it any more.
As the affected test case gcc.target/i386/pr70021.c shows,
the previous costing can under-cost the total generated
vector loads as for VMAT_C
This patch adjusts the cost handling on VMAT_CONTIGUOUS in
function vectorizable_load. We don't call function
vect_model_load_cost for it any more. It also removes function
vect_model_load_cost, which is now useless and
unreachable.
gcc/ChangeLog:
* tree-vect-stmts.cc (vect_model_load_co
This patch adds one extra argument cost_vec to function
vect_build_gather_load_calls, so that we can do costing
next to the transform in vect_build_gather_load_calls.
For now, the implementation just follows the handling in
vect_model_load_cost, which isn't so good, so this places one
FIXME for any furthe
This patch adjusts the cost handling on VMAT_INVARIANT in
function vectorizable_load. We don't call function
vect_model_load_cost for it any more.
To improve the costing for VMAT_INVARIANT, this patch
queries hoist_defs_of_uses for the hoisting decision and adds
costs for different "where" bas
This patch adjusts the cost handling on
VMAT_LOAD_STORE_LANES in function vectorizable_load. We
don't call function vect_model_load_cost for it any more.
It follows what we do in the function vect_model_load_cost,
and shouldn't have any functional changes.
gcc/ChangeLog:
* tree-vect-stm
This is an initial patch to move costing next to the
transform. It still adopts vect_model_load_cost for costing
but moves and duplicates it down according to the handling
of the different vect_memory_access_types; hopefully it makes the
subsequent patches easy to review. This patch should not
ha
The current handling in rs6000_emit_vector_compare is a bit
complicated to me, especially after we emit the vector float
comparison insn with the given code directly. So it's better
to refactor the handling of vector integer comparison here.
This is part 4; it reworks the handling of GE/GEU/LE
The current handling in rs6000_emit_vector_compare is a bit
complicated to me, especially after we emit the vector float
comparison insn with the given code directly. So it's better
to refactor the handling of vector integer comparison here.
This is part 1; it removes the helper function
rs6000
All kinds of vector float comparison operators are now
supported by an RTL comparison pattern in vector.md, so we can
just emit an rtx comparison insn with the given comparison
operator in function rs6000_emit_vector_compare instead of
checking and handling the reverse condition cases.
This is part
All kinds of vector float comparison operators are now
supported by an RTL comparison pattern in vector.md, so we can
just emit an rtx comparison insn with the given comparison
operator in function rs6000_emit_vector_compare instead of
checking and handling the reverse condition cases.
This is part
The current handling in rs6000_emit_vector_compare is a bit
complicated to me, especially after we emit the vector float
comparison insn with the given code directly. So it's better
to refactor the handling of vector integer comparison here.
This is part 3; it refactors the handling of NE.
Thi
All kinds of vector float comparison operators are now
supported by an RTL comparison pattern in vector.md, so we can
just emit an rtx comparison insn with the given comparison
operator in function rs6000_emit_vector_compare instead of
checking and handling the reverse condition cases.
This is part
The current handling in rs6000_emit_vector_compare is a bit
complicated to me, especially after we emit the vector float
comparison insn with the given code directly. So it's better
to refactor the handling of vector integer comparison here.
This is part 5; it refactors all the handling of vec
The current handling in rs6000_emit_vector_compare is a bit
complicated to me, especially after we emit the vector float
comparison insn with the given code directly. So it's better
to refactor the handling of vector integer comparison here.
This is part 2; it refactors the handling of LT and
All kinds of vector float comparison operators are now
supported by an RTL comparison pattern in vector.md, so we can
just emit an rtx comparison insn with the given comparison
operator in function rs6000_emit_vector_compare instead of
checking and handling the reverse condition cases.
This is part
Hi,
Following Segher's suggestion, this patch series reworks
function rs6000_emit_vector_compare for vector float and int
in multiple steps. It's based on the previous attempts [1][2].
As mentioned in [1], the need to rework this for float is to
make a centralized place for vector float compa
This patch fixes some non-robust split conditions in some
define_insn_and_splits so that each of them is applied on top of
the corresponding condition for the define_insn part; otherwise the
splitting could happen unexpectedly.
gcc/ChangeLog:
* config/csky/csky.md (*cskyv2_adddi3, *ck801_a
This patch fixes some non-robust split conditions in some
define_insn_and_splits so that each of them is applied on top of
the corresponding condition for the define_insn part; otherwise the
splitting could happen unexpectedly.
gcc/ChangeLog:
* config/sh/sh.md (call_pcrel, call_value_pcrel
This patch fixes some non-robust split conditions in some
define_insn_and_splits so that each of them is applied on top of
the corresponding condition for the define_insn part; otherwise the
splitting could happen unexpectedly.
gcc/ChangeLog:
* config/mips/mips.md (*udivmod4, udivmod4_mips
This patch fixes some non-robust split conditions in some
define_insn_and_splits so that each of them is applied on top of
the corresponding condition for the define_insn part; otherwise the
splitting could happen unexpectedly.
gcc/ChangeLog:
* config/i386/i386.md (*add3_doubleword, *addv4
This patch fixes some non-robust split conditions in some
define_insn_and_splits so that each of them is applied on top of
the corresponding condition for the define_insn part; otherwise the
splitting could happen unexpectedly.
gcc/ChangeLog:
* config/ia64/vect.md (*vec_extractv2sf_0_le, *
This patch fixes some non-robust split conditions in some
define_insn_and_splits so that each of them is applied on top of
the corresponding condition for the define_insn part; otherwise the
splitting could happen unexpectedly.
gcc/ChangeLog:
* config/bfin/bfin.md (movdi_insn, movdf_insn):
This patch fixes one non-robust split condition so that
it is applied on top of the corresponding condition for the define_insn
part; otherwise the splitting could happen unexpectedly.
gcc/ChangeLog:
* config/arm/arm.md (*minmax_arithsi_non_canon): Fix split condition.
---
gcc/config/arm/
This patch fixes some non-robust split conditions in some
define_insn_and_splits so that each of them is applied on top of
the corresponding condition for the define_insn part; otherwise the
splitting could happen unexpectedly.
gcc/ChangeLog:
* config/alpha/alpha.md (*movtf_internal, *movt
This patch fixes some non-robust split conditions in some
define_insn_and_splits so that each of them is applied on top of
the corresponding condition for the define_insn part; otherwise the
splitting could happen unexpectedly.
gcc/ChangeLog:
* config/visium/visium.md (*add3_insn, *addsi3_
This patch fixes some non-robust split conditions in some
define_insn_and_splits so that each of them is applied on top of
the corresponding condition for the define_insn part; otherwise the
splitting could happen unexpectedly.
gcc/ChangeLog:
* config/xtensa/xtensa.md (movdi_internal, movd
This patch fixes some non-robust split conditions in some
define_insn_and_splits so that each of them is applied on top of
the corresponding condition for the define_insn part; otherwise the
splitting could happen unexpectedly.
gcc/ChangeLog:
* config/v850/v850.md (cbranchsf4, cbranchdf4,
This patch fixes some non-robust split conditions in some
define_insn_and_splits so that each of them is applied on top of
the corresponding condition for the define_insn part; otherwise the
splitting could happen unexpectedly.
gcc/ChangeLog:
* config/s390/s390.md (*cstorecc_z13): Fix spli
This patch fixes some non-robust split conditions in some
define_insn_and_splits so that each of them is applied on top of
the corresponding condition for the define_insn part; otherwise the
splitting could happen unexpectedly.
gcc/ChangeLog:
* config/m32c/cond.md (stzx_reversed_, movhicc_
This patch fixes one non-robust split condition so that
it is applied on top of the corresponding condition for the define_insn
part; otherwise the splitting could happen unexpectedly.
gcc/ChangeLog:
* config/rx/rx.md (cstoresf4): Fix split condition.
---
gcc/config/rx/rx.md | 2 +-
1 fil
Hi,
This trivial patch series is a by-product of the previous
investigation into how many define_insn_and_split cases have a
split_condition that isn't applied on top of the condition for the
define_insn part and doesn't contain it, when there were some discussions on
whether we should warn for emp
This patch fixes some non-robust split conditions in some
define_insn_and_splits so that each of them is applied on top of
the corresponding condition for the define_insn part; otherwise the
splitting could happen unexpectedly.
gcc/ChangeLog:
* config/frv/frv.md (*abssi2_internal, *minmax_
gcc/ChangeLog:
* config/cris/cris.md (*addi_reload): Fix empty split condition.
---
gcc/config/cris/cris.md | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/gcc/config/cris/cris.md b/gcc/config/cris/cris.md
index 7de0ec63fcf..d5a3c703a83 100644
--- a/gcc/config/cris/cri
gcc/ChangeLog:
* config/sparc/sparc.md (*snedi_zero_vis3,
*neg_snedi_zero_subxc, *plus_snedi_zero,
*plus_plus_snedi_zero, *minus_snedi_zero,
*minus_minus_snedi_zero): Fix empty split condition.
---
gcc/config/sparc/sparc.md | 12 ++--
1 file changed, 6 inse
gcc/ChangeLog:
* config/sh/sh.md (doloop_end_split): Fix empty split condition.
---
gcc/config/sh/sh.md | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/gcc/config/sh/sh.md b/gcc/config/sh/sh.md
index e3af9ae21c1..93ee7c9a7de 100644
--- a/gcc/config/sh/sh.md
+++ b/gcc/c
gcc/ChangeLog:
* config/or1k/or1k.md (*movdi): Fix empty split condition.
---
gcc/config/or1k/or1k.md | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/gcc/config/or1k/or1k.md b/gcc/config/or1k/or1k.md
index eb94efba0e4..495b3e277ba 100644
--- a/gcc/config/or1k/or1k.md
+
gcc/ChangeLog:
* config/mips/mips.md (, bswapsi2, bswapdi2): Fix empty
split condition.
---
gcc/config/mips/mips.md | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md
index eef3cfd50a8..455b9b802f6 100644
---
gcc/ChangeLog:
* config/h8300/combiner.md (*andsi3_lshiftrt_n_sb): Fix empty split
condition.
---
gcc/config/h8300/combiner.md | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/gcc/config/h8300/combiner.md b/gcc/config/h8300/combiner.md
index 20e19da0419..e31bd50
gcc/ChangeLog:
* config/arm/vfp.md (no_literal_pool_df_immediate,
no_literal_pool_sf_immediate): Fix empty split condition.
---
gcc/config/arm/vfp.md | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/gcc/config/arm/vfp.md b/gcc/config/arm/vfp.md
index f97af92
gcc/ChangeLog:
* config/i386/i386.md (*load_tp_x32_zext, *add_tp_x32_zext,
*tls_dynamic_gnu2_combine_32): Fix empty split condition.
* config/i386/sse.md (*_pmovmskb_lt,
*_pmovmskb_zext_lt, *sse2_pmovmskb_ext_lt,
*_pblendvb_lt): Likewise.
---
gcc/config/i38
gcc/ChangeLog:
* config/m68k/m68k.md (*zero_extend_inc, *zero_extend_dec,
*zero_extendsidi2): Fix empty split condition.
---
gcc/config/m68k/m68k.md | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/gcc/config/m68k/m68k.md b/gcc/config/m68k/m68k.md
index 59
gcc/ChangeLog:
* config/arc/arc.md (*bbit_di): Fix empty split condition.
---
gcc/config/arc/arc.md | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
index 7a52551eef5..a03840c4c36 100644
--- a/gcc/config/arc/arc.md
+++ b/gcc
As Segher suggested, this patch emits an error message
if the split condition of define_insn_and_split is empty while
the insn condition isn't.
gcc/ChangeLog:
* gensupport.c (process_rtx): Emit error message for empty
split condition in define_insn_and_split while the insn
Hi all,
define_insn_and_split should avoid using an empty split condition
if the condition for define_insn isn't empty; otherwise it can
sometimes have unexpected consequences, since the split
will always be done even if the insn condition doesn't hold.
To avoid forgetting to add "&& 1" onto sp