From: Pan Li
This patch would like to combine the vec_duplicate + vdiv.vv to the
vdiv.vx. From example as below code. The related pattern will depend
on the cost of vec_duplicate from GR2VR. Then the late-combine will
take action if the cost of GR2VR is zero, and reject the combination
if the
From: Pan Li
Add asm dump check test for vec_duplicate + vdiv.vv combine to vdiv.vx,
with the GR2VR cost is 0, 2 and 15.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add a
From: Pan Li
Add asm dump check test for vec_duplicate + vdiv.vv combine to vdiv.vx,
with the GR2VR cost is 0, 1 and 2.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add as
From: Pan Li
Some existing vdiv related test need some adjust for the
asm check.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv-nofm.c: Adjust
the asm check for vdiv.
* gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv.c: Ditto.
* gcc.ta
From: Pan Li
This patch would like to introduce the combine of vec_dup + vdiv.vv into
vdiv.vx on the cost value of GR2VR. The late-combine will take place if
the cost of GR2VR is zero, or reject the combine if non-zero like 1, 15
in test. There will be two cases for the combine:
Case 0:
| .
From: Pan Li
Inspired by the avg_ceil patches, notice there were even more
lines too long from autovec.md. So fix that format issues.
gcc/ChangeLog:
* config/riscv/autovec.md: Fix line too long for sorts
of pattern.
Signed-off-by: Pan Li
---
gcc/config/riscv/autovec.md | 54
From: Pan Li
Add asm and run testcase for avg_ceil vaadd implementation.
The below test suites are passed for this patch series.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/avg.h: Add test helper macros.
* gcc.target/riscv/rvv/au
From: Pan Li
The avg_ceil has the rounding mode towards +inf, while the
vaadd.vv has the rnu which totally match the sematics. From
RVV spec, the fixed vaadd.vv with rnu,
roundoff_signed(v, d) = (signed(v) >> d) + r
r = v[d - 1]
For vaadd, d = 1, then we have
roundoff_signed(v, 1) = (signed(v
From: Pan Li
Some existing avg_floor test need updated due to change to
leverage vaadd.vv directly.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vls/avg-4.c: Update asm check
to vaadd.
* gcc.target/riscv/rvv/autovec/vls/avg-5.c: Ditto.
* gcc.target/ris
From: Pan Li
Similar to the avg_floor, the avg_ceil has the rounding mode
towards +inf, while the vaadd.vv has the rnu which totally match
the sematics. From RVV spec, the fixed vaadd.vv with rnu,
roundoff_signed(v, d) = (signed(v) >> d) + r
r = v[d - 1]
For vaadd, d = 1, then we have
roundof
From: Pan Li
Add asm dump check test for vec_duplicate + vmul.vv combine to vmul.vx,
with the GR2VR cost is 0, 1 and 2.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add as
From: Pan Li
Add asm dump check test for vec_duplicate + vmul.vv combine to vmul.vx,
with the GR2VR cost is 0, 2 and 15.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add a
From: Pan Li
This patch would like to combine the vec_duplicate + vmul.vv to the
vmul.vx. From example as below code. The related pattern will depend
on the cost of vec_duplicate from GR2VR. Then the late-combine will
take action if the cost of GR2VR is zero, and reject the combination
if the
From: Pan Li
This patch would like to introduce the combine of vec_dup + vmul.vv into
vmul.vx on the cost value of GR2VR. The late-combine will take place if
the cost of GR2VR is zero, or reject the combine if non-zero like 1, 15
in test. There will be two cases for the combine:
Case 0:
| .
From: Pan Li
Add asm and run testcase for avg_floor vaadd implementation.
The below test suites are passed for this patch series.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/avg.h: New test.
* gcc.target/riscv/rvv/autovec/avg_dat
From: Pan Li
Some existing avg_floor test need updated due to change to
leverage vaadd.vv directly.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vls/avg-1.c: Update asm check
to vaadd.
* gcc.target/riscv/rvv/autovec/vls/avg-2.c: Ditto.
* gcc.target/ris
From: Pan Li
The signed avg_floor totally match the sematics of fixed point
rvv insn vaadd, within round down. Thus, leverage it directly
to implement the avf_floor.
The spec of RVV is somehow not that clear about the difference
between the float point and fixed point for the rounding that
disc
From: Pan Li
The spec of RVV is somehow not that clear about the difference
between the float point and fixed point for the rounding that
discard least-significant information.
For float point which is not two's complement, the "discard
least-significant information" indicates truncation round.
From: Pan Li
The signed avg_floor totally match the sematics of fixed point
rvv insn vaadd, within round down. Thus, leverage it directly
to implement the avf_floor.
The spec of RVV is somehow not that clear about the difference
between the float point and fixed point for the rounding that
disc
From: Pan Li
Add asm and run testcase for avg_floor vaadd implementation.
The below test suites are passed for this patch series.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/avg.h: New test.
* gcc.target/riscv/rvv/autovec/avg_dat
From: Pan Li
Some existing avg_floor test need updated due to change to
leverage vaadd.vv directly.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vls/avg-1.c: Update asm check
to vaadd.
* gcc.target/riscv/rvv/autovec/vls/avg-2.c: Ditto.
* gcc.target/ris
From: Pan Li
The spec of RVV is somehow not that clear about the difference
between the float point and fixed point for the rounding that
discard least-significant information.
For float point which is not two's complement, the "discard
least-significant information" indicates truncation round.
From: Pan Li
This patch would like to combine the vec_duplicate + vxor.vv to the
vxor.vx. From example as below code. The related pattern will depend
on the cost of vec_duplicate from GR2VR. Then the late-combine will
take action if the cost of GR2VR is zero, and reject the combination
if the
From: Pan Li
Add asm dump check test for vec_duplicate + vxor.vv combine to vxor.vx,
with the GR2VR cost is 0, 2 and 15.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add a
From: Pan Li
Add asm dump check test for vec_duplicate + vxor.vv combine to vxor.vx,
with the GR2VR cost is 0, 1 and 2.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add as
From: Pan Li
This patch would like to introduce the combine of vec_dup + vxor.vv into
vxor.vx on the cost value of GR2VR. The late-combine will take place if
the cost of GR2VR is zero, or reject the combine if non-zero like 1, 15
in test. There will be two cases for the combine:
Case 0:
| .
From: Pan Li
Add asm dump check test for vec_duplicate + vor.vv combine to vor.vx,
with the GR2VR cost is 0, 1 and 2.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add asm
From: Pan Li
This patch would like to combine the vec_duplicate + vor.vv to the
vor.vx. From example as below code. The related pattern will depend
on the cost of vec_duplicate from GR2VR. Then the late-combine will
take action if the cost of GR2VR is zero, and reject the combination
if the GR
From: Pan Li
Add asm dump check test for vec_duplicate + vor.vv combine to vor.vx,
with the GR2VR cost is 0, 2 and 15.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add tes
From: Pan Li
This patch would like to introduce the combine of vec_dup + vor.vv into
vor.vx on the cost value of GR2VR. The late-combine will take place if
the cost of GR2VR is zero, or reject the combine if non-zero like 1, 15
in test. There will be two cases for the combine:
Case 0:
| ...
From: Pan Li
Add asm dump check test for vec_duplicate + vand.vv combine to vand.vx,
with the GR2VR cost is 0, 2 and 15.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add t
From: Pan Li
This patch would like to combine the vec_duplicate + vand.vv to the
vand.vx. From example as below code. The related pattern will depend
on the cost of vec_duplicate from GR2VR. Then the late-combine will
take action if the cost of GR2VR is zero, and reject the combination
if the
From: Pan Li
Add asm dump check test for vec_duplicate + vand.vv combine to vand.vx,
with the GR2VR cost is 0, 1 and 2.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add as
From: Pan Li
This patch would like to introduce the combine of vec_dup + vand.vv into
vand.vx on the cost value of GR2VR. The late-combine will take place if
the cost of GR2VR is zero, or reject the combine if non-zero like 1, 15
in test. There will be two cases for the combine:
Case 0:
| .
From: Pan Li
Add asm dump check test for vec_duplicate + vrsub.vv combine to vrsub.vx
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i16.c: Add vrsub asm
dump check.
From: Pan Li
Add asm dump check test for vec_duplicate + vrsub.vv combine to vrsub.vx.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i16.c: Add asm check
for vrsub with GR
From: Pan Li
Add asm dump check test for vec_duplicate + vrsub.vv combine to vrsub.vx.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i16.c: Add asm check
for vrsub with GR
From: Pan Li
Tweak the asm check with define T uint8_t for adding more
vx test easily, as well as less possibility to make mistake.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i
From: Pan Li
Add asm dump check test for vec_duplicate + vrsub.vv combine to vrsub.vx.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add asm check
for vrsub case 1
From: Pan Li
Add asm dump check and run test for vec_duplicate + vrsub.vv combine to
vrsub.vx.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add vrsub asm check.
*
From: Pan Li
Add asm dump check test for vec_duplicate + vrsub.vv combine to vrsub.vx.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i16.c: Add asm check
for vrsub with GR
From: Pan Li
This patch would like to combine the vec_duplicate + vrub.vv to the
vrsub.vx. From example as below code. The related pattern will depend
on the cost of vec_duplicate from GR2VR. Then the late-combine will
take action if the cost of GR2VR is zero, and reject the combination
if the
From: Pan Li
This patch would like to introduce the combine of vec_dup + vsub.vv into
vsub.vx on the cost value of GR2VR. The late-combine will take place if
the cost of GR2VR is zero, or reject the combine if non-zero like 1, 15
in test. There will be two cases for the combine:
Case 0:
| .
From: Pan Li
Some of the previous scalar unsigned SAT_ADD test data are
duplicated in different test files. This patch would like to
move them into a shared header file, to avoid the test data
duplication.
The below test suites are passed for this patch series.
* The rv64gcv fully regression te
From: Pan Li
Add asm dump check test for vec_duplicate + vsub.vv combine to vsub.vx.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add test cases
for vsub vx combin
From: Pan Li
For run test, we have a name like add/sub to indicate
the testcase. So we can reuse this to identify the
test data instead of a new one.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/a
From: Pan Li
Add asm dump check test for vec_duplicate + vsub.vv combine to vsub.vx.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i16.c: Add test cases
for vsub vx combin
From: Pan Li
Add asm dump check test for vec_duplicate + vsub.vv combine to vsub.vx.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i16.c: Add test cases
for vsub vx combin
From: Pan Li
Given we will put all vx combine for int8 in a single file,
we need to make sure the generate function for different
types and ops has different function name. Thus, refactor
the test helper macros for avoiding possible function name
conflict.
The below test suites are passed for t
From: Pan Li
Add asm dump check and run test for vec_duplicate + vsub.vv
combine to vsub.vx.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add vector sub
vx combine
From: Pan Li
Add asm dump check test for vec_duplicate + vsub.vv combine to vsub.vx.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i16.c: Add test cases
for vsub vx combin
From: Pan Li
We would like to arrange all vx combine asm check test into
one file for better management. Thus, rename vx_vadd-* to
vx-*.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vadd-1-i16.c: Move to...
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: ..
From: Pan Li
Add asm dump check test for vec_duplicate + vsub.vv combine to vsub.vx
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i16.c: Add test cases
for vsub vx combine
From: Pan Li
This patch would like to combine the vec_duplicate + vsub.vv to the
vsub.vx. From example as below code. The related pattern will depend
on the cost of vec_duplicate from GR2VR. Then the late-combine will
take action if the cost of GR2VR is zero, and reject the combination
if the
From: Pan Li
This patch would like to introduce the combine of vec_dup + vsub.vv into
vsub.vx on the cost value of GR2VR. The late-combine will take place if
the cost of GR2VR is zero, or reject the combine if non-zero like 1, 15
in test. There will be two cases for the combine:
Case 0:
| .
From: Pan Li
Add asm dump check test for vec_duplicate + vsub.vv combine to vsub.vx.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-4-i16.c: New test.
* gcc.target/riscv
From: Pan Li
Add asm dump check test for vec_duplicate + vsub.vv combine to vsub.vx.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-6-i16.c: New test.
* gcc.target/riscv
From: Pan Li
Add asm dump check and run test for vec_duplicate + vsub.vv
combine to vsub.vx.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx_binary_data.h: Add test
data for v
From: Pan Li
Add asm dump check test for vec_duplicate + vsub.vv combine to vsub.vx.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-3-i16.c: New test.
* gcc.target/riscv
From: Pan Li
Add asm dump check test for vec_duplicate + vsub.vv combine to vsub.vx.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-5-i16.c: New test.
* gcc.target/riscv
From: Pan Li
Add asm dump check test for vec_duplicate + vsub.vv combine to vsub.vx
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vsub-2-i16.c: New test.
* gcc.target/riscv/
From: Pan Li
This patch would like to introduce the combine of vec_dup + vsub.vv into
vsub.vx on the cost value of GR2VR. The late-combine will take place if
the cost of GR2VR is zero, or reject the combine if non-zero like 1, 15
in test. There will be two cases for the combine:
Case 0:
| .
From: Pan Li
This patch would like to combine the vec_duplicate + vsub.vv to the
vsub.vx. From example as below code. The related pattern will depend
on the cost of vec_duplicate from GR2VR. Then the late-combine will
take action if the cost of GR2VR is zero, and reject the combination
if the
From: Pan Li
Add asm dump check and for vec_duplicate + vadd.vv combine case 1 to vadd.vx.
The late-combine will take action when GR2VR cost is 0, because the vmv
and the vadd.vx will consume the same cost of GR2VR. Aka:
Before:
L1:
vmv.v.x
vadd.vv
J L1
After:
L1:
vadd.vx
J L1
The b
From: Pan Li
Add asm dump check and for vec_duplicate + vadd.vv combine case 1 to vadd.vx
with the cost of GR2VR is 2. The testcases is not that tidy according
to the result, but we will continue tuning the cost model for this.
The below test suites are passed for this patch.
* The rv64gcv full
From: Pan Li
Add asm dump check and for vec_duplicate + vadd.vv combine case 1 to vadd.vx
with the cost of GR2VR is 1. The testcases is not that tidy according
to the result, but we will continue tuning the cost model for this.
The below test suites are passed for this patch.
* The rv64gcv full
From: Pan Li
We have the testcase for vec_duplicate + vadd.vv combine as
below already, aka:
Before:
...
vmv.v.x
L1:
vadd.vv
J L1
...
After:
...
L1:
vadd.vx
J L1
...
But there is still another case like below:
Before:
...
L1:
vmv.v.x
vadd.vv
J L1
...
After:
...
From: Pan Li
The default test running in rvv.exp takes the -fno-vect-cost-model
for most of these options. It is not that suitable as the vx_vf
test depends on the cost-model. Thus, separate the vx_vf test
cases without -fno-vect-cost-model in another options.
The below test suites are passed
From: Pan Li
This patch would like to rename the VX_BINARY within CASE_0 suffix, as
we have another case of VX_BINARY test code. Aka case 1:
L1:
vmv.v.x
vadd.vv
J L1
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx_binary.h: Rename VX_BINARY
to VX_BINARY_
From: Pan Li
Add asm dump check and for vec_duplicate + vadd.vv combine to vadd.vx.
The late-combine will not take action when GR2VR cost is 15.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec
From: Pan Li
Add asm dump check and run test for vec_duplicate + vadd.vv combine
to vadd.vx. Introduce new folder to hold all related testcases.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/rvv.ex
From: Pan Li
This patch would like to introduce the combine of vec_dup + vadd.vv into
vadd.vx on the cost value of GR2VR. The late-combine will take place if
the cost of GR2VR is zero, or reject the combine if non-zero like 1, 15
in test. A helper function get_gr2vr_cost is introduced to make s
From: Pan Li
Add asm dump check and for vec_duplicate + vadd.vv combine to vadd.vx.
The late-combine will not take action when GR2VR cost is 1.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/
From: Pan Li
During investigate the combine from vec_dup and vop.vv into
vop.vx, we need to depend on the cost of the insn operate
from the gpr to vr, for example, vadd.vx. Thus, for better
control and test, we introduce a new option, aka below:
--param=gpr2vr-cost=
To specific the cost value
From: Pan Li
This patch would like to combine the vec_duplicate + vadd.vv to the
vadd.vx. From example as below code. The related pattern will depend
on the cost of vec_duplicate from GR2VR, it will:
* The pattern matching will be active by default.
* The cost of GR2VR will be added to the tot
From: Pan Li
After we introduced the --param=gpr2vr-cost option to set the cost
value of when operation act from gpr to vr, we would like to introduce
a new helper function to get the cost of gp2vr. And then make sure
all reference to gr2vr should go this helper function.
The helper function wi
From: Pan Li
Add asm dump check and run test for vec_duplicate + vadd.vv combine
to vadd.vx. Introduce new folder to hold all related testcases.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/rvv.ex
From: Pan Li
This patch would like to combine the vec_duplicate + vadd.vv to the
vadd.vx. From example as below code. The related pattern will depend
on the cost of vec_duplicate from GR2VR, it will:
* The pattern matching will be active by default.
* The cost of GR2VR will be added to the tot
From: Pan Li
Add asm dump check and for vec_duplicate + vadd.vv combine to vadd.vx.
The late-combine will not take action when GR2VR cost is 1.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/
From: Pan Li
During investigate the combine from vec_dup and vop.vv into
vop.vx, we need to depend on the cost of the insn operate
from the gr to vr, for example, vadd.vx. Thus, for better
control and test, we introduce a new option, aka below:
--param=rvv-gr2vr-cost=
To specific the cost valu
From: Pan Li
Add asm dump check and for vec_duplicate + vadd.vv combine to vadd.vx.
The late-combine will not take action when GR2VR cost is 15.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec
From: Pan Li
This patch would like to introduce the combine of vec_dup + vadd.vv into
vadd.vx on the cost value of GR2VR. The late-combine will take place if
the cost of GR2VR is zero, or reject the combine if non-zero like 1, 15
in test.
The below test suites are passed for this patch series.
From: Pan Li
This patch will add testcase for unsigned integer SAT_ADD form 7:
#define DEF_SAT_U_ADD_FMT_7(WT, T) \
T __attribute__((noinline))\
sat_u_add_##WT##_##T##_fmt_7(T x, T y) \
{ \
T max = -1; \
From: Pan Li
This patch serices would like to support form 7 of the unsigned integer
SAT_ADD. Different to another forms of SAT_ADD, the form 7 will
leverage a wider type to tell overflow or not, aka:
#define DEF_SAT_U_ADD_FMT_7(WT, T) \
T __attribute__((noinline))\
sat_u_
From: Pan Li
This patch would like to support the form 7 of the unsigned
integer SAT_ADD, aka below example.
#define DEF_SAT_U_ADD_FMT_7(WT, T) \
T __attribute__((noinline))\
sat_u_add_##WT##_##T##_fmt_7(T x, T y) \
{ \
T max = -1;
From: Pan Li
This patch will add testcase for unsigned integer SAT_ADD form 7:
#define DEF_VEC_SAT_U_ADD_FMT_9(WT, T) \
void __attribute__((noinline)) \
vec_sat_u_add_##WT##_##T##_fmt_9 (T *out, T *op_1, T *o
From: Pan Li
Consider the expand_const_vector is quit long (about 500 lines)
and complicated, we would like to extract the different case
into different functions. For example, the const vector stepped
will be extracted into expand_const_vector_stepped.
The below test suites are passed for this
From: Pan Li
Consider the expand_const_vector is quit long (about 500 lines)
and complicated, we would like to extract the different case
into different functions. For example, the const vector duplicate
will be extracted into expand_const_vector_duplicate, and then
expand_const_vector_duplicate
From: Pan Li
Consider the expand_const_vector is quit long (about 500 lines)
and complicated, we would like to extract the different case
into different functions. For example, the const vec_series
will be extracted into expand_const_vec_series.
The below test suites are passed for this patch.
From: Pan Li
Consider the expand_const_vector is quit long (about 500 lines)
and complicated, we would like to extract the different case
into different functions. For example, the const vec_duplicate
will be extracted into expand_const_vec_duplicate.
The below test suites are passed for this p
From: Pan Li
Per discussion from PR118931 thread, the expand_const_vector is
quit long with more than 500 lines, which is unfriendly for debugging
and maintaince. Thus, we extract some sub functions to make it clear
and delicate the concrete const vector expanding into sub functions.
Aka:
expan
From: Pan Li
Add asm dump check and run test for vec_duplicate + vadd.vv combine
to vadd.vx. Introduce new folder to hold all related testcases.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/rvv.ex
From: Pan Li
This patch would like to combine the vec_duplicate + vadd.vv to the
vadd.vx. From example as below code. The related pattern will depend
on the cost of vec_duplicate from GR2VR, it will:
* The pattern matching will be inactive if GR2VR cost is zero.
* The cost of GR2VR will be add
From: Pan Li
This patch series would like to introudce the vec_dup + vadd.vv combine
to vadd.vx, based on the cost of the GR2VR. For example as below.
v1 = vec_dup(x2)
v2 = vec_add_vv(v3, v1)
will be optimized to below in late-combine
v2 = vec_add_vx(v3, x3)
If and only if the cost of (vec_d
From: Pan Li
After we support the vec_duplicate + vadd.vv combine to vadd.vx, the
existing testcases need some adjust for asm dump check times.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/
From: Pan Li
Add asm dump check and run test for vec_duplicate + vadd.vv combine
to vadd.vx. Introduce new folder to hold all related testcases.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/rvv.ex
From: Pan Li
After we support the vec_duplicate + vadd.vv combine to vadd.vx, the
existing testcases need some adjust for asm dump check times.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/
From: Pan Li
This patch would like to combine the vec_duplicate + vadd.vv to the
vadd.vx. From example as below code. The related pattern will depend
on the cost of vec_duplicate from GR2VR, it will:
* The pattern matching will be inactive if GR2VR cost is zero.
* The cost of GR2VR will be add
From: Pan Li
After we add the frm register to the global_regs, we may not need to
define_insn that volatile to emit the frm restore insns. The
cooperatively-managed global register will help to handle this, instead
of emit the volatile define_insn explicitly.
gcc/ChangeLog:
* config/ri
From: Pan Li
Rearrange the test cases of cond_widen_complicate-3 by different types
into different files, instead of put all types together. Then we can
easily reduce the range when asm check fails.
The below test suites are passed locally, let's wait online CI says.
* The rv64gcv fully regress
1 - 100 of 640 matches
Mail list logo