On 9/10/23 21:42, juzhe.zh...@rivai.ai wrote:
Ping this patch.
I think it's time to enable scalable vectorization by default and do the
whole regression every time (except vect.exp that we didn't enable yet)
Update current FAILs status:
Real FAILS (ICE and execution FAIL):
FAIL: gcc.dg/pr70252.c (internal compiler error: in
gimple_expand_vec_cond_expr, at gimple-isel.cc:284)
FAIL: gcc.dg/pr70252.c (test for excess errors)
FAIL: gcc.dg/pr92301.c execution test
Robin is working on these 3 issues and will be solved soon.
FAIL: g++.dg/torture/vshuf-v4df.C -O2 -flto -fno-use-linker-plugin
-flto-partition=none (internal compiler error: in as_a, at machmode.h:381)
FAIL: g++.dg/torture/vshuf-v4df.C -O2 -flto -fno-use-linker-plugin
-flto-partition=none (test for excess errors)
FAIL: g++.dg/torture/vshuf-v4df.C -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects (internal compiler error: in as_a, at machmode.h:381)
FAIL: g++.dg/torture/vshuf-v4df.C -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects (test for excess errors)
This is a long time known issue I have mentioned many times, we need
help for LTO since it's caused by mode bits extension.
The rest bogus FAILs:
FAIL: gcc.dg/unroll-8.c scan-rtl-dump loop2_unroll "Not unrolling loop,
doesn't roll"
FAIL: gcc.dg/unroll-8.c scan-rtl-dump loop2_unroll "likely upper bound: 6"
FAIL: gcc.dg/unroll-8.c scan-rtl-dump loop2_unroll "realistic bound: -1"
FAIL: gcc.dg/var-expand1.c scan-rtl-dump loop2_unroll "Expanding
Accumulator"
FAIL: gcc.dg/tree-ssa/cunroll-16.c scan-tree-dump cunroll "optimized:
loop with [0-9]+ iterations completely unrolled"
FAIL: gcc.dg/tree-ssa/cunroll-16.c scan-tree-dump-not optimized "foo"
FAIL: gcc.dg/tree-ssa/forwprop-40.c scan-tree-dump-times optimized
"BIT_FIELD_REF" 0
FAIL: gcc.dg/tree-ssa/forwprop-40.c scan-tree-dump-times optimized
"BIT_INSERT_EXPR" 0
FAIL: gcc.dg/tree-ssa/forwprop-41.c scan-tree-dump-times optimized
"BIT_FIELD_REF" 0
FAIL: gcc.dg/tree-ssa/forwprop-41.c scan-tree-dump-times optimized
"BIT_INSERT_EXPR" 1
FAIL: gcc.dg/tree-ssa/gen-vect-11b.c scan-tree-dump-times vect
"vectorized 0 loops" 1
FAIL: gcc.dg/tree-ssa/gen-vect-11c.c scan-tree-dump-times vect
"vectorized 0 loops" 1
FAIL: gcc.dg/tree-ssa/gen-vect-26.c scan-tree-dump-times vect "Alignment
of access forced using peeling" 1
FAIL: gcc.dg/tree-ssa/gen-vect-28.c scan-tree-dump-times vect "Alignment
of access forced using peeling" 1
FAIL: gcc.dg/tree-ssa/loop-bound-1.c scan-tree-dump ivopts "bounded by 254"
FAIL: gcc.dg/tree-ssa/loop-bound-2.c scan-tree-dump ivopts "bounded by 254"
FAIL: gcc.dg/tree-ssa/predcom-2.c scan-tree-dump-times pcom "Unrolling 2
times." 2
FAIL: gcc.dg/tree-ssa/predcom-4.c scan-tree-dump-times pcom "Combination" 1
FAIL: gcc.dg/tree-ssa/predcom-4.c scan-tree-dump-times pcom "Unrolling 3
times." 1
FAIL: gcc.dg/tree-ssa/predcom-5.c scan-tree-dump-times pcom "Combination" 2
FAIL: gcc.dg/tree-ssa/predcom-5.c scan-tree-dump-times pcom "Unrolling 3
times." 1
FAIL: gcc.dg/tree-ssa/predcom-9.c scan-tree-dump pcom "Executing
predictive commoning without unrolling"
FAIL: gcc.dg/tree-ssa/reassoc-46.c scan-tree-dump-times optimized
"(?:vect_)?sum_[\\d._]+ = (?:(?:vect_)?_[\\d._]+ \\+
(?:vect_)?sum_[\\d._]+|(?:v ect_)?sum_[\\d._]+ \\+ (?:vect_)?_[\\d._]+)" 1
FAIL: gcc.dg/tree-ssa/scev-10.c scan-tree-dump-times ivopts "
Type:\\tREFERENCE ADDRESS\n" 1
FAIL: gcc.dg/tree-ssa/scev-11.c scan-tree-dump-times ivopts "
Type:\\tREFERENCE ADDRESS\n" 2
FAIL: gcc.dg/tree-ssa/scev-14.c scan-tree-dump ivopts "Overflowness wrto
loop niter:\tNo-overflow"
FAIL: gcc.dg/tree-ssa/scev-9.c scan-tree-dump-times ivopts "
Type:\\tREFERENCE ADDRESS\n" 1
FAIL: gcc.dg/tree-ssa/split-path-11.c scan-tree-dump-times split-paths
"join point for if-convertable half-diamond" 1
These are bogus dump FAILs and I have 100% confirm each of them, we are
having same behavior as SVE.
So is this patch ok for trunk ?
Yes, this is OK for the trunk.
Jeff