https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111751
--- Comment #7 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
(In reply to Andrew Pinski from comment #6)
> (In reply to JuzheZhong from comment #5)
> > (In reply to Andrew Pinski from comment #4)
> > > The issue for aarch64 with SVE is that MASK_LOAD is not optimized:
> > >
> > > ic = "\x00\x03\x06\t\f\x0f\x12\x15\x18\x1b\x1e!$\'*-";
> > > ib = "\x00\x03\x06\t\f\x0f\x12\x15\x18\x1b\x1e!$\'*-";
> > > vect__1.7_9 = .MASK_LOAD (&ib, 8B, { -1, -1, -1, -1, -1, -1, -1, -1,
> > > -1, -1, -1, -1, -1, -1, -1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> > > 0, 0, 0, 0, ... });
> > > vect__2.10_35 = .MASK_LOAD (&ic, 8B, { -1, -1, -1, -1, -1, -1, -1, -1,
> > > -1, -1, -1, -1, -1, -1, -1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> > > 0, 0, 0, 0, ... });
> >
> > I don't think ARM SVE has issues ...
>
> It does; as I mentioned, if you use -fno-vect-cost-model you get the above
> issue, which really should be optimized to a constant vector ...

After investigation, I found that FRE fails to recognize the CONST_VECTOR
value here:

/* Visit a load from a reference operator RHS, part of STMT, value number it,
   and return true if the value number of the LHS has changed as a result.  */

static bool
visit_reference_op_load (tree lhs, tree op, gimple *stmt)
{
  bool changed = false;
  tree result;
  vn_reference_t res;
  tree vuse = gimple_vuse (stmt);
  tree last_vuse = vuse;
  result = vn_reference_lookup (op, vuse, default_vn_walk_kind, &res, true,
				&last_vuse);

  /* We handle type-punning through unions by value-numbering based
     on offset and size of the access.  Be prepared to handle a
     type-mismatch here via creating a VIEW_CONVERT_EXPR.  */
  if (result
      && !useless_type_conversion_p (TREE_TYPE (result), TREE_TYPE (op)))
    {
      /* Avoid the type punning in case the result mode has padding where
	 the op we lookup has not.  */
      if (maybe_lt (GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (result))),
		    GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (op)))))
	result = NULL_TREE;
  ....
The result is BLKmode, op is V16QImode, so we reach:

      /* Avoid the type punning in case the result mode has padding where
	 the op we lookup has not.  */
      if (maybe_lt (GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (result))),
		    GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (op)))))
	result = NULL_TREE;

If I delete this code, RVV can optimize it.  Do you have any suggestion?

This is my observation:

Breakpoint 6, visit_reference_op_load (lhs=0x7ffff68364c8, op=0x7ffff6874410,
    stmt=0x7ffff6872640) at ../../../../gcc/gcc/tree-ssa-sccvn.cc:5740
5740	  result = vn_reference_lookup (op, vuse, default_vn_walk_kind, &res,
	      true, &last_vuse);
(gdb) c
Continuing.

Breakpoint 6, visit_reference_op_load (lhs=0x7ffff68364c8, op=0x7ffff6874410,
    stmt=0x7ffff6872640) at ../../../../gcc/gcc/tree-ssa-sccvn.cc:5740
5740	  result = vn_reference_lookup (op, vuse, default_vn_walk_kind, &res,
	      true, &last_vuse);
(gdb) n
5746	      && !useless_type_conversion_p (TREE_TYPE (result), TREE_TYPE (op)))
(gdb) p debug (result)
"\x00\x03\x06\t\f\x0f\x12\x15\x18\x1b\x1e!$\'*-"
$9 = void
(gdb) p op->typed.type->type_common.mode
$10 = E_V16QImode