Hi Bernd,
+ rtx op0 = force_reg (direct_mode, arg1_rtx);
+ rtx op1 = force_reg (direct_mode, arg2_rtx);
+ rtx tem = emit_store_flag (target, NE, op0, op1,
+ direct_mode, true, false);
This is me being ignorant here, but wouldn't it be easier to have a new
cmpmem_eq pattern (and resulting optab) than to generate this code
sequence directly? That way backends can choose whether to support this
optimization, and if they do, they can also choose to support longer
comparison lengths.
DEF_LIB_BUILTIN (BUILT_IN_MEMCMP, "memcmp",
BT_FN_INT_CONST_PTR_CONST_PTR_SIZE, ATTR_PURE_NOTHROW_NONNULL_LEAF)
+DEF_GCC_BUILTIN (BUILT_IN_MEMCMP_EQ, "__memcmp_eq",
BT_FN_INT_CONST_PTR_CONST_PTR_SIZE, ATTR_PURE_NOTHROW_NONNULL_LEAF)
DEF_LIB_BUILTIN_CHKP (BUILT_IN_MEMCPY, "memcpy",
BT_FN_PTR_PTR_CONST_PTR_SIZE, ATTR_RET1_NOTHROW_NONNULL_LEAF)
Presumably you would also document this new builtin in doc/extend.texi?
Plus maybe add a testcase for it as well?
+ /* If the return value is used, don't do the transformation. */
This comment struck me as wrong. Surely if the return value is not used
then the entire memcmp can be transformed into nothing. And if the
return value is used, but only in an equality comparison with zero,
then the transformation can still take place.
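
To spell out the three cases I have in mind, here is a small C sketch.
The function names are made up for the example and the length of 4 is
arbitrary.

```c
#include <string.h>

/* Case 1: the result is discarded.  memcmp has no side effects, so the
   whole call can be deleted.  */
void
result_unused (const char *a, const char *b)
{
  (void) memcmp (a, b, 4);
}

/* Case 2: the result is only compared against zero.  Only "equal" versus
   "not equal" matters, so an equality-only comparison is sufficient.  */
int
equality_only (const char *a, const char *b)
{
  return memcmp (a, b, 4) == 0;
}

/* Case 3: the sign of the result matters.  An equality-only comparison
   would lose the ordering, so the transformation must not be applied.  */
int
ordering_needed (const char *a, const char *b)
{
  return memcmp (a, b, 4) < 0;
}
```

Only the third case needs the full memcmp semantics.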
Cheers
Nick