On Thu, 26 Jun 2014, Jakub Jelinek wrote:
So like this? I've also changed get_compute_type so that it will DTRT even for -mavx and V4DImode vectors, so e.g. f5/f6/f8 routines in avx-pr57233.c improve. Also, even for shifts by scalar, if e.g. target doesn't have shifts by scalar at all, and only has narrower vector by vector shifts, it should handle this case too.
All that? Cool!
@@ -1455,11 +1507,83 @@ expand_vector_operations_1 (gimple_stmt_
         {
           op = optab_for_tree_code (code, type, optab_scalar);
 
+          compute_type = get_compute_type (code, op, type);
+          if (compute_type == type)
+            return;
           /* The rtl expander will expand vector/scalar as vector/vector
-             if necessary. Don't bother converting the stmt here. */
-          if (optab_handler (op, TYPE_MODE (type)) == CODE_FOR_nothing
-              && optab_handler (opv, TYPE_MODE (type)) != CODE_FOR_nothing)
+             if necessary. Pick one with wider vector type. */
+          tree compute_vtype = get_compute_type (code, opv, type);
+          if (count_type_subparts (compute_vtype)
+              > count_type_subparts (compute_type))
+            {
+              compute_type = compute_vtype;
+              op = opv;
+            }
+        }
+
+      if (code == LROTATE_EXPR || code == RROTATE_EXPR)
+        {
+          if (compute_type == NULL_TREE)
+            compute_type = get_compute_type (code, op, type);
+          if (compute_type == type)
             return;
+          /* Before splitting vector rotates into scalar rotates,
+             see if we can't use vector shifts and BIT_IOR_EXPR
+             instead. For vector by vector rotates we'd also
+             need to check BIT_AND_EXPR and NEGATE_EXPR, punt there
+             for now, fold doesn't seem to create such rotates anyway. */
+          if (compute_type == TREE_TYPE (type)
+              && !VECTOR_INTEGER_TYPE_P (TREE_TYPE (rhs2)))
+            {
+              optab oplv, opl, oprv, opr, opo;
+              oplv = optab_for_tree_code (LSHIFT_EXPR, type, optab_vector);
+              /* Right shift always has to be logical, no matter what
+                 signedness type has. */
+              oprv = vlshr_optab;
+              opo = optab_for_tree_code (BIT_IOR_EXPR, type, optab_default);
+              opl = optab_for_tree_code (LSHIFT_EXPR, type, optab_scalar);
+              oprv = lshr_optab;
+              opr = optab_for_tree_code (RSHIFT_EXPR, type, optab_scalar);
Looks like there are some typos in there: you are assigning to oprv twice.

-- 
Marc Glisse
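
For reference, the lowering described in the patch comment (a rotate by a scalar count expressed as two shifts and a BIT_IOR_EXPR, with the right shift always logical) can be sketched in plain C. This is only an illustration of the arithmetic, not code from the patch: the helper name rotl_lanes, the fixed 32-bit lane width, and the use of an ordinary array in place of a vector type are all assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of GCC: rotate each 32-bit lane of V left
   by the same scalar COUNT using only a left shift, a right shift and a
   bitwise OR.  The lanes are unsigned, so the right shift is logical,
   which is the property the patch comment insists on.  */
static void
rotl_lanes (uint32_t *v, size_t n, unsigned count)
{
  count &= 31;          /* Reduce the count modulo the lane width.  */
  if (count == 0)
    return;             /* Avoid the undefined shift by 32 below.  */

  for (size_t i = 0; i < n; i++)
    v[i] = (v[i] << count) | (v[i] >> (32 - count));
}
```

With a signed lane type the right shift could be arithmetic and would smear the sign bit into the result, which is why the logical shift optabs are the ones to look up regardless of the signedness of the type.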