On Fri, Apr 4, 2014 at 7:49 AM, Thomas Preud'homme <thomas.preudho...@arm.com> wrote:
>> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org]
>> On Behalf Of Rainer Orth
>>
>> Just omit the { target *-*-* } completely, also a few more times.
>
> Please find attached an updated patch.
@@ -1733,6 +1743,51 @@ find_bswap_1 (gimple stmt, struct symbolic_number *n, int limit)
      to initialize the symbolic number.  */
   if (!source_expr1)
     {
+      n->base_addr = n->offset = NULL_TREE;
+      if (is_gimple_assign (rhs1_stmt))

You want

  gimple_assign_load_p (rhs1_stmt) && !gimple_has_volatile_ops (rhs1_stmt)

here.

+	    case ARRAY_RANGE_REF:

For ARRAY_RANGE_REF this doesn't make much sense IMHO.

+	    case ARRAY_REF:
+	      n->base_addr = TREE_OPERAND (var, 0);
+	      elt_size = array_ref_element_size (var);
+	      if (TREE_CODE (elt_size) != INTEGER_CST)
+		return NULL_TREE;
+	      index = TREE_OPERAND (var, 1);
+	      if (TREE_THIS_VOLATILE (var) || TREE_THIS_VOLATILE (index))
+		return NULL_TREE;
+	      n->offset = fold_build2 (MULT_EXPR, sizetype,
+				       index, elt_size);

You fail to honor array_ref_low_bound.  By handling only the outermost
handled component, and then only a selected subset of codes, you'll catch
many but not all cases.  Why not simply use get_inner_reference () here
(plus stripping the constant offset from an innermost MEM_REF) and get the
best of both worlds (not duplicating parts of its logic, and handling more
cases)?  Eventually using tree-affine.c and get_inner_reference_aff is
even more appropriate, so you can compute the address differences without
decomposing them yourself.

Btw, I think for the recursion to work properly you need to handle loads
for the toplevel stmt, not for rhs1_stmt.  Eventually you need to split
out a find_bswap_2 that handles the recursion case that allows loads,
separate from the case that doesn't, called from find_bswap.

+  /* Compute address to load from and cast according to the size
+     of the load.  */
+  load_ptr_type = build_pointer_type (load_type);
+  addr_expr = build1 (ADDR_EXPR, load_ptr_type, bswap_src);
+  addr_tmp = make_temp_ssa_name (load_ptr_type, NULL, "load_src");
+  addr_stmt = gimple_build_assign_with_ops
+	      (NOP_EXPR, addr_tmp, addr_expr, NULL);
+  gsi_insert_before (&gsi, addr_stmt, GSI_SAME_STMT);
+
+  /* Perform the load.  */
+  load_offset_ptr = build_int_cst (load_ptr_type, 0);
+  val_tmp = make_temp_ssa_name (load_type, NULL, "load_dst");
+  val_expr = build2 (MEM_REF, load_type, addr_tmp, load_offset_ptr);
+  load_stmt = gimple_build_assign_with_ops
+	      (MEM_REF, val_tmp, val_expr, NULL);

This is unnecessarily complex and has TBAA issues.  You don't need to
create a "correct" pointer type, so doing

  addr_expr = fold_build_addr_expr (bswap_src);

is enough.  Now, to fix the TBAA issues, you either need to remember and
"combine" the reference_alias_ptr_type of each original load and use that
for the load_offset_ptr value, or decide that isn't worth it and use
alias-set zero (use ptr_type_node).

Can you also expand the comment about "size vs. range"?  Is it that range
can be bigger than size if you have (short)a[0] | ((short)a[3] << 1) so
far, where size == 2 but range == 3?  And that range can also be smaller
than size, for example for (short)a[0] | ((short)a[0] << 1), where range
would be 1 and size == 2?  I suppose adding two examples like this to the
comment, together with the expected value of 'n', would help here.

Otherwise the patch looks good.  Now we're only missing the addition of
trying to match to a VEC_PERM_EXPR with a constant mask using
can_vec_perm_p ;)

Thanks,
Richard.

> Best regards,
>
> Thomas