On Thu, Apr 24, 2014 at 3:34 AM, Thomas Preud'homme
<thomas.preudho...@arm.com> wrote:
> See the updated part 2 of the patch in attachment. Part 1 is unchanged. The
> new ChangeLogs are as follows:
>
> *** gcc/ChangeLog ***
>
> 2014-04-23  Thomas Preud'homme  <thomas.preudho...@arm.com>
>
>         PR tree-optimization/54733
>         * tree-ssa-math-opts.c (find_bswap_load): New.
>         (find_bswap_1): Add support for memory source.
>         (find_bswap): Likewise.
>         (execute_optimize_bswap): Likewise.
>         * expr.c (get_inner_reference): Renamed to ...
>         (get_inner_reference_1): This. Also add a parameter to control whether
>         a MEM_REF should be split into base + offset.
>         (get_inner_reference): New wrapper around get_inner_reference_1 for
>         compatibility.
>         (get_inner_reference_base): New wrapper around get_inner_reference_1
>         to access its new facet.
>         * tree.h (get_inner_reference_base): Declare.
>
>
> *** gcc/testsuite/ChangeLog ***
>
> 2014-04-23  Thomas Preud'homme  <thomas.preudho...@arm.com>
>
>         PR tree-optimization/54733
>         * gcc.dg/optimize-bswapdi-3.c: New test to check extension of bswap
>         optimization to support memory sources.
>         * gcc.dg/optimize-bswaphi-1.c: Likewise.
>         * gcc.dg/optimize-bswapsi-2.c: Likewise.
>         * gcc.c-torture/execute/bswap-2.c: Likewise.
>
> Bootstrapped on x86_64-linux-gnu with no testsuite regression. Also did an
> arm-none-eabi cross build with no regression after running the testsuite via qemu.

+/* Given an expression EXP that is a handled_component_p,
+   look for the base of the ultimate containing object, which is returned and
+   specify the access position and size.  */
+
+tree
+get_inner_reference_base (tree exp, HOST_WIDE_INT *pbitsize,
+                         HOST_WIDE_INT *pbitpos, tree *poffset,
+                         enum machine_mode *pmode, int *punsignedp,
+                         int *pvolatilep, bool keep_aligning)
+{

this name isn't exactly spot-on and its behavior doesn't appear to be
consistent for TARGET_MEM_REF (which get_inner_reference_1 simply
returns unchanged).  It returns the base object or address of the
ultimate containing object (that TARGET_MEM_REF isn't handled should
simply be documented), so instead of inventing a new name I'd add an
overload of get_inner_reference with the extra flag.

I'd name the flag 'allow_address', not 'ptr_base' (as you don't return
a pointer to a decl but the decl itself if the base is a decl).
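
In other words, something along these lines in tree.h (just a sketch;
the parameter list mirrors the existing get_inner_reference declaration
and the flag name is only a suggestion):

  /* Overload of get_inner_reference: when ALLOW_ADDRESS is true a MEM_REF
     is not split into base + offset but its base object or address is
     returned as-is (TARGET_MEM_REF remains unhandled).  */
  extern tree get_inner_reference (tree, HOST_WIDE_INT *, HOST_WIDE_INT *,
                                   tree *, enum machine_mode *, int *, int *,
                                   bool keep_aligning, bool allow_address);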

+  bitpos /= BITS_PER_UNIT;
+  bitpos_expr = build_int_cst (sizetype, bitpos);

bitpos_expr = size_int (bitpos);

and it's really bytepos_expr now, no?

+  if (n->offset)
+    {
+      bitpos_expr = build_int_cst (sizetype, bitpos);
+      n->offset = fold_build2 (PLUS_EXPR, sizetype, n->offset, bitpos_expr);
+    }
+  else
+    n->offset = build_int_cst (sizetype, bitpos);

n->offset = bitpos_expr;

Btw, I wonder why you glob bitpos and offset together again here.
I'd simply record and compare n->bitpos.
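
Something like (the bitpos field and the variable names are only
illustrative, based on the quoted hunk):

  /* Keep the constant part of the access position separate from the
     variable offset instead of folding it into a tree.  */
  n->bitpos = bitpos;   /* constant part of the position */
  n->offset = offset;   /* variable part only, may stay NULL_TREE */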

Which means

+             off_cmp_expr = fold_build2 (LT_EXPR, TREE_TYPE (n1.offset),
+                                         n2.offset, n1.offset);
+             if (TREE_CODE (off_cmp_expr) != INTEGER_CST)
+               return NULL_TREE;

you'd simply compare n1.offset and n2.offset the same way you compare
the bases (with operand_equal_p ()) and reject any difference; the
constant offset part (tracked in bitpos) can then be handled with
plain integer math instead of trees.
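
That comparison could then look roughly like this (field names are
assumptions, following the suggestion above):

  /* Bases and variable offsets must match structurally.  */
  if (!operand_equal_p (n1.base, n2.base, 0)
      || (n1.offset != NULL_TREE) != (n2.offset != NULL_TREE)
      || (n1.offset && !operand_equal_p (n1.offset, n2.offset, 0)))
    return NULL_TREE;
  /* The constant part of the position (bitpos) is now plain integer
     math, no tree comparisons needed.  */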

+             /*  Compute address to load from and cast according to the size
+                 of the load.  */
+             addr_expr = build_fold_addr_expr (unshare_expr (bswap_src));
+             addr_tmp = make_temp_ssa_name (TREE_TYPE (addr_expr), NULL,
+                                            "load_src");
+             addr_stmt = gimple_build_assign (addr_tmp, addr_expr);
+             gsi_insert_before (&gsi, addr_stmt, GSI_SAME_STMT);
+
+             /* Perform the load.  */
+             load_offset_ptr = build_int_cst (alias_type, 0);
+             val_tmp = make_temp_ssa_name (load_type, NULL, "load_dst");
+             val_expr = build2 (MEM_REF, load_type, addr_tmp, load_offset_ptr);
+             load_stmt = gimple_build_assign_with_ops
+                        (MEM_REF, val_tmp, val_expr, NULL);

To avoid spurious TREE_ADDRESSABLE bases you should skip the load_src
temporary when possible:

  if (is_gimple_min_invariant (addr_expr))
    addr_tmp = addr_expr;
  else
    {
      addr_tmp = make_temp ....

and then use fold_build2 to build the MEM_REF.  Also use
gimple_build_assign for the load, not the with_ops variant.
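
Putting those together, the sequence could look roughly like this
(purely illustrative, reusing the names from your hunk):

  addr_expr = build_fold_addr_expr (unshare_expr (bswap_src));
  if (is_gimple_min_invariant (addr_expr))
    addr_tmp = addr_expr;
  else
    {
      addr_tmp = make_temp_ssa_name (TREE_TYPE (addr_expr), NULL, "load_src");
      addr_stmt = gimple_build_assign (addr_tmp, addr_expr);
      gsi_insert_before (&gsi, addr_stmt, GSI_SAME_STMT);
    }

  /* Perform the load, building the MEM_REF with fold_build2.  */
  val_expr = fold_build2 (MEM_REF, load_type, addr_tmp,
                          build_int_cst (alias_type, 0));
  val_tmp = make_temp_ssa_name (load_type, NULL, "load_dst");
  load_stmt = gimple_build_assign (val_tmp, val_expr);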

The rest of the patch looks good to me now, sorry for the delay
in reviewing.

Thanks,
Richard.


> Best regards,
>
> Thomas
