On 07/09/16 20:03, Bernhard Reutner-Fischer wrote:
On September 6, 2016 5:14:47 PM GMT+02:00, Kyrill Tkachov 
<kyrylo.tkac...@foss.arm.com> wrote:
Hi all,
s/contigous/contiguous/
s/ where where/ where/

+struct merged_store_group
+{
+  HOST_WIDE_INT start;
+  HOST_WIDE_INT width;
+  unsigned char *val;
+  unsigned int align;
+  auto_vec<struct store_immediate_info *> stores;
+  /* We record the first and last original statements in the sequence because
+     because we'll need their vuse/vdef and replacement position.  */
+  gimple *last_stmt;

s/ because because/ because/

Why aren't these two HWIs unsigned, likewise in store_immediate_info and in 
most other spots in the patch?

+         fprintf (dump_file, "Afer writing ");
s/Afer /After/

/access if prohibitively slow/s/ if /is /

I'd get rid of successful_p in imm_store_chain_info::output_merged_stores.

Thanks, fixed all the above in my tree (will be retesting).


+unsigned int
+pass_store_merging::execute (function *fun)
+{
+  basic_block bb;
+  hash_set<gimple *> orig_stmts;
+
+  FOR_EACH_BB_FN (bb, fun)
+    {
+      gimple_stmt_iterator gsi;
+      HOST_WIDE_INT num_statements = 0;
+      /* Record the original statements so that we can keep track of
+        statements emitted in this pass and not re-process new
+        statements.  */
+      for (gsi = gsi_after_labels (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+       {
+         gimple_set_visited (gsi_stmt (gsi), false);
+         num_statements++;
+       }
+
+      if (num_statements < 2)
+       continue;

What about debug statements? ISTM you should skip those.
(Isn't visited reset before entry of a pass?)

Yes, I'll skip debug statements. Comments in gimple.h say that the visited
property is undefined at pass boundaries, so I'd rather initialize it here.


Maybe I missed the bikeshedding about the name but I'd have used -fmerge-stores 
instead.

Wouldn't be hard to change. I can change it any point if there's a consensus.

Thanks for the feedback.
Kyrill

Thanks,
The v3 of this patch addresses feedback I received on the version
posted at [1].
The merged store buffer is now represented as a char array that we
splat values onto with
native_encode_expr and native_interpret_expr. This allows us to merge
anything that native_encode_expr
accepts, including floating point values and short vectors. So this
version extends the functionality
of the previous one in that it handles floating point values as well.

The first phase of the algorithm that detects the contiguous stores is
also slightly refactored according
to feedback to read more fluently.

Richi, I experimented with merging up to MOVE_MAX bytes rather than
word size but I got worse results on aarch64.
MOVE_MAX there is 16 (because it has load/store register pair
instructions) but the 128-bit immediates that we ended
synthesising were too complex. Perhaps the TImode immediate store RTL
expansions could be improved, but for now
I've left the maximum merge size to be BITS_PER_WORD.

I've disabled the pass for PDP-endian targets as the merging code
proved to be quite fiddly to get right for different
endiannesses and I didn't feel comfortable writing logic for
BYTES_BIG_ENDIAN != WORDS_BIG_ENDIAN targets without serious
testing capabilities. I hope that's ok (I note the bswap pass also
doesn't try to do anything on such targets).

Tested on arm, aarch64, x86_64 and on big-endian arm and aarch64.

How does this version look?
Thanks,
Kyrill

[1] https://gcc.gnu.org/ml/gcc-patches/2016-08/msg01512.html

2016-09-06  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>

     PR middle-end/22141
     * Makefile.in (OBJS): Add gimple-ssa-store-merging.o.
     * common.opt (fstore-merging): New Optimization option.
     * opts.c (default_options_table): Add entry for
     OPT_ftree_store_merging.
     * params.def (PARAM_STORE_MERGING_ALLOW_UNALIGNED): Define.
     * passes.def: Insert pass_tree_store_merging.
     * tree-pass.h (make_pass_store_merging): Declare extern
     prototype.
     * gimple-ssa-store-merging.c: New file.
     * doc/invoke.texi (Optimization Options): Document
     -fstore-merging.

2016-09-06  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>
             Jakub Jelinek  <ja...@redhat.com>

     PR middle-end/22141
     * gcc.c-torture/execute/pr22141-1.c: New test.
     * gcc.c-torture/execute/pr22141-2.c: Likewise.
     * gcc.target/aarch64/ldp_stp_1.c: Adjust for -fstore-merging.
     * gcc.target/aarch64/ldp_stp_4.c: Likewise.
     * gcc.dg/store_merging_1.c: New test.
     * gcc.dg/store_merging_2.c: Likewise.
     * gcc.dg/store_merging_3.c: Likewise.
     * gcc.dg/store_merging_4.c: Likewise.
     * gcc.dg/store_merging_5.c: Likewise.
     * gcc.dg/store_merging_6.c: Likewise.
     * gcc.target/i386/pr22141.c: Likewise.
     * gcc.target/i386/pr34012.c: Add -fno-store-merging to dg-options.


Reply via email to