Hi Richard,
On 18/07/16 13:22, Richard Biener wrote:
On Fri, Jul 15, 2016 at 5:13 PM, Kyrill Tkachov
<kyrylo.tkac...@foss.arm.com> wrote:
<snip>
+ /* Record the original statements so that we can keep track of
+ statements emitted in this pass and not re-process new
+ statements. */
+ for (gsi = gsi_after_labels (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+ {
+ gimple *stmt = gsi_stmt (gsi);
+ if (!is_gimple_debug (stmt))
+ orig_stmts.add (stmt);
+ num_statements++;
+ }
please use gimple_set_visited () instead, that should be cheaper.
+ do
+ {
+ changes_made = false;
+ for (gsi = gsi_after_labels (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+ {
...
+ }
+ while (changes_made);
looks pretty quadratic to me. Instead of tracking things with m_curr_base_expr
why not use a hash-map to track stores related to a base?
+ /* Don't handle bitfields that are not a multiple
+ of BITS_PER_UNIT for now. Can be extended
+ later. */
+ && (bitsize % BITS_PER_UNIT == 0)
:(
+ && !stmt_interferes_with_mem_accesses_p (lhs))
given this function loops over all accesses this is quadratic as well.
I think alias queries could be simplified if you reduce them to alias
checks on the base object (and allow overlapping constant stores
which should be easy to handle during merging).
I've implemented that and it simplified the detecting code as well as its
complexity
but I'm now missing some cases that were being caught before.
An example is:
struct bar {
int a;
char b;
char c;
char d;
char e;
char g;
};
void
foo1 (struct bar *p, char tmp)
{
p->a = 0;
p->b = 0;
p->g = tmp;
p->c = 0;
p->d = 0;
p->e = 0;
}
The store to 'g' doesn't interfere with the contiguous stores to the early
fields but because
we perform the aliasing checks on the base object 'p' rather than the full LHS
of the assignments
this is deemed to alias the constant stores and only the first two and the last
three constant stores
are merged instead of the full 5 stores.
I'll experiment with some solutions for this involving recording the
non-constant stores and performing
some trickery during the merging phase.
Thanks,
Kyrill
The VUSE/VDEF handling is somewhat odd. All stores have both
a VDEF and a VUSE, if you merge a set of them you can simply
re-use the VDEF/VUSE of one of them, effectively replacing the
stmt. For all other stores you remove you miss a
unlink_stmt_vdef (stmt);
release_defs (stmt);
to update virtual SSA form and properly release SSA names.
As you use TBAA in your alias checks you may only _sink_
stores. Not sure if your merged store placement ensures this.
I think this kind of transforms is useful in early optimizations, not only
very late as you perform it. Of course it should be really cheap there.
Note that options like -ftree-store-widening are discouraged
("tree" does mean nothing to our users and store widening isn't
what is done - it's store merging). Simply name it -fstore-merging.
Also the file name should be gimple-ssa-store-merging.c
Looking forward to an algorithmically enhanced version.
Richard.
Thanks,
Kyrill
N.B. I'm going on vacation until August so I won't be able to respond to any
feedback until I get back.
[1] https://gcc.gnu.org/ml/gcc-patches/2009-09/msg01745.html
2016-07-15 Kyrylo Tkachov <kyrylo.tkac...@arm.com>
PR middle-end/22141
* Makefile.in (OBJS): Add tree-ssa-store-widening.o.
* common.opt (ftree-store-widening): New Optimization option.
* opts.c (default_options_table): Add entry for
OPT_ftree_store_widening.
* params.def (PARAM_STORE_WIDENING_ALLOW_UNALIGNED): Define.
* passes.def: Insert pass_tree_store_widening.
* tree-pass.h (make_pass_tree_store_widening): Declare extern
prototype.
* tree-ssa-store-widening.c: New file.
* doc/invoke.texi (Optimization Options): Document
-ftree-store-widening.
2016-07-15 Kyrylo Tkachov <kyrylo.tkac...@arm.com>
Jakub Jelinek <ja...@redhat.com>
PR middle-end/22141
* gcc.c-torture/execute/pr22141-1.c: New test.
* gcc.c-torture/execute/pr22141-2.c: Likewise.
* gcc.target/aarch64/ldp_stp_1.c: Adjust for -ftree-store-widening.
* gcc.dg/store_widening_1.c: New test.
* gcc.dg/store_widening_2.c: Likewise.
* gcc.dg/store_widening_3.c: Likewise.
* gcc.dg/store_widening_4.c: Likewise.
* gcc.target/i386/pr22141.c: Likewise.