On 02/14/2016 11:38 AM, Richard Biener wrote:
On February 14, 2016 5:35:13 PM GMT+01:00, Jeff Law <l...@redhat.com>
wrote:
On 02/12/2016 10:21 AM, Jeff Law wrote:
But really we simply need a better DSE algorithm.
So the way to fix DSE is to keep the existing algorithm and track
the hunks of Complex/aggregates that have been set a second
time.

My first thought was to implement this as an inverted bitmap.
ie,
set
it to 1 for every byte in the complex/aggregate that is set by
the
first
store.  Clear bits for each byte in subsequent stores to the
pieces.
If
the bitmap reaches an empty state, then the initial store is
dead.

Adjusting *ref could work too, but would probably be painful if
the subsequent stores don't happen in a convenient order.
So that was scary easy.  We should have done this a long time ago.

Essentially I call ao_get_ref_base to get the offset/size/max_size
filled in for the first statement.  Those are used to initialize
the live bytes bitfield, as long as max_size != -1.

Then when we have a possible killing statement, we use the ao in a
similar manner to determine which bytes to clear (taking care that
the base is the same between the two references and that in the
killing statement that the size/max_size are the same.).

When all the live bytes are zero then we've killed the original
statement.

It's ~20 lines of code.

I need to pull together some additional tests, but it looks likely
we'll be able to wrap this up easily for gcc-6.

BTW, we had sth like this before but it had both correctness and more
importantly scalability issues.
Really? Do you have a pointer to any kind of discussion? I certainly don't want to re-introduce something that was ultimately abandoned due to unfixable problems.

I don't offhand see any scalability issues -- unless ao_ref_init/ao_ref_base don't scale well. This change doesn't cause any additional looping in dse_possible_dead_store_p (it can only cause us to exit the loop earlier) and the bitmap stuff is cheap because we're going to hit the cache as we clear consecutive ranges of bits.

jeff

Reply via email to