On 07/08/2013 02:52 PM, Marc Glisse wrote:
I wonder why you implemented this as a separate pass instead of adding
it to tree-ssa-forwprop. demote_cast is (or should be) a special case of
combine_conversions, so it would be nice to avoid the code duplication
(there is also a version in fold-const.c). demote_into could be called
from roughly the same place as simplify_conversion_from_bitmask. And you
could reuse get_prop_source_stmt, can_propagate_from,
remove_prop_source_from_use, etc.
That's a real good question; I find myself looking a lot at the bits in
forwprop and I'm getting worried it's on its way to being an
unmaintainable mess. Sadly, I'm making things worse rather than better
with my recent changes. I'm still hoping more structure will become
evident as I continue to work through various improvements.
I find myself also pondering these bits in a code motion model; what
hasn't become clear yet is the driving motivation to show why thinking
about this as a code motion problem is interesting.
Conceptually we can hoist casts to their earliest possible point and
sink them to their latest possible point. What are the benefits of
those transformations, and is there anything inherently good about
actually moving the typecasts, as opposed to just realizing the casts
are in the IL and optimizing appropriately?
i.e., often I see the hoisting/sinking code bring a series of casts
together into straightline code which then gets optimized. *BUT* is
there anything inherently better/easier about having them in
straightline code? We can walk the use->def chains and recover the same
information. If that's not happening, then that points to a failing in
our optimizers.
Floating out there is the hope that there's a set of canonicalization
rules to guide us where to place the typecasts. i.e., is it generally
better to have
(T) a OP (T) b
Or is it better to have
(T) (a OP b)
[ Assuming they're semantically the same. ]
Is it dependent on T and how T relates to the underlying target? Are
the guidelines likely the same for logicals vs arithmetic, etc?
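
As a rough illustration of the "assuming they're semantically the same"
caveat, here is a small C sketch with T taken to be uint8_t. The
function names are made up for this example and are not from the patch.
For unsigned addition the two placements agree, because truncation
commutes with addition modulo 2^8; for a right shift they do not,
because the shift pulls in bits that an early cast has already thrown
away.

```c
#include <stdint.h>

/* (T) a OP (T) b with OP = +: cast the operands first.  The narrow
   operands are promoted to int for the addition, so the sum is
   truncated once more on the way out.  */
uint8_t add_cast_first(uint32_t a, uint32_t b)
{
    return (uint8_t)((uint8_t)a + (uint8_t)b);
}

/* (T) (a OP b) with OP = +: add in the wide type, cast the result.  */
uint8_t add_cast_last(uint32_t a, uint32_t b)
{
    return (uint8_t)(a + b);
}

/* Same two placements for a right shift by 4.  */
uint8_t shr_cast_first(uint32_t a)
{
    return (uint8_t)((uint8_t)a >> 4);
}

uint8_t shr_cast_last(uint32_t a)
{
    return (uint8_t)(a >> 4);
}
```

With a = 200 and b = 100 both addition forms return 44 (300 mod 256).
With a = 0x100 the shift forms differ: casting last yields 0x10, while
casting first yields 0.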
If I understand, the main reason is because you want to go through the
statements in reverse order, since this is the way the casts are being
propagated (would forwprop also work, just more slowly, or would it miss
opportunities across basic blocks?).
SSA_NAME_DEF_STMT can cross block boundaries.
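
A toy model of that point, in C. The struct and function below are
hypothetical stand-ins and are NOT GCC's data structures or API; the
only thing they are meant to show is that following a use->def link
never consults the block a statement lives in, so a chain of casts
spread over several blocks is just as visible as one in straightline
code.

```c
#include <stddef.h>

/* Toy statement: NOT a GCC type, just enough to model a use->def link.  */
struct toy_stmt {
    int block;                          /* basic block index, for illustration */
    int is_cast;                        /* nonzero if this is a conversion */
    const struct toy_stmt *operand_def; /* statement defining the operand, or NULL */
};

/* Count the consecutive casts feeding USE by following use->def links.
   The walk never looks at `block`, so block boundaries do not stop it.  */
int count_cast_chain(const struct toy_stmt *use)
{
    int n = 0;
    for (const struct toy_stmt *s = use; s != NULL && s->is_cast;
         s = s->operand_def)
        n++;
    return n;
}
```

For a load in block 0, a cast of it in block 1, and a cast of that cast
in block 2, count_cast_chain on the last statement finds both casts.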
I have some trouble understanding why something as complicated as
build_and_add_sum (which isn't restricted to additions by the way) is
needed. Could you add a comment to the code explaining why you need to
insert the statements precisely there and not for instance just before
their use? Is that to try and help CSE?
Hmm, I thought I had suggested that routine get renamed during an
internal, informal review of Kai's work.
I have added an additional early pass "typedemote1" to this patch for
simple cases where types can easily be sunk into a statement without a
special unsigned cast for the overflow case. Jakub asked for it. Tests
have shown that this pass optimizes pretty few cases. For an example in
the testsuite, see the pr46867.c testcase.
The second pass (I put it after the first vrp pass to avoid
testcase conflicts) uses 'unsigned'-type casts to avoid undefined
overflow behavior. This pass has many more hits in standard code,
I assume that, when the pass is done, we can consider removing the
corresponding code from the front-ends? That would increase the hits ;-)
Kai and I have briefly touched on this, mostly in the context of
removing bits from fold-const.c rather than the front-ends proper.
jeff