On 11/10/2015 11:33 PM, Nathan Sidwell wrote:
I've committed this patch to trunk.  It implements a partitioning
optimization for a loop partitioned over both vector and worker axes.
We can elide the inner vector partitioning state propagation, if there
are no intervening instructions in the worker-partitioned outer loop
other than the forking and joining.  We simply execute the worker
propagation on all vectors.
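
For illustration only (this is not taken from the patch): a minimal sketch of the kind of source loop the description above refers to, i.e. a single OpenACC loop partitioned over both the worker and vector axes. The function name saxpy and its parameters are hypothetical, and whether the optimization actually fires for this exact loop has not been checked. Without -fopenacc the pragma is ignored and the loop simply runs serially.

```c
#include <stddef.h>

/* Hypothetical kernel, not from the patch: y[i] += a * x[i] for i in [0, n). */
void saxpy(size_t n, float a, const float *x, float *y)
{
  /* One loop partitioned over both the worker and vector axes.  The loop
     body is the only code inside the partitioned region, so nothing sits
     between the worker-level and vector-level fork/join points.  */
#pragma acc parallel loop worker vector copyin(x[0:n]) copy(y[0:n])
  for (size_t i = 0; i < n; i++)
    y[i] += a * x[i];
}
```

The result is the same with or without offloading; only the generated accelerator code would differ.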

Patch LGTM, although I wonder if you really need the extra option rather than just testing optimize.

I've been unable to introduce a testcase for this. The difficulty is we
want to check an rtl dump from the acceleration compiler, and there
doesn't appear to be existing machinery for that in the testsuite.
Perhaps something to be added later?

What's the difficulty exactly? Getting a dump should be possible with -foffload=-fdump-whatever. Does the testsuite have a problem finding the right filename?


Bernd
