On 11/22/2013, 8:04 PM, dxq wrote:
By fixing SMS, do you mean that we only modify the SMS pass?
You don't need loop unrolling when you have good software pipelining
and loop vectorization. Good software pipelining can see through any
number of iterations and, unlike unrolling, has no problems with code
cache locality. A good SFP is a perfect SFP (e.g. resource-constrained
software pipelining), which can deal with any loop body.
Unfortunately, modulo scheduling deals with single-BB loop bodies only
(still, it covers most cases after if-conversion, it is easy, and it can
be good for supporting hardware with a rotating register file). Even if
the current SMS
implementation pitfalls are fixed, there are still cases when unrolling
could be beneficial. So probably your approach is the best that can be
done for now. Also, if you manage to implement the infrastructure with
copying/backup, it could be useful for a perfect SFP implementation.
Still, I think we need a long-term strategy for SFP in GCC.
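
To make the "unrolling could still be beneficial" point concrete, here is
a minimal sketch of one well-known case: the resource-constrained lower
bound on the initiation interval (ResMII) is rounded up to a whole cycle,
so unrolling can recover the lost fraction. This is not taken from the
SMS pass; the machine description (`units`), the loop body (`ops`) and
both function names are assumptions made up for illustration.

```python
from math import ceil

def res_mii(ops_per_iter, units, unroll=1):
    # Lower bound on the initiation interval of the (unrolled) loop body:
    # every resource class needs ceil(uses / available units) cycles.
    return max(ceil(ops_per_iter[r] * unroll / units[r]) for r in ops_per_iter)

def cycles_per_original_iteration(ops_per_iter, units, unroll=1):
    # Normalise the bound back to one iteration of the original loop.
    return res_mii(ops_per_iter, units, unroll) / unroll

# Hypothetical loop body and machine: 3 ALU ops and 1 memory op per
# iteration, on a target with 2 ALUs and 1 memory port.
ops = {"alu": 3, "mem": 1}
units = {"alu": 2, "mem": 1}

for factor in (1, 2, 4):
    print(factor, cycles_per_original_iteration(ops, units, factor))
```

Without unrolling the bound is 2 cycles per iteration, because 3 ALU ops
on 2 ALUs round up to 2 cycles. With an unroll factor of 2 the body needs
3 cycles for 2 iterations, i.e. 1.5 cycles per original iteration.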
If so, the problems we have to solve are:
* How do we make unroll and sms work together? We could call the unroll
pass from within sms, but then more passes, such as web, would be needed,
and ideally all the passes between unroll and sms would be rerun.
* The unroll and web passes exist in gcc; however, gcc's passes work only
on a whole compilation unit or function, rather than on the smaller unit
we want, the loop. That's why we copy all the global information and rerun
the passes between unroll and sms several times.
* If we need to try more unroll factors, copying is also needed. The backup
exists only within a single pass, so it would not be purged by GGC. But if
memory consumption is huge, is there any risk for the other passes? In my
experience, when GGC is disabled between unroll and sms, with
ggc_min_expand = 100 and ggc_min_heap = 20480, gcc crashes when compiling
a big file.
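
The copy/try/restore scheme in the points above can be sketched roughly
as follows. This is only a toy model, not GCC code: `pick_unroll_factor`,
the dictionary used as "IR", and the `toy_*` stand-ins for the unroll
pass, the intermediate passes (e.g. web) and sms are all hypothetical
names invented for the illustration.

```python
import copy
from math import ceil

def pick_unroll_factor(loop_ir, factors, unroll, rerun_passes, sms):
    """Try each factor on a private copy; return (factor, ir, cost) or None."""
    best = None
    for factor in factors:
        trial = copy.deepcopy(loop_ir)   # backup: the original is never mutated
        unroll(trial, factor)
        rerun_passes(trial)              # passes that sit between unroll and sms
        ii = sms(trial)                  # achieved II, or None if sms gave up
        if ii is None:
            continue
        cost = ii / factor               # cycles per original iteration
        if best is None or cost < best[2]:
            best = (factor, trial, cost)
    return best

# Toy stand-ins (assumptions, not real passes).
UNITS = {"alu": 2, "mem": 1}

def toy_unroll(ir, factor):
    ir["ops"] = {r: n * factor for r, n in ir["ops"].items()}

def toy_rerun_passes(ir):
    pass  # placeholder for web and friends

def toy_sms(ir):
    return max(ceil(n / UNITS[r]) for r, n in ir["ops"].items())

loop = {"ops": {"alu": 3, "mem": 1}}
best = pick_unroll_factor(loop, [1, 2, 4], toy_unroll, toy_rerun_passes, toy_sms)
print(best[0], best[2])
```

In this sketch only the best trial so far and the trial in progress are
alive at any time, so peak memory is about two extra copies rather than
one per factor. Whether the same bound can be kept with real pass-global
data is exactly the open question above.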
That's what I can think of. You know, it's a very big and hard job. Do
you have any suggestions about our current solution?