Giuliano, I asked you for some documentation related to the RTL passes. I'm not sure whether you are only hitting bottlenecks in the all_rtl_passes or ipa_passes functions, but it seems that the SSA trees and the optimization passes in cfgloop.c and cfgloop.h would still be an issue. Particularly after the final GIMPLE and IPA passes, it would be great to multi-thread and be able to walk the dominator trees in parallel.
I'm not sure if you've looked at what seem to be issues here, and I was wondering whether they still show up in your profiling. GENERIC to GIMPLE may also be an issue in gimplify, but less so than those. Again, I understand if it's out of scope, but it would be great if you have a current profile graph that I can see. It would give me an idea of where to start working outside of the core GIMPLE optimization passes you're working on.

Huge thanks and again good luck,
Nick

P.S. Don't worry if you don't have one; it would just be nice to have. Rewriting memory allocation to be multi-threaded is not easy, and even more so with the shared state between compiler passes.