On 9/24/25 1:08 PM, Julian Waters wrote:
Recently I've been picking up and resuming work on making LTO viable for HotSpot. [...] The bigger issue is related to the flatten attribute on the gcc compiler. In short, some G1 code (more specifically void G1ParScanThreadState::trim_queue_to_threshold(uint threshold), void G1ParScanThreadState::steal_and_trim_queue(G1ScannerTasksQueueSet* task_queues) and oop G1ParScanThreadState::copy_to_survivor_space(G1HeapRegionAttr region_attr, oop old, markWord old_mark)) is marked as flatten.
The flatten attribute is being used there because the code in question has been found to be very performance-sensitive to the compiler occasionally deciding not to inline something. Hence the big hammer. (Note that some critical path stuff is conditionally noinline in debug builds, because otherwise such builds were also running into excessive inlining, leading to things like "conditional branch out of range" failures.)

Using flatten certainly leads to more inlining than is actually needed. An alternative to flatten would be to try to mark all the critical path stuff always_inline. I found that pretty hard to do, and also brittle, much like what Julian is encountering with the flatten + noinline approach. But it might be that some judicious combination of always_inline + noinline + either (but not both) flatten or LTO would work. That was the direction I was going to explore, but I haven't found any spare 'tuits to allocate in that direction. Supporting LTO just hasn't percolated up the priority list here.
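To make the trade-off concrete, here is a minimal sketch of the two approaches on a toy loop, assuming GCC or Clang. Every name in it (the SKETCH_* macros, mix, slow_path, step, drain_flattened, drain_selective) is hypothetical and made up for illustration; none of it is HotSpot code, and the debug/release switch on NDEBUG is only an assumed stand-in for whatever a real build uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical macros for this sketch only (not HotSpot's definitions).
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_FLATTEN       __attribute__((flatten))
#define SKETCH_ALWAYS_INLINE inline __attribute__((always_inline))
#define SKETCH_NOINLINE      __attribute__((noinline))
#else
#define SKETCH_FLATTEN
#define SKETCH_ALWAYS_INLINE inline
#define SKETCH_NOINLINE
#endif

// Assumed convention: hot helpers are forced inline in release builds but
// kept out of line in debug builds, where the extra checking code would
// otherwise make the caller too large.
#ifdef NDEBUG
#define SKETCH_HOT SKETCH_ALWAYS_INLINE
#else
#define SKETCH_HOT SKETCH_NOINLINE
#endif

// Small leaf on the critical path: always inlined into its callers.
static SKETCH_ALWAYS_INLINE std::uint32_t mix(std::uint32_t x) {
  return x * 3u + 1u;
}

// Rarely taken path: explicitly kept out of line, even under flatten.
static SKETCH_NOINLINE std::uint32_t slow_path(std::uint32_t x) {
  return x / 2u;
}

// Critical-path step: inlined in release, out of line in debug.
static SKETCH_HOT std::uint32_t step(std::uint32_t x) {
  return (x % 8u == 0u) ? slow_path(x) : mix(x);
}

// Approach 1, the big hammer: flatten asks the compiler to inline every
// call in this body (transitively) where possible. Callees marked
// noinline are still left as calls.
static SKETCH_FLATTEN std::uint32_t drain_flattened(const std::uint32_t* items,
                                                    std::size_t n) {
  std::uint32_t sum = 0;
  for (std::size_t i = 0; i < n; i++) {
    sum += step(items[i]);
  }
  return sum;
}

// Approach 2, selective: no flatten; inlining is controlled entirely by
// the per-function always_inline / noinline markings on the callees.
static std::uint32_t drain_selective(const std::uint32_t* items,
                                     std::size_t n) {
  std::uint32_t sum = 0;
  for (std::size_t i = 0; i < n; i++) {
    sum += step(items[i]);
  }
  return sum;
}

// The attributes only affect code generation, so both variants must
// compute the same result.
static bool sketch_self_check() {
  const std::uint32_t items[] = {1u, 2u, 8u};
  return drain_flattened(items, 3) == drain_selective(items, 3);
}
```

The two drain functions are behaviorally identical; the difference only shows up in the generated code (for example when comparing disassembly or object size at -O2). The brittleness mentioned above is visible even here: the selective version stays fast only as long as every helper on the hot path carries the right annotation.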
