Maxim Kuvyrkov wrote:
> Hi,
>
> This patch improves register pressure scheduling (both
> SCHED_PRESSURE_WEIGHTED and SCHED_PRESSURE_MODEL) to better estimate number
> of available registers.
>
> At the moment the scheduler does not account for spills in the prologues and
> restores in the epilogue, which occur from use of call-used registers. The
> current state is, essentially, optimized for case when there is a hot loop
> inside the function, and the loop executes significantly more often than the
> prologue/epilogue. However, on the opposite end, we have a case when the
> function is just a single non-cyclic basic block, which executes just as
> often as prologue / epilogue, so spills in the prologue hurt performance as
> much as spills in the basic block itself. In such a case the scheduler
> should throttle-down on the number of available registers and try to not go
> beyond call-clobbered registers.
>
> The patch uses basic block frequencies to balance the cost of using call-used
> registers for intermediate cases between the two above extremes.
>
> The motivation for this patch was a floating-point testcase on
> arm-linux-gnueabihf (ARM is one of the few targets that use register pressure
> scheduling by default).

Does aarch64 enable register pressure scheduling by default? If not, what is
the flag to enable it? I'm planning to look at the perf impact of the patch.

Thanks,
Sebastian