On Oct 27, 2009, at 11:00, ext Aurelien Jarno wrote:

> juha.riihim...@nokia.com wrote:
>> On Oct 27, 2009, at 10:39, ext Aurelien Jarno wrote:
>>
>>> On Sat, Oct 24, 2009 at 03:19:04PM +0300, juha.riihim...@nokia.com wrote:
>>>> From: Juha Riihimäki <juha.riihim...@nokia.com>
>>>>
>>>> ARM load/store multiple instructions can be slightly optimized by
>>>> loading the register offset constant into a variable outside the
>>>> register loop and using the preloaded variable inside the loop
>>>> instead of reloading the offset value to a temporary variable on
>>>> each loop iteration. This causes fewer TCG ops to be generated for
>>>> an ARM load/store multiple instruction if more than one register
>>>> is accessed; otherwise the number of generated TCG ops is the same.
>>>>
>>>> Signed-off-by: Juha Riihimäki <juha.riihim...@nokia.com>
>>>> Acked-by: Laurent Desnogues <laurent.desnog...@gmail.com>
>>>
>>> This patch breaks the boot of an ARM kernel, as tmp2 is used
>>> elsewhere within this code path.
>>
>> True, I just noticed that as well. This is because the resource leak
>> patch was refactored to utilize tmp2 inside the loop as well. I just
>> sent a new revision of this patch that uses tmp3 for the constant
>> value.
>>
>>> OTOH, while it reduces the number of TCG ops, that should not impact
>>> the generated host asm code, as most (all?) targets are able to add
>>> a small constant value to a register in one instruction.
>>
>> This is true, but I still think it provides a small speed gain as
>> there are fewer TCG ops to be processed when generating host code...?
>
> It means fewer TCG ops, but one more temp variable, therefore if
> there is a gain, I don't think it is something even measurable.
>
> OTOH it makes the code a bit more complex to read. I am not really
> opposed to this patch (and the other patches of the same kind), but I
> will need some more arguments to convince me.

Shouldn't the number of temp variables stay the same? tcg_gen_addi_i32
will internally allocate a temporary variable to hold the integer
parameter. The only difference is that the temporary stays alive for
the whole loop instead of being allocated and freed on each iteration.

But I agree, the performance gain is probably quite small.

Regards,
Juha
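
P.S. To make the temp-count argument concrete, here is a toy model in
plain C. It is not QEMU code: struct op_stats, temp_new, temp_free,
emit_op, step_per_iteration and step_hoisted are all made-up names,
and it assumes one address increment per transferred register and one
op to load the constant plus one op for the add. It only counts ops
and temporaries for the two strategies.

```c
#include <assert.h>

/* Toy bookkeeping only; none of these names are QEMU APIs. */
struct op_stats {
    int ops;         /* ops emitted so far */
    int temp_allocs; /* total number of temp allocations */
    int live_temps;  /* temps currently allocated */
    int peak_temps;  /* highest number of temps alive at once */
};

static void temp_new(struct op_stats *s)
{
    s->temp_allocs++;
    s->live_temps++;
    if (s->live_temps > s->peak_temps) {
        s->peak_temps = s->live_temps;
    }
}

static void temp_free(struct op_stats *s)
{
    assert(s->live_temps > 0);
    s->live_temps--;
}

static void emit_op(struct op_stats *s)
{
    s->ops++;
}

/* Constant is loaded into a fresh temp on every loop iteration
 * (assumed model of an add-immediate helper). */
static struct op_stats step_per_iteration(int n_regs)
{
    struct op_stats s = {0, 0, 0, 0};
    int i;

    assert(n_regs >= 0);
    for (i = 0; i < n_regs; i++) {
        temp_new(&s);  /* temp that holds the constant */
        emit_op(&s);   /* load constant into the temp */
        emit_op(&s);   /* addr = addr + temp */
        temp_free(&s);
    }
    return s;
}

/* Constant is loaded once before the loop and reused inside it. */
static struct op_stats step_hoisted(int n_regs)
{
    struct op_stats s = {0, 0, 0, 0};
    int i;

    assert(n_regs >= 0);
    temp_new(&s);      /* temp that holds the constant */
    emit_op(&s);       /* load constant once */
    for (i = 0; i < n_regs; i++) {
        emit_op(&s);   /* addr = addr + temp */
    }
    temp_free(&s);
    return s;
}
```

Under these assumptions, four registers give 8 ops with the
per-iteration variant and 5 ops with the hoisted one, while
peak_temps is 1 in both cases. With a single register both variants
emit 2 ops.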