On Fri, Mar 07, 2008 at 11:10:53AM +0100, Richard Guenther wrote:
> On Fri, Mar 7, 2008 at 11:02 AM, Philipp Marek <[EMAIL PROTECTED]> wrote:
> > I wrote some perl scripts to test this. I took a "alldefconfig" i686
> > kernel, let objdump disassemble it, and on "iret", "ret", "ljmp" or "jmp"
> > with a 4 byte address I store the last few bytes. Another script goes
> > through these block-ends, and estimates the number of bytes saved, if
> > identical sequences get changed into a 5 byte opcode (jump with 32bit
> > address).
>
> Sounds like what -frtl-abstract-sequences is trying to do.

Not exactly. AFAIK -frtl-abstract-sequences works at the function level:
only identical sequences within one function are abstracted. The above is
a whole-program optimization instead, and it affects only tail sequences.

You need to be very careful with it. If there are any jumps into the
middle of the tail sequences to be abstracted, you can't abstract them, or
you would also need to adjust the jumps into them (if that is possible,
which it not always is). Also, the tail sequence shouldn't contain any
PC-relative references. Say, you can't merge two copies of

	movl (.+0xabcd)(%rip), %eax
	ret

which are byte-identical, yet would reference different variables. But
even a jmp argument is relative, not absolute.

You can't do this kind of optimization in the linker, as that's too late:
many relocations are already resolved and lost during assembly, and you
need to understand the instructions anyway. Doing it in GCC would be
terribly costly if that meant expanding everything into RTL, running all
RTL passes and, instead of emitting the result, remembering the pre-final
RTL sequence for each emitted function, then doing an IPA pass over all
the sequences. Perhaps doing that in the assembler, combined with
--combine so that the assembler can see everything together...

	Jakub
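
As a rough illustration of the constraints above, here is a minimal Python
sketch of a tail-merge check. It is not the Perl scripts mentioned in the
thread, and the Insn model is hypothetical: it assumes that something
upstream (for example the assembler, before relocations are resolved) has
already flagged which instructions carry PC-relative operands and which are
jump targets. It is also more conservative than strictly needed, because it
treats every jump target as a barrier, including one at the very start of a
candidate tail.

```python
from dataclasses import dataclass
from typing import List

# Size of the replacement "jmp rel32", as assumed in the quoted estimate.
JMP_REL32_SIZE = 5


@dataclass(frozen=True)
class Insn:
    """One instruction in a hypothetical model (not real objdump output)."""
    encoding: bytes            # raw instruction bytes
    pc_relative: bool = False  # operand depends on the instruction's address
    jump_target: bool = False  # some jump elsewhere lands on this instruction


def mergeable_tail(a: List[Insn], b: List[Insn]) -> int:
    """Count trailing instructions of a and b that could share one copy."""
    count = 0
    for x, y in zip(reversed(a), reversed(b)):
        if x.encoding != y.encoding:
            break  # bytes differ, the common tail ends here
        if x.pc_relative or y.pc_relative:
            break  # same bytes, but they may refer to different locations
        if x.jump_target or y.jump_target:
            break  # conservative: never merge across a jump target
        count += 1
    return count


def bytes_saved(a: List[Insn], b: List[Insn]) -> int:
    """Estimate bytes saved by replacing a's tail with a jump into b's."""
    n = mergeable_tail(a, b)
    tail_size = sum(len(insn.encoding) for insn in a[len(a) - n:])
    return max(0, tail_size - JMP_REL32_SIZE)
```

With this model, two functions that both end in the movl/ret pair shown
above share only the ret, so nothing is saved, while a longer epilogue of
plain pops followed by ret can be replaced as long as it is longer than the
5 byte jump.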