Hello Jakub!

>> When wouldn't that be possible? My script currently splits at the
>> instruction level -- although I see no problem with a branch
>> jumping into "half" an opcode of another branch, if the byte sequence
>> matches.
>
> Consider:
> 00000000 <bar>:
>        0:       b8 a4 00 00 00          mov    $0xa4,%eax
>        5:       ba fc 04 00 00          mov    $0x4fc,%edx
>        a:       f7 e2                   mul    %edx
>        c:       05 d2 04 00 00          add    $0x4d2,%eax
>       11:       c3                      ret
>         ...
>
> 00020012 <foo>:
>    20012:       39 d2                   cmp    %edx,%edx
>    20014:       75 07                   jne    2001d <foo+0xb>
>    20016:       ba fc 04 00 00          mov    $0x4fc,%edx
>    2001b:       f7 e2                   mul    %edx
>    2001d:       05 d2 04 00 00          add    $0x4d2,%eax
>    20022:       c3                      ret
That's not an example of jumping into the *middle* of an instruction.

I was thinking of something like this (please ignore the wrong addresses,
they are just copy/paste):

       5:       ba fc 04 00 00          mov    $0x4fc,%edx
--->   a:       f7 e2                   mul    %edx
       c:       05 d2 04 00 00          add    $0x4d2,%eax
      11:       c3                      ret
        ...

   20016:       ba 00 ba f7 e2          mov    $0xe2f7ba00,%edx
   2001b:       f7 e2                   mul    %edx
   2001d:       05 d2 04 00 00          add    $0x4d2,%eax
   20022:       c3                      ret
Now we could place a jump at 0x4, which goes to 0x20018 - and should work,
shouldn't it?
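
To make the idea concrete, here is a minimal sketch in C of the search
itself: it looks for a tail's byte sequence anywhere inside another
function's bytes, without caring about instruction boundaries. This is not
my actual script. The names find_bytes and check_hypothetical_overlap are
made up for illustration, and the "host" bytes are a hypothetical sequence
chosen so that the match really starts inside the immediate of a mov.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the offset of the first occurrence of needle in hay, or -1.
 * The search is purely byte-wise: it knows nothing about where
 * instructions begin or end. */
long find_bytes(const unsigned char *hay, size_t hay_len,
                const unsigned char *needle, size_t needle_len)
{
    assert(needle_len > 0);
    for (size_t i = 0; i + needle_len <= hay_len; i++)
        if (memcmp(hay + i, needle, needle_len) == 0)
            return (long)i;
    return -1;
}

/* Hypothetical example (bytes are NOT taken from the listings above). */
long check_hypothetical_overlap(void)
{
    /* mov $0xe2f70000,%edx ; add $0x4d2,%eax ; ret */
    static const unsigned char host[] = {
        0xba, 0x00, 0x00, 0xf7, 0xe2,
        0x05, 0xd2, 0x04, 0x00, 0x00,
        0xc3
    };
    /* mul %edx ; add $0x4d2,%eax ; ret */
    static const unsigned char tail[] = {
        0xf7, 0xe2,
        0x05, 0xd2, 0x04, 0x00, 0x00,
        0xc3
    };
    /* The mov occupies offsets 0..4, so a hit at offset 3 lies in the
     * middle of that instruction. */
    return find_bytes(host, sizeof host, tail, sizeof tail);
}
```

Whether such a match can actually be used still depends on being able to
encode the jump to it, which is the point discussed below.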


> If you merge the mov/mul/add/ret sequences by replacing the foo tail
> sequence with jmp bar+5, then the jne will branch to wrong place, or
> if you try to adjust it, it is too far to reach the target.
Yes. But see below.
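
For reference, the "too far to reach" part can be checked mechanically. The
jne in foo is the two-byte short form (75 07), whose operand is a signed
8-bit displacement counted from the end of the instruction. A small sketch
in C, with function names that are my own and assuming 32-bit addresses:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A short conditional jump is one opcode byte plus one rel8 byte. */
enum { SHORT_JCC_LEN = 2 };

/* Target of a short jump located at insn with displacement rel8. */
uint32_t short_jump_target(uint32_t insn, int8_t rel8)
{
    return insn + SHORT_JCC_LEN + (uint32_t)(int32_t)rel8;
}

/* True if a short jump located at insn can reach target at all. */
bool short_jump_reaches(uint32_t insn, uint32_t target)
{
    int64_t disp = (int64_t)target - ((int64_t)insn + SHORT_JCC_LEN);
    return disp >= INT8_MIN && disp <= INT8_MAX;
}

/* Values taken from the foo/bar listing quoted above. */
void check_jne_from_listing(void)
{
    /* 20014: 75 07   jne 2001d */
    assert(short_jump_target(0x20014, 0x07) == 0x2001d);
    assert(short_jump_reaches(0x20014, 0x2001d));
    /* The matching add in bar sits at 0xc, far outside rel8 range. */
    assert(!short_jump_reaches(0x20014, 0xc));
}
```

So the short form only covers roughly 128 bytes in either direction, and
retargeting that jne at bar would need the longer encoding.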

>> > but even jmp argument is relative, not absolute.
>> That's why I take only jumps with 32bit arguments - these are absolute.
>
> No, they are relative.
>
> 00000000 <bar>:
>        0:       e9 05 00 02 00          jmp    2000a <foo>
>        5:       e9 00 00 02 00          jmp    2000a <foo>
>         ...
>
> 0002000a <foo>:
>    2000a:       90                      nop
>
> See how they are encoded.
Yes, ok.
But how does the compiler encode jumps? Via symbols, and the correct
destination address gets put there by the linker. If there are
(sizeof(void*)) operand bytes, there's enough space for *any* jump.
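
To spell out the encoding of the two jumps quoted above: the four bytes
after e9 are a little-endian displacement that is added to the address of
the *next* instruction. A minimal sketch in C, assuming a flat 32-bit
address space (the helper names are mine, not from any tool):

```c
#include <assert.h>
#include <stdint.h>

/* jmp rel32 is the e9 opcode byte plus four displacement bytes. */
enum { JMP_REL32_LEN = 5 };

/* Read the little-endian 32-bit operand that follows the e9 opcode. */
uint32_t read_rel32(const uint8_t *insn)
{
    assert(insn[0] == 0xe9);
    return (uint32_t)insn[1]
         | (uint32_t)insn[2] << 8
         | (uint32_t)insn[3] << 16
         | (uint32_t)insn[4] << 24;
}

/* Target = address of the next instruction + displacement. Unsigned
 * 32-bit arithmetic wraps, so negative displacements work as well. */
uint32_t jmp_target(uint32_t insn_addr, const uint8_t *insn)
{
    return insn_addr + JMP_REL32_LEN + read_rel32(insn);
}

/* The two jumps from the quoted listing. */
void check_jumps_from_listing(void)
{
    const uint8_t at0[] = { 0xe9, 0x05, 0x00, 0x02, 0x00 };
    const uint8_t at5[] = { 0xe9, 0x00, 0x00, 0x02, 0x00 };
    assert(jmp_target(0x0, at0) == 0x2000a);
    assert(jmp_target(0x5, at5) == 0x2000a);
}
```

Both operands differ (0x20005 and 0x20000) although the target is the same,
which is what makes them relative. Because the sum wraps modulo 2^32, a
32-bit displacement can still express any target in a 32-bit address space.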

(Is there some way for the linker to compress small jumps later?)

> BTW, have you tried to compile the whole kernel with --combine, or at
> least e.g. each kernel directory with --combine?  I guess that will
> give you bigger savings than 30K.  Also, stop defining inline to inline
> __attribute__((always_inline)), I think Ingo also added such patch
> recently and it saved 120K.
No ... but these are independent changes.

I'm currently compiling with --combine.


Regards,

Phil


-- 
Versioning your /etc, /home or even your whole installation?
             Try fsvs (fsvs.tigris.org)!
