On Wed, Jan 29, 2025 at 04:02:34PM -0800, Andrii Nakryiko wrote:
> On Tue, Jan 28, 2025 at 6:02 PM Josh Poimboeuf <jpoim...@kernel.org> wrote:
> I'm not sure about this chunked lookup approach for arbitrary user
> space applications. Those executable sections can be a) big and b)
> discontiguous. E.g., one of the production binaries I looked at. Here
> are its three main executable sections:
> 
> ...
>   [17] .bolt.org.text    PROGBITS         000000000b00e640  0ae0d640
>        0000000011ad621c  0000000000000000  AX       0     0     64
> ...
>   [48] .text             PROGBITS         000000001e600000  1ce00000
>        0000000000775dd8  0000000000000000  AX       0     0     2097152
>   [49] .text.cold        PROGBITS         000000001ed75e00  1d575e00
>        00000000007d3271  0000000000000000  AX       0     0     64
> ...
> 
> Total text size is about 300MB:
> >>> 0x0000000000775dd8 + 0x00000000007d3271 + 0x0000000011ad621c
> 312603237
> 
> Section #17 ends at:
> 
> >>> hex(0x0000000011ad621c + 0x000000000b00e640)
> '0x1cae485c'
> 
> While .text starts at 000000001e600000, so we have a gap of ~28MB:
> 
> >>> 0x000000001e600000 - 0x1cae485c
> 28424100
> 
> So unless we do something more clever to support multiple
> discontiguous chunks, this seems like a bad fit for user space.

Nothing clever needed, we could just have multiple sframe sections, each
one with a pointer to its text segment.  That would also have the
benefit of allowing the sframe data to be much more compact for the
noncontiguous cases.
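
As a rough sketch of what that lookup could look like (Python only to
match the arithmetic above; the SframeSection layout, the func_offsets
values and find_entry() are made up for illustration and are not the
actual sframe format or any kernel API): pick the section whose text
range covers the IP, then binary search inside it using offsets
relative to that section's text start.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class SframeSection:
    """Hypothetical per-text-range sframe section (illustrative layout only)."""
    name: str
    text_start: int          # address of the text range this section describes
    text_size: int
    # Sorted function start offsets, relative to text_start (made-up values).
    func_offsets: list = field(default_factory=list)

    @property
    def text_end(self):
        return self.text_start + self.text_size


# Text ranges taken from the readelf output quoted above.
SECTIONS = sorted(
    [
        SframeSection(".bolt.org.text", 0x0B00E640, 0x11AD621C, [0x0, 0x80]),
        SframeSection(".text", 0x1E600000, 0x775DD8, [0x0, 0x40, 0x100]),
        SframeSection(".text.cold", 0x1ED75E00, 0x7D3271, [0x0, 0x20]),
    ],
    key=lambda s: s.text_start,
)
STARTS = [s.text_start for s in SECTIONS]


def find_entry(sections, starts, ip):
    """Return (section name, function start address) covering ip, or None."""
    # Step 1: last section whose text range starts at or before ip.
    i = bisect_right(starts, ip) - 1
    if i < 0:
        return None
    sec = sections[i]
    if ip >= sec.text_end:
        return None          # ip falls in a gap between text ranges
    # Step 2: binary search within the section, using section-relative offsets.
    j = bisect_right(sec.func_offsets, ip - sec.text_start) - 1
    if j < 0:
        return None
    return sec.name, sec.text_start + sec.func_offsets[j]
```

With this shape, an IP in the ~28MB gap (say 0x1d000000) matches no
section at all, and the per-section offsets only need to span that
section's own text range rather than the whole address space.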

> I think having all this just binary searchable is already a big win
> anyways and should be plenty fast, no?

Sframe is trying to compete with frame pointers, which are MUCH faster.
3-4x faster in my testing, not including the page faults (which tend to
only affect performance in the very beginning).

-- 
Josh
