On Wed, Jan 10, 2024 at 1:45 PM Laszlo Ersek <ler...@redhat.com> wrote:
>
> (+ Andrew)
>
> On 1/10/24 14:09, Laszlo Ersek wrote:
>
> > I think that keeping the depex section read-only is valuable, so I'd
> > rule out #2. I'd also not start with option #1 -- copying the depex to
> > heap memory, just so this optimization can succeed. I'd simply start by
> > removing the optimization, and measuring how much driver dispatch slows
> > down in practice, on various platforms.
> >
> > Can you try this? (I have only build-tested and "uncrustified" it.)
> >
> > The patch removes the EFI_DEP_REPLACE_TRUE handling altogether, plus it
> > CONST-ifies the Iterator pointer (which points into the DEPEX section),
> > so that the compiler catch any possible accesses at *build time* that
> > would write to the write-protected DEPEX memory area.
>
> On a tangent: the optimization in question highlights a more general
> problem, namely that the DXE (and possibly MM/SMM) protocol databases
> are considered slow, for some access patterns.
>
> Edk2 implements those protocol databases with linked lists, where lookup
> costs O(n) operations (average and worst cases). And protocol lookups
> are quite frequent (for example, in depex evaluations, they could be
> considered "particularly frequent").
>
> (1) The "Tasks" wiki page mentions something *similar* (but not the
> same); see
>
> https://github.com/tianocore/tianocore.github.io/wiki/Tasks#datahub--gcd-scalability
>
> The description is: "The DXE core's DataHub and GCD (Global Coherency
> Domain) layers don't scale well as the number of data items gets large,
> since they are based on simple linked lists. Find better data structures."

How large do they usually get? What's the worst case?

>
> The same might apply more or less to the protocol database implementation.
>
> (2) More to the point, Andrew Fish reported earlier that at Apple, they
> had rewritten the DXE protocol database, using the red-black tree
> OrderedCollectionLib that I had contributed previously to edk2 -- and
> they saw significant performance improvements.
>
> So upstreaming that feature to edk2 could be very valuable. (Red-black
> trees have O(log(n)) time cost (worst case) for lookup, insertion and
> deletion, and O(n) cost for iterating through the whole data structure.)

FWIW, can we do better than an RB tree? They're notoriously cache unfriendly...

>
> Let me see if I can find the bugzilla ticket...
>
> Ah, I got it. Apologies, I misremembered: Andrew's comment was not about
> the protocol database, but about the handle database. Here it is:
>
> https://bugzilla.tianocore.org/show_bug.cgi?id=988#c7
>
> (the BZ is still CONFIRMED btw...)
>
> Still, I think it must be related in a way. Namely, an EFI handle exists
> if and only if at least one protocol interface is installed on it. If
> you uninstall the last protocol interface from a handle, then the handle
> is destroyed -- in fact that's the *only* way to destroy a handle, to my
> understanding. See EFI_BOOT_SERVICES.UninstallProtocolInterface() in the
> UEFI spec: "If the last protocol interface is removed from a handle, the
> handle is freed and is no longer valid". Furthermore, calling
> InstallProtocolInterface() and InstallMultipleProtocolInterfaces() is
> how one *creates* new handles.
>
> So given how handles and protocol interfaces are conceptually
> interlinked, an rbtree-based protocol DB might have to maintain multiple
> rbtrees internally (for the ability to search the database quickly with
> different types of "keys"). I don't have a design ready in my mind at
> all (I'm not that familiar with the *current*, list-based implementation
> to begin with!). Upstreaming Apple's (experimental?) code could prove
> very helpful.

Ok, some thoughts:

For the handle database, if we made EFI_HANDLE an integer instead, we
could very easily use something similar to a radix tree (see Linux's
xarray). This would provide O(log(n)) lookups and insertion with a
much higher fan-out, without needing CoreValidateHandle's awful O(n)
lookup every time it gets an EFI_HANDLE. You'd just look up the handle
in the tree and if NULL, it's invalid. [1]

For the protocol database, you'd replace the linked list with a simple
hashtable, hashed by protocol. Something as simple as LIST_ENTRY
mProtocolHashtable[64]; would probably be enough to fan out most of
the problems (I think? How many different protocols are there?)

-- 
Pedro

[1] One could be wary of doing this lookup for every EFI_HANDLE
instead of having the pointer directly. But this is much more
efficient than needing to iterate a linked list to validate a pointer.
Considering an xarray-like radix tree as previously described, two
levels with a fan-out of 64 could describe indices 0 - 4096 (64 * 64),
which is much more efficient than chasing through pointers (worst
case) 4096 times until we find the handle.


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#113541): https://edk2.groups.io/g/devel/message/113541
Mute This Topic: https://groups.io/mt/103594587/21656
Group Owner: devel+ow...@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Reply via email to