On Wed, Jan 10, 2024 at 1:45 PM Laszlo Ersek <ler...@redhat.com> wrote: > > (+ Andrew) > > On 1/10/24 14:09, Laszlo Ersek wrote: > > > I think that keeping the depex section read-only is valuable, so I'd > > rule out #2. I'd also not start with option #1 -- copying the depex to > > heap memory, just so this optimization can succeed. I'd simply start by > > removing the optimization, and measuring how much driver dispatch slows > > down in practice, on various platforms. > > > > Can you try this? (I have only build-tested and "uncrustified" it.) > > > > The patch removes the EFI_DEP_REPLACE_TRUE handling altogether, plus it > > CONST-ifies the Iterator pointer (which points into the DEPEX section), > > so that the compiler catch any possible accesses at *build time* that > > would write to the write-protected DEPEX memory area. > > On a tangent: the optimization in question highlights a more general > problem, namely that the DXE (and possibly MM/SMM) protocol databases > are considered slow, for some access patterns. > > Edk2 implements those protocol databases with linked lists, where lookup > costs O(n) operations (average and worst cases). And protocol lookups > are quite frequent (for example, in depex evaluations, they could be > considered "particularly frequent"). > > (1) The "Tasks" wiki page mentions something *similar* (but not the > same); see > > https://github.com/tianocore/tianocore.github.io/wiki/Tasks#datahub--gcd-scalability > > The description is: "The DXE core's DataHub and GCD (Global Coherency > Domain) layers don't scale well as the number of data items gets large, > since they are based on simple linked lists. Find better data structures."
How large do they usually get? What's the worst case? > > The same might apply more or less to the protocol database implementation. > > (2) More to the point, Andrew Fish reported earlier that at Apple, they > had rewritten the DXE protocol database, using the red-black tree > OrderedCollectionLib that I had contributed previously to edk2 -- and > they saw significant performance improvements. > > So upstreaming that feature to edk2 could be very valuable. (Red-black > trees have O(log(n)) time cost (worst case) for lookup, insertion and > deletion, and O(n) cost for iterating through the whole data structure.) FWIW, can we do better than an RB tree? They're notoriously cache unfriendly... > > Let me see if I can find the bugzilla ticket... > > Ah, I got it. Apologies, I misremembered: Andrew's comment was not about > the protocol database, but about the handle database. Here it is: > > https://bugzilla.tianocore.org/show_bug.cgi?id=988#c7 > > (the BZ is still CONFIRMED btw...) > > Still, I think it must be related in a way. Namely, an EFI handle exists > if and only if at least one protocol interface is installed on it. If > you uninstall the last protocol interface from a handle, then the handle > is destroyed -- in fact that's the *only* way to destroy a handle, to my > understanding. See EFI_BOOT_SERVICES.UninstallProtocolInterface() in the > UEFI spec: "If the last protocol interface is removed from a handle, the > handle is freed and is no longer valid". Furthermore, calling > InstallProtocolInterface() and InstallMultipleProtocolInterfaces() is > how one *creates* new handles. > > So given how handles and protocol interfaces are conceptually > interlinked, an rbtree-based protocol DB might have to maintain multiple > rbtrees internally (for the ability to search the database quickly with > different types of "keys"). I don't have a design ready in my mind at > all (I'm not that familiar with the *current*, list-based implementation > to begin with!). Upstreaming Apple's (experimental?) code could prove > very helpful. Ok, some thoughts: For the handle database, if we made EFI_HANDLE an integer instead, we could very easily use something similar to a radix tree (see Linux's xarray). This would provide O(log(n)) lookups and insertion with a much higher fan-out, without needing CoreValidateHandle's awful O(n) lookup every time it gets an EFI_HANDLE. You'd just look up the handle in the tree and if NULL, it's invalid. [1] For the protocol database, you'd replace the linked list with a simple hashtable, hashed by protocol. Something as simple as LIST_ENTRY mProtocolHashtable[64]; would probably be enough to fan out most of the problems (I think? How many different protocols are there?) -- Pedro [1] One could be wary of doing this lookup for every EFI_HANDLE instead of having the pointer directly. But this is much more efficient than needing to iterate a linked list to validate a pointer. Considering an xarray-like radix tree as previously described, two levels with a fan-out of 64 could describe indices 0 - 4096 (64 * 64), which is much more efficient than chasing through pointers (worst case) 4096 times until we find the handle. -=-=-=-=-=-=-=-=-=-=-=- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#113541): https://edk2.groups.io/g/devel/message/113541 Mute This Topic: https://groups.io/mt/103594587/21656 Group Owner: devel+ow...@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-