I'm following up on a resilient dynamic dispatch discussion that Slava kicked off during a performance team meeting, to summarize some key points publicly on [swift-dev].

It's easy to get sidetracked by the details of dynamic dispatch and various ways to generate code. I suggest approaching the problem by focusing on the ABI aspects and the flexibility the ABI affords for future optimization. I'm including a proposal for one specific approach (#3) that wasn't discussed yet.

---

#1. (thunk export)

The simplest, most flexible way to expose dispatch across resilience boundaries is by exporting a single per-method entry point. Future compilers could improve dispatch and gradually expose more ABI details.

Cost: We're forced to export all those symbols in perpetuity.

[The cost of the symbols is questionable. The symbol trie should compress the names, so the size may be small, and they should be lazily resolved, so the startup cost should be amortized].

---

#2. (offset export)

An alternative approach was proposed by JoeG a while ago and revisited in the meeting yesterday. It involves a client-side vtable offset lookup helper.

This allows more opportunity for micro-optimization on the client side. This exposes the isa-based vtable mechanism as ABI. However, it stops short of exposing the vtable layout itself. Guaranteeing vtable dispatch may become a problem in the future because it forces an explosion of metadata. It also has the same problem as #1 because the framework must export a per-method symbol for the dispatch offset. What's worse, the symbols need to be eagerly resolved (AFAIK).

---

#3. (method index)

This is an alternative that I've alluded to before but that was not discussed in yesterday's meeting. It makes a tradeoff between exporting symbols and exposing vtable layout. I want to focus on the direct cost of the ABI support and the flexibility of this approach vs. approach #1 without arguing over how to micro-optimize various dispatching schemes.

Here's how it works:

The ABI specifies a sort function for public methods that gives each one a per-class index.
Version availability takes sort precedence, so public methods can be added without affecting other indices. [Apparently this is the same approach we're taking with witness tables].

As with #2 this avoids locking down the vtable format for now--in the future we'll likely optimize it further. To avoid locking all methods into the vtable mechanism, the offset can be tagged. The alternative dispatch mechanism for tagged offsets will be hidden within the class-defining framework.

This avoids the potential explosion of exported symbols--it's limited to one per public class. It avoids explosion of metadata by allowing alternative dispatch for some subset of methods. These tradeoffs can be explored in the future, independent of the ABI.

---

#3a. (offset table export)

A single per-class entry point provides a pointer to an offset table. [It can be optionally cached on the client side].

  method_index = immediate
  { // common per-class method lookup
    isa = load[obj]
    isa = isa & @isa_mask
    offset = load[@class_method_table + method_index]
    if (isVtableOffset(offset))
      method_entry = load[isa + offset]
    else
      method_entry = @resolveMethodAddress(isa, @class_method_table, method_index)
  }
  call method_entry

Cost - client code size: Worst case 3 instructions to dispatch vs 1 instruction for approach #1. Method lookups can be combined, so groups of calls will be more compact.

Cost - library size: the offset tables themselves need to be materialized on the framework side. I believe this can be done statically in read-only memory, but that needs to be verified.

ABI: The offset table format and tag bit are baked into the ABI.

---

#3b. (lazy resolution)

Offset tables can be completely localized.

  method_index = immediate
  { // common per-class method lookup
    isa = load[obj]
    offset = load[@local_class_method_table + method_index]
    if (!isInitializedOffset(offset)) {
      offset = @resolveMethodOffset(@class_id, method_index)
      store[@local_class_method_table + method_index] = offset
    }
    if (isVtableOffset(offset))
      method_entry = load[isa + offset]
    else
      method_entry = @resolveMethodAddress(isa, @class_id, method_index)
  }
  call method_entry

ABI: This avoids exposing the offset table format as ABI. All that's needed is a symbol for the class, a single entry point for method offset resolution, and a single entry point for non-vtable method resolution.

Benefit: The library no longer needs to statically materialize tables. Instead they are initialized lazily in each client module.

Cost: Lazy initialization of local tables requires an extra check and burns some code size.

---

Caveat: This is the first time I've thought through approach #3, and it hasn't been discussed, so there are likely a few things I'm missing at the moment.

---

Side Note: Regardless of the resilient dispatch mechanism, within a module the dispatch mechanism should be implemented with thunks to avoid type checking classes from other files and improve compile time in non-WMO builds, as Slava requested.

-Andy

_______________________________________________
swift-dev mailing list
swift-dev@swift.org
https://lists.swift.org/mailman/listinfo/swift-dev
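
For readers who want to experiment with the #3b lookup sequence, here is a minimal C sketch of it. Everything in it is an assumption made for illustration and not part of the proposal: the names (lookup_method, resolve_method_offset, resolve_method_address, local_class_method_table), the offset encoding (0 means unresolved, a set low bit means "tagged, not in the vtable", and (slot + 1) << 1 means a vtable slot), and the toy class with one vtable method and one tagged method. The class_id argument and isa masking are left out to keep it short.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of approach #3b; none of these names are real ABI. */
typedef int (*method_fn)(void);

enum { NUM_PUBLIC_METHODS = 2 };

/* Assumed offset encoding (an illustration only):
 *   0                 slot not resolved yet
 *   (slot + 1) << 1   vtable slot, low bit clear
 *   low bit set       tagged: dispatch is hidden inside the framework */
#define OFFSET_UNINITIALIZED ((uintptr_t)0)
#define OFFSET_NON_VTABLE    ((uintptr_t)1)

/* ---- framework side ---- */
typedef struct ClassMetadata {
    method_fn vtable[1];            /* only method 0 gets a vtable slot */
} ClassMetadata;

typedef struct Object {
    const ClassMetadata *isa;
} Object;

static int method0_impl(void) { return 100; }
static int method1_impl(void) { return 200; }

const ClassMetadata example_class = { { method0_impl } };

static int offset_resolutions = 0;  /* counts calls, to show laziness */

/* Stands in for @resolveMethodOffset(@class_id, method_index). */
static uintptr_t resolve_method_offset(int method_index) {
    offset_resolutions++;
    return method_index == 0 ? (uintptr_t)((0 + 1) << 1) : OFFSET_NON_VTABLE;
}

/* Stands in for @resolveMethodAddress(isa, @class_id, method_index). */
static method_fn resolve_method_address(const ClassMetadata *isa,
                                        int method_index) {
    (void)isa;
    (void)method_index;             /* method 1 is the only tagged method */
    return method1_impl;
}

/* ---- client side ---- */
static uintptr_t local_class_method_table[NUM_PUBLIC_METHODS]; /* zeroed */

static int is_vtable_offset(uintptr_t offset) { return (offset & 1) == 0; }

method_fn lookup_method(const Object *obj, int method_index) {
    assert(method_index >= 0 && method_index < NUM_PUBLIC_METHODS);
    const ClassMetadata *isa = obj->isa;
    uintptr_t offset = local_class_method_table[method_index];
    if (offset == OFFSET_UNINITIALIZED) {   /* first use in this module */
        offset = resolve_method_offset(method_index);
        local_class_method_table[method_index] = offset;
    }
    if (is_vtable_offset(offset))
        return isa->vtable[(offset >> 1) - 1];
    return resolve_method_address(isa, method_index);
}
```

Given an Object whose isa points at example_class, calling lookup_method(&obj, 0) repeatedly invokes resolve_method_offset only the first time; later calls read the cached entry from the local table. The tagged method (index 1) still goes through resolve_method_address on every call, which matches the pseudocode.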