On Tue, 10 Jun 2025 14:30:06 GMT, Johan Sjölen <jsjo...@openjdk.org> wrote:
>> But then you'd need create an array of `Pair` in `create_search_table` and >> copy the data into that. You wouldn't need a Supplier at all, just pass the >> array and its length to the `fill` method. If you are worried about the >> virtual calls, I could do that. >> >> Alternatively, we could replace virtual calls with templated `fill` method. >> In that case the Comparator should be a template method as well, since that >> one is even more perf-critical? > > I think you misunderstand me, as what I'm proposing wouldn't requre creating > an array in `create_search_table`. > > I'm saying that you do this: > > ```c++ > // Note: we require the supplier to provide the elements in the final order > as we can't easily sort > // within this method - qsort() accepts only pure function as comparator. > void PackedTableBuilder::fill(u1* table, size_t table_length, Supplier > &supplier) const { > Pair<uint32_t, uint32_t> kvs[4]; > > size_t offset = 0; > size_t len_read = 0; > while (len = supplier.next(kvs, 4)) { > // len tells you how many of the full capacity of kvs was used. > } > } > > > Now each call of `Supplier::next` gives you up to 4 elements, quartering the > amount of virtual calls necessary. > >>Alternatively, we could replace virtual calls with templated fill method. In >>that case the Comparator should be a template method as well, since that one >>is even more perf-critical? > > Sure, you can try that out. > > To be clear, I am only asking: Are virtual calls expensive enough here that > not having them, or amortizing their cost, something that makes the > performance better? The earlier version of this PR was not abstracting out PackedTable* so there were no virtual calls. In both the CCC.java and undisclosed customer reproducer I don't see a significant difference in performance - these involve whole JVM startup, so the efficiency of code with linear complexity is probably under the radar. If we want to optimize the hack out of it - yes, there would be space for that, maybe at the cost of maintainability and/or reusability. TLDR: I don't have a (micro)benchmark that would prove one thing is better than the other. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2138174164