On Fri, 16 May 2025 23:51:39 GMT, John R Rose <jr...@openjdk.org> wrote:
>> Radim Vansa has updated the pull request incrementally with five additional >> commits since the last revision: >> >> - Revert change in array.hpp >> - Revert changes in VerifyRawIndexesTest >> - Improve FieldInfo::print >> - Load constant for SORTED_FIELD_TABLE_THRESHOLD >> - Replace JumpTable with SortedFieldTable > > This change adds smaller "fast-forward" table to accelerate random-access > queries within a larger stream. > > I am not against this change, but I think it should be refactored (either now > or later) into a library that can be used with other streamy data. I'm > thinking of debug-info, dependency lists, line number tables, reloc-info, or > maybe PC descs as future clients of such a thing. > > It could be a query accelerator (index), for any kind of "streamy data" that > can get long, and needs occasional random access. > > It is good to switch to binary search for big tables, while preserving the > speed of linear search for small ones. A similar technique can also be seen > in `PcDescContainer::find_pc_desc_internal`. Both binary-search algorithms > are complicated and difficult to validate. If we had a suitable library, we > could tune it and test it carefully, and use it to simplify field streams, PC > descs, and whatever else we need to work with. > > BTW, it's sometimes useful to add a front-side cache for such searches. This > is a bookmark of the last query point. This helps with queries which are > correlated; you don't re-search from scratch on every step. This is the > function of `last_pc_desc` in the PC desc query; it can accelerate the binary > search. > > The fast-forward table here uses an ad hoc var-int scheme which is delicate > and possibly buggy. > > One way to do without the specialized table is to inject synchronization > markers into the streamy data itself. Note that the null byte is never used > to encode UNSIGNED5 data. Therefore, if you have a stream of UNSIGNED5 > tokens, you can add boundaries encoded null bytes. In the present > application, you could perform binary search on the raw stream pointer, with > each probe first re-synchronizing to the first null byte, and then decoding > with a short sprint of linear access. The placement of null bytes is a > tuning tactic: More null bytes add footprint but decrease the length of the > short sprints. > > I'm not asking for any changes at this point because I have not read the PR > carefully enough. And I don't intend to hold up reviews, but I do want us to > put refactoring on our roadmap. @rose00 While refactoring might be useful, this PR is trying to address a performance regression (in fact by improving the scenario significantly), and I'd prefer that to land in 25 rather than to delay just for the sake of better 'reusability'. And TBH I don't consider binary search algorithm too difficult to validate. While a simple cache might be useful, I don't have a good example to validate its usefulness. And turning a read-only scenario into mutating scenario can turn out badly for multithreaded code, as in https://bugs.openjdk.org/browse/JDK-8180450 . While you suggest 0 byte as synchronization markers, I've tried to synchronize the stream in #24713 and experiments have shown that while it improves things, it was not sufficient in my case. That's why I've abandoned that approach. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2889870320