Hi Johan. I've just read the whole of this thread.
I didn't quite understand your original point (2) that "token-based suffix scanning will not be as fast as byte-based suffix scanning". Sure it won't, but is there any reason you mentioned suffix scanning there specifically? The same is true of prefix scanning, of course. And both of them could be fast enough, I assume, if you disable the hash calculation in the "get token" callbacks like you were talking about.

But I don't think that necessarily affects the main point. It looks like you've thoroughly investigated using a token-based approach. Thank you for doing so. I initially felt it was worth investigating in the hope that you might find some fairly straightforward and self-contained modification to the existing token-handling layer. I think the result of this investigation, in which you needed to add token-fetch-backwards callbacks and so on, shows that this approach is too complex. I don't want to see a complex implementation. Therefore I support your inclination to abandon that approach and use the byte-wise approach instead.

- Julian