Yeah, that is why I wrote "if possible" :) Also, does FST provide a predecessor lookup function, wasn't clear from the blog post?
On Friday 8 June 2012 at 22:53, Jason Rutherglen wrote: > Yeah that's fine, however if there isn't a Java implementation that's a lot > of extra work. The FST approach should be a clear quick and easy win. The > current system of in heap keys and a binary search is what the FST replaced > in Lucene. There are plenty of references to the improvement. > > On Fri, Jun 8, 2012 at 3:50 PM, Pavel Yaskevich <pove...@gmail.com > (mailto:pove...@gmail.com)> wrote: > > > I would vote, if possible, to compare it with y-fast trie [1] (it doesn't > > seem to be available java implementation unfortunately) by means of memory > > efficiency and lookup performance. As we use big integer tokens the main > > benefit from that trie could be O(log log M) predecessor lookup and compact > > in-memory size. > > > > [1] https://en.wikipedia.org/wiki/Y-fast_trie > > > > Best Regards > > -- > > Pavel Yaskevich > > > > > > On Friday 8 June 2012 at 22:19, Jason Rutherglen wrote: > > > > > Ok looks like the IndexSummary encapsulates everything, I can start with > > > hacking that. > > > > > > On Fri, Jun 8, 2012 at 11:50 AM, Jason Rutherglen < > > > jason.rutherg...@gmail.com (mailto:jason.rutherg...@gmail.com)> wrote: > > > > > > > The Cassandra integration is probably beyond the time I have available. > > > > If the locations in the code that need to be rewritten to use the FST > > > > > > > > > > > > > > are > > > > known, and a patch simply 'plugs-in' the FST, that would be much > > > > > > > easier. > > > > Eg, I don't know how Cassandra stores the current key index for > > > > > > > example... > > > > > > > > Basically I can write FST serializing, deserializing, and key lookup > > code > > > > fairly easy by basing it on Lucene's terms dict. > > > > > > > > > > > > On Fri, Jun 8, 2012 at 6:00 AM, Radim Kolar <h...@filez.com > > > > (mailto:h...@filez.com) (mailto: > > h...@filez.com (mailto:h...@filez.com))> wrote: > > > > > > > > > > > > > > If you are interested I can help, I used the FST on a Hadoop project > > > > > > to implement a fast map side range join. > > > > > > > > > > > > > > > create JIRA item with patch attached, i will test it.