Re: Cassandra in memory key index

Pavel Yaskevich Fri, 08 Jun 2012 13:04:27 -0700

Yeah, that is why I wrote "if possible" :) Also, does FST provide a predecessor 
lookup function, wasn't clear from the blog post?


On Friday 8 June 2012 at 22:53, Jason Rutherglen wrote:

> Yeah that's fine, however if there isn't a Java implementation that's a lot
> of extra work. The FST approach should be a clear quick and easy win. The
> current system of in heap keys and a binary search is what the FST replaced
> in Lucene. There are plenty of references to the improvement.
> 
> On Fri, Jun 8, 2012 at 3:50 PM, Pavel Yaskevich <[email protected] 
> (mailto:[email protected])> wrote:
> 
> > I would vote, if possible, to compare it with y-fast trie [1] (it doesn't
> > seem to be available java implementation unfortunately) by means of memory
> > efficiency and lookup performance. As we use big integer tokens the main
> > benefit from that trie could be O(log log M) predecessor lookup and compact
> > in-memory size.
> > 
> > [1] https://en.wikipedia.org/wiki/Y-fast_trie
> > 
> > Best Regards
> > --
> > Pavel Yaskevich
> > 
> > 
> > On Friday 8 June 2012 at 22:19, Jason Rutherglen wrote:
> > 
> > > Ok looks like the IndexSummary encapsulates everything, I can start with
> > > hacking that.
> > > 
> > > On Fri, Jun 8, 2012 at 11:50 AM, Jason Rutherglen <
> > > [email protected] (mailto:[email protected])> wrote:
> > > 
> > > > The Cassandra integration is probably beyond the time I have available.
> > > > If the locations in the code that need to be rewritten to use the FST
> > > > 
> > > 
> > > 
> > 
> > are
> > > > known, and a patch simply 'plugs-in' the FST, that would be much
> > > 
> > 
> > easier.
> > > > Eg, I don't know how Cassandra stores the current key index for
> > > 
> > 
> > example...
> > > > 
> > > > Basically I can write FST serializing, deserializing, and key lookup
> > code
> > > > fairly easy by basing it on Lucene's terms dict.
> > > > 
> > > > 
> > > > On Fri, Jun 8, 2012 at 6:00 AM, Radim Kolar <[email protected] 
> > > > (mailto:[email protected]) (mailto:
> > [email protected] (mailto:[email protected]))> wrote:
> > > > 
> > > > > 
> > > > > If you are interested I can help, I used the FST on a Hadoop project
> > > > > > to implement a fast map side range join.
> > > > > 
> > > > > 
> > > > > create JIRA item with patch attached, i will test it.

Re: Cassandra in memory key index

Reply via email to