On Tue, Apr 9, 2013 at 9:06 AM, Wei Wang <welshw...@gmail.com> wrote:
> Thanks for the hint. Could you point to some Codec that might do this for
> some types, even just as a side effect as you mentioned? It will be
> helpful to have something to start with.
>

Have a look at the diskdv/ codec in the codecs/ module. It's a lot simpler than the default codec because it doesn't have the "tradeoff speed for space" performance hacks of the default codec. It might already do something that's good enough for your needs.

>
> And could you elaborate a bit more for "the facet on tons of sparse
> fields"? I just got a vague idea from the comments.
>

Look at the lucene/facet module. As opposed to applications like Solr and Elasticsearch, which would build fieldcaches/docvalues/whatever on hundreds of "fields", I think this one uses just a single binary docvalues field to implement ordinal storage across all "fields" (I think it calls them dimensions or something else). Of course you can simulate this yourself with other approaches too.
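
To make the "single binary field for all dimensions" idea concrete, here is a rough sketch of how you could simulate it yourself. It is not the facet module's actual code or encoding, and it uses no Lucene APIs; the class name PackedOrdinals and its methods are made up for illustration. The assumption is that every (dimension, value) pair has already been assigned a non-negative integer ordinal from one global table, so a document only needs to store its own list of ordinals, packed into one byte[] that could then go into a single binary docvalues field.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Lucene: packs all of a document's
// category ordinals (across every "dimension") into one byte[].
class PackedOrdinals {

    // Sort the ordinals, then write each delta as a variable-length int
    // (7 data bits per byte, high bit set when more bytes follow).
    // Assumes all ordinals are non-negative.
    static byte[] encode(int[] ordinals) {
        int[] sorted = ordinals.clone();
        Arrays.sort(sorted);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int prev = 0;
        for (int ord : sorted) {
            int delta = ord - prev;
            while ((delta & ~0x7F) != 0) {
                out.write((delta & 0x7F) | 0x80);
                delta >>>= 7;
            }
            out.write(delta);
            prev = ord;
        }
        return out.toByteArray();
    }

    // Reverse of encode: read each variable-length delta and accumulate
    // it to recover the ordinals in ascending order.
    static int[] decode(byte[] bytes) {
        List<Integer> result = new ArrayList<>();
        int prev = 0;
        int i = 0;
        while (i < bytes.length) {
            int delta = 0;
            int shift = 0;
            byte b;
            do {
                b = bytes[i++];
                delta |= (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            prev += delta;
            result.add(prev);
        }
        return result.stream().mapToInt(Integer::intValue).toArray();
    }
}
```

With something like this, a document tagged with ordinals 300, 5 and 7 costs a few bytes in one field no matter how many dimensions exist overall, instead of one sparse field per dimension. Counting facets then means decoding each matching document's bytes and incrementing a counter per ordinal.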