Re: Block tree terms dict & index

Michael McCandless Wed, 01 May 2013 04:31:26 -0700

On Tue, Apr 30, 2013 at 7:57 PM, Beale, Jim (US-KOP) <jim.be...@hibu.com> wrote:


> We've just upgraded to 4.2 from 3.6 and suffered some performance degradation 
> in both indexing and retrieval.  We've had to eliminate compression, even 
> supplying our own NoCompression codec since there doesn't appear to be any 
> built in support for this.  Hopefully we're not overlooking something with 
> the compression.

Customizing your codec components to change or disable compression is
entirely normal... but it's curious you saw such a performance hit
from the compression.  Can you share more details?  Was it from
compressed stored fields or term vectors?  Or both?

> It did reduce the size of our indexes and thus our memory footprint but we 
> lost more on the LZ4 decompression than we gained by having more free memory.

OK.

> DocValues didn't help us either.  We attempted to create an in-memory cache, 
> using a separate index which we closed afterwards and performing a map reduce 
> to speed up access, but we didn't see any significant performance gains.

What were you using DocValues for (and how did you do it in 3.6)?

> What about block tree terms?  What is the use case for that feature?  I 
> noticed that benefits appeared in the spell correction tests but I'm still 
> not clear about how best to employ the codec.  Has anyone had any experience 
> with it?

The block tree terms dict should reduce both the time to load the
metadata for a given term and the memory required for the terms index
(which is loaded fully into RAM).  So term-heavy queries (primary-key
lookup, direct spell checker, fuzzy, certain automaton queries) see
the most gains.
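
To make the idea concrete, here is a rough toy sketch in plain Java.  It
is not Lucene's actual BlockTree code (which, as far as I can describe
it briefly, keeps an FST-based prefix index in RAM that points at
on-disk blocks of terms sharing a prefix).  The class name
BlockTermsSketch, its methods, and the use of a term's ordinal as
stand-in "metadata" are all hypothetical, chosen only to show why a
small in-RAM index plus one block scan makes single-term lookups cheap.

```java
import java.util.Arrays;
import java.util.List;

/** Toy model of a block-based terms dictionary; NOT Lucene's implementation. */
final class BlockTermsSketch {
    private final int blockSize;
    private final String[][] blocks; // stands in for on-disk blocks of sorted terms
    private final String[] index;    // in-RAM index: only the first term of each block

    /** Assumes sortedTerms is sorted ascending with no duplicates. */
    BlockTermsSketch(List<String> sortedTerms, int blockSize) {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        this.blockSize = blockSize;
        int n = sortedTerms.size();
        int numBlocks = (n + blockSize - 1) / blockSize; // ceiling division
        blocks = new String[numBlocks][];
        index = new String[numBlocks];
        for (int b = 0; b < numBlocks; b++) {
            int from = b * blockSize;
            int to = Math.min(from + blockSize, n);
            blocks[b] = sortedTerms.subList(from, to).toArray(new String[0]);
            index[b] = blocks[b][0];
        }
    }

    /** Returns the term's ordinal (our stand-in for term metadata), or -1 if absent. */
    long lookup(String term) {
        // Step 1: search the small in-RAM index for the only block that could hold the term.
        int pos = Arrays.binarySearch(index, term);
        int block = pos >= 0 ? pos : -pos - 2; // (insertion point) - 1
        if (block < 0) {
            return -1; // term sorts before every indexed term
        }
        // Step 2: search inside that single block (one "disk read" in a real system).
        int i = Arrays.binarySearch(blocks[block], term);
        return i >= 0 ? (long) block * blockSize + i : -1;
    }

    /** Number of entries held in RAM, versus the total number of terms. */
    int indexSize() {
        return index.length;
    }
}
```

With seven terms and a block size of 3, the in-RAM index holds only
three entries, and a lookup touches that index plus a single block.
A primary-key lookup is the extreme case of this access pattern: one
exact term per query, so the per-term lookup cost dominates.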

Mike McCandless

http://blog.mikemccandless.com
