Lucene (using 3.5) seems to be caching field values for documents (after they
have been retrieved) and I am hoping someone can provide more information on
how and where exactly the field values are stored.
The table below lists the times (in milliseconds) associated with retrieving
for a set o
warmup queries before you swap your search
in to serve user queries to get best performance.
hope that helps
simon
On Fri, Feb 24, 2012 at 10:18 PM, Rose, Stuart J wrote:
>
> Lucene (using 3.5) seems to be caching field values for documents (after they
> have been retrieved) and I am hopi
I've noticed that processes that were previously IO bound (in 3.5) are now CPU
bound (in 4.4) and I expect it is due to the compression/decompression of term
vector fields in 4.4.
It would be nice if users of 4.4 could turn the compression OFF entirely.
-----Original Message-----
From: Iva
Is there an optimal way to access many document TermVectors (in the same chunk)
consecutively when using the LZ4 termvector compression?
I'm curious to know whether all TermVectors in a single compressed chunk are
decompressed and cached when one TermVector in the same chunk is accessed?
Also w
I'm using Lucene 4.4 with SortedSetDocValuesFacetFields and would like to add
and/or remove CategoryPaths for certain documents in the index.
Basically, as additional sets of docs are added, the CategoryPaths for some of
the previously indexed documents need to be changed.
My current testing with
I agree that comparing the BytesRef lengths in an equals() method seems counter
to the purpose of having a BytesRef class.
I'd recommend taking a look at the BytesRefHash which maps BytesRef objects to
unique ids as it 'may' be more efficient than converting to Strings.
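To illustrate the idea of mapping byte sequences to unique ids without going
through Strings, here is a minimal sketch that uses only the JDK. It is not
BytesRefHash and does not use the Lucene API: BytesIdMap and idFor are
hypothetical names, and a HashMap keyed by ByteBuffer stands in for Lucene's
hash, since ByteBuffer compares and hashes by content.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical JDK-only stand-in for the bytes-to-id idea; NOT Lucene's BytesRefHash.
final class BytesIdMap {
    // ByteBuffer.equals/hashCode are content-based, so equal byte ranges share a key.
    private final Map<ByteBuffer, Integer> ids = new HashMap<>();

    // Returns the id for bytes[offset, offset + length), assigning the next
    // dense id (0, 1, 2, ...) the first time a byte sequence is seen.
    int idFor(byte[] bytes, int offset, int length) {
        // Copy the range so later changes to the caller's array cannot corrupt the key.
        ByteBuffer key = ByteBuffer.wrap(Arrays.copyOfRange(bytes, offset, offset + length));
        Integer existing = ids.get(key);
        if (existing != null) {
            return existing;
        }
        int id = ids.size();
        ids.put(key, id);
        return id;
    }

    // Number of distinct byte sequences seen so far.
    int size() {
        return ids.size();
    }
}
```

The same slice of bytes gets the same id regardless of which array it came
from, and no String is ever created along the way.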
Stuart
-Original
Hi Steve,
We leveraged the SpanQuery and Highlighting APIs in 3.5 a couple of years ago
to do this. In order to get accurate doc hits for the types of phrases that we
needed to support search on, we defined a phrase query syntax and then
implemented a span query parser to create a nested struc
Hi Vijay,
...sorting the documents you need to retrieve by docID order first...
means sorting them by their 'document number' which is the value in the
'scoreDoc.doc' field and is the value that the reader takes to 'retrieve' the
document from the index. If you write a comparator to sort the
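
A rough sketch of sorting hits by document number before retrieval is shown
here. It is JDK-only: Hit is a hypothetical stand-in for Lucene's ScoreDoc
(just the 'doc' and 'score' fields), and DocIdOrder.sortedByDoc is an
illustrative name, not part of Lucene.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical stand-in for Lucene's ScoreDoc: only the two fields used here.
final class Hit {
    final int doc;      // document number, as in scoreDoc.doc
    final float score;  // relevance score

    Hit(int doc, float score) {
        this.doc = doc;
        this.score = score;
    }
}

final class DocIdOrder {
    // Returns a copy of the hits sorted by ascending document number,
    // leaving the original (score-ordered) array untouched.
    static Hit[] sortedByDoc(Hit[] hits) {
        Hit[] copy = Arrays.copyOf(hits, hits.length);
        Arrays.sort(copy, Comparator.comparingInt((Hit h) -> h.doc));
        return copy;
    }
}
```

With real search results, the sorted copy would be the one to loop over when
fetching documents from the reader, while the original array keeps the
relevance order for display.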
I've noticed some interesting and unexpected behavior regarding the performance
of Facets aggregation in Lucene 4.4 and am wondering if anyone has come across
this before and can offer insight into the potential factors.
In a nutshell, the CachingWrapperFilter results in significant performance
gain