Hi Alex,

Indeed, one or several documents (the number depends on the size of your documents) need to be fully decompressed in order to read a single field of a single document.
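
As a rough illustration of why this happens, here is a minimal sketch of block-level compression. It is not Lucene's stored fields format: `java.util.zip` stands in for the codec's compression, and the class name `ChunkDemo`, the methods `compressChunk` and `readDoc`, and the newline-separated layout are all made up for this example. The point it shows is that once several documents are compressed together, picking one of them means inflating the whole block first.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical model of a compressed chunk of documents (not a Lucene class).
class ChunkDemo {

    // Assumed layout: documents are joined with '\n' and compressed as ONE block.
    static byte[] compressChunk(String[] docs) {
        byte[] raw = String.join("\n", docs).getBytes(StandardCharsets.UTF_8);
        Deflater deflater = new Deflater();
        deflater.setInput(raw);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Reading a single document: the whole chunk is inflated before
    // the requested document can be located inside it.
    static String readDoc(byte[] chunk, int docIndex) {
        Inflater inflater = new Inflater();
        inflater.setInput(chunk);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    break; // truncated or unsupported input: stop instead of looping forever
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt chunk", e);
        } finally {
            inflater.end();
        }
        String all = new String(out.toByteArray(), StandardCharsets.UTF_8);
        return all.split("\n", -1)[docIndex];
    }
}
```

In this toy model, the cost of `readDoc` grows with the size of the whole chunk, not with the size of the one document (or field) that is actually wanted.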

Regarding the stored fields visitor, the default one doesn't return STOP when the field has been found because other fields with the same name might be stored further in the stream of stored fields (in the case of a multivalued field). If you know that you have a single field value, you can write your own field visitor that will return STOP after the first value has been read. As you noted, this probably has less impact on performance than the first point that you raised.

The default stored fields format is rather targeted at large indices, where compression helps save disk space and can also make stored fields retrieval faster since a larger portion of the stored fields can fit in the filesystem cache. However, if your index is small and fully fits in the filesystem cache, this stored fields format might indeed have non-negligible overhead.

On Wed, Apr 9, 2014 at 9:17 PM, Alex Parvulescu <alexparvule...@apache.org> wrote:
> Hi,
>
> I was investigating some performance issues and during profiling I noticed
> that there is a significant amount of time being spent decompressing fields
> which are unrelated to the actual field I'm trying to load from the Lucene
> documents. In our benchmark doing mostly a simple full-text search, 40% of
> the time was lost in these parts.
>
> My code does the following: reader.document(id, Set(":path")).get(":path"),
> and this is where the fun begins :)
> I noticed 2 things, please excuse the ignorance if some of the things I
> write here are not 100% correct:
>
> - all the fields in the document are being decompressed prior to applying
> the field filter. We've noticed this because we have a lot of content
> stored in the index, so there is an important time lost around
> decompressing junk. At one point I tried adding the field first, thinking
> this will save some work, but it doesn't look like it's doing much.
> Reference code, the visitor is only used at the very end. [0]
>
> - second, and probably of a smaller impact would be to have the
> DocumentStoredFieldVisitor return STOP when there are no more fields in the
> visitor to visit. I only have one, and it looks like it will #skip through
> a bunch of other stuff before finishing a document. [1]
>
> thanks in advance,
> alex
>
>
> [0]
> https://svn.apache.org/viewvc/lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/codecs/compressing/CompressingStoredFieldsReader.java?view=markup#l364
>
> [1]
> https://svn.apache.org/viewvc/lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/document/DocumentStoredFieldVisitor.java?view=markup#l100

--
Adrien
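
The single-value visitor suggested in the reply can be sketched as follows. This is a self-contained model, not Lucene code: the `Status` enum mirrors the YES/NO/STOP answers of Lucene's `StoredFieldVisitor.Status`, while `SingleValueVisitorDemo`, `SingleFieldVisitor` and the `visit` loop are simplified stand-ins invented for this example so that it runs without the Lucene jar.

```java
// Simplified stand-ins for illustration only; none of these are Lucene classes.
class SingleValueVisitorDemo {

    // Same three answers as Lucene's StoredFieldVisitor.Status.
    enum Status { YES, NO, STOP }

    // Visitor that wants exactly one value of one field.
    static class SingleFieldVisitor {
        private final String wanted;
        private String value; // null until the field has been read

        SingleFieldVisitor(String wanted) {
            this.wanted = wanted;
        }

        Status needsField(String fieldName) {
            if (value != null) {
                return Status.STOP; // value already read: nothing else is needed
            }
            return wanted.equals(fieldName) ? Status.YES : Status.NO;
        }

        void stringField(String fieldName, String fieldValue) {
            value = fieldValue;
        }

        String value() {
            return value;
        }
    }

    // Hypothetical reader loop over {name, value} pairs of one document.
    // Returns how many fields were examined before the visitor said STOP.
    static int visit(String[][] storedFields, SingleFieldVisitor visitor) {
        int examined = 0;
        for (String[] field : storedFields) {
            Status status = visitor.needsField(field[0]);
            if (status == Status.STOP) {
                break; // skip the remaining fields of this document
            }
            examined++;
            if (status == Status.YES) {
                visitor.stringField(field[0], field[1]);
            }
        }
        return examined;
    }
}
```

With Lucene 4.x itself, the equivalent would be a subclass of `org.apache.lucene.index.StoredFieldVisitor` that overrides `needsField(FieldInfo)` and `stringField(FieldInfo, String)` and is passed to `IndexReader.document(int, StoredFieldVisitor)`. As the reply notes, this only avoids walking the remaining fields; the compressed chunk is still decompressed before the visitor is consulted.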