I use stored fields to load values for the following use cases:

- to return per-document values as is, requested by the user - similar to
  listing DB columns you are interested in, in a "select ..." clause.
- to perform aggregate function calculations while forming the result set
  (if requested).
- for group-by type queries (would like to switch to the native grouping
  API, but don't think it supports grouping on multiple fields, or
  aggregate functions).
- and finally, as I mentioned - to sort search results, also when requested.
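To illustrate the last item: the post-processing sort described in the original message has to tell a missing value apart from an empty string. The following is only a minimal sketch of that idea, not the implementation discussed in this thread. It assumes Java 8 or later and assumes each document's loaded values are held in a Map in which an absent key means the field is missing. The names MissingAwareSort, byField, sortByField and doc are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: sorts already-loaded per-document values in memory.
final class MissingAwareSort {

    // Orders documents by one field: missing values (null) first, then the
    // empty string, then the remaining values in natural String order.
    static Comparator<Map<String, String>> byField(final String field) {
        final Comparator<String> nullsFirst =
                Comparator.nullsFirst(Comparator.<String>naturalOrder());
        return (a, b) -> nullsFirst.compare(a.get(field), b.get(field));
    }

    // Returns a sorted copy; List.sort is stable, so ties keep input order.
    static List<Map<String, String>> sortByField(
            List<Map<String, String>> docs, String field) {
        List<Map<String, String>> sorted = new ArrayList<>(docs);
        sorted.sort(byField(field));
        return sorted;
    }

    // Builds a small document; a null name means the field is absent.
    static Map<String, String> doc(String id, String name) {
        Map<String, String> d = new HashMap<>();
        d.put("id", id);
        if (name != null) {
            d.put("name", name);
        }
        return d;
    }

    public static void main(String[] args) {
        List<Map<String, String>> docs = Arrays.asList(
                doc("1", "beta"), doc("2", ""), doc("3", null), doc("4", "alpha"));
        // Prints the ids in the order 3, 2, 4, 1.
        for (Map<String, String> d : sortByField(docs, "name")) {
            System.out.println(d.get("id"));
        }
    }
}
```

Because the sort is stable and missing values have a fixed position, documents with equal or absent values keep their incoming order between runs.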
Evidently, even for simple queries that don't require any of the
post-processing above but ask for a set of values from each document,
there's still a non-trivial amount of disk activity... hence, I started
second-guessing the implementation.


On Fri, Apr 4, 2014 at 3:00 PM, Uwe Schindler <u...@thetaphi.de> wrote:

> Hi,
>
> What are you doing with the stored fields? They are not deprecated and
> also not really slow, unless you scan over millions of documents in random
> access order. To display search results, DocValues are of no use.
>
> Uwe
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>
> > -----Original Message-----
> > From: Vitaly Funstein [mailto:vfunst...@gmail.com]
> > Sent: Friday, April 04, 2014 9:44 PM
> > To: java-user@lucene.apache.org
> > Subject: Stored fields and OS file caching
> >
> > I have heard here that stored fields don't work well with OS file
> > caching. Could someone elaborate on why that is? I am using Lucene 4.6
> > and we do use stored fields but not doc values; it appears most of the
> > benefit from the latter comes as improvement in sorting performance, and
> > I don't actually use Lucene for sorting at all; rather, it's done on a
> > post-processing basis, based on stored field values (in a nutshell, the
> > reason for this is Lucene's inability to tell apart terms that are empty
> > strings vs. a missing value, resulting in unstable sort order on such
> > fields).
> >
> > I am not sure if switching to using doc values fields from stored fields
> > entirely would help leverage OS file cache better... what worries me is
> > that when processing queries requesting multiple values from the
> > document, doc value fields could cause multiple disk seeks to fetch
> > values for each field, as opposed to just one with stored fields.
> >
> > Am I way off in my understanding of how this works? Any guidelines, as
> > general as they may be, are appreciated.