Given a non-tokenized field that has DocValues, the primary (perhaps even the only?) reason for also making it stored seems to be document retrieval. When the goal is to construct documents, the basic difference between returning only the stored values and returning both stored and DocValues values seems to be performance: resolving a non-trivial number of stored values for a document is mostly a bulk operation, whereas resolving DocValues is closer to random access, with a separate lookup per field.
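The access-pattern difference can be pictured with a toy model. This is only a sketch: the dictionaries, field names and lookup counters below are made up for illustration and do not reflect Lucene's actual on-disk formats or API. Stored fields are modelled as one record per document, DocValues as one column per field.

```python
# Toy model only; not Lucene's real data structures.

# Stored fields: row-oriented, one record per document with all its stored fields.
stored = {
    0: {"id": "doc0", "title": "First", "author": "A"},
    1: {"id": "doc1", "title": "Second", "author": "B"},
}

# DocValues: column-oriented, one array per field, indexed by document number.
doc_values = {
    "id": ["doc0", "doc1"],
    "title": ["First", "Second"],
    "author": ["A", "B"],
}


def fetch_stored(doc_id, fields):
    """Return the requested fields and the number of lookups (always one record read)."""
    record = stored[doc_id]  # a single bulk read gives access to every stored field
    return {f: record[f] for f in fields}, 1


def fetch_doc_values(doc_id, fields):
    """Return the requested fields and the number of lookups (one per field)."""
    result = {}
    lookups = 0
    for f in fields:
        result[f] = doc_values[f][doc_id]  # separate column access for each field
        lookups += 1
    return result, lookups
```

In this model, fetching three fields for one document costs one lookup from stored fields and three from DocValues, which is the bulk versus random-access distinction described above.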
In most of our setups, search results are divided between overviews (the classic top-10 or top-20 list of the most relevant documents) and expanded views (a separate page or a result box that changes size). The overviews show little data per document and the expanded views show more. The data for overviews needs to be delivered quickly (stored), whereas the expanded views are fetched one document at a time and thus do not have the same time requirements (DocValues speed is fine).

As non-trivial space (15% in an index I am investigating) can be saved by using DocValues without storing, would it be an idea to provide support for retrieving DocValues fields as part of document retrieval? This could be done in different ways:

* Only return stored values with fl=*. If a field is referenced explicitly with fl=myfield and has DocValues but is not stored, return the DocValues value.
* Require a flag, resolvedv=true, to state that fields that have DocValues but are not stored should be returned.

- Toke Eskildsen, State and University Library, Denmark
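To make the two options above concrete, here is a minimal sketch of the field-resolution rules. It is not Solr code: `FieldInfo`, `expand`, `plan_option1`, `plan_option2` and the example schema are hypothetical names used only for illustration. For the second option it assumes the resolvedv flag applies both to fl=* and to explicitly named fields, which the proposal leaves open.

```python
from typing import Dict, List, NamedTuple


class FieldInfo(NamedTuple):
    stored: bool
    doc_values: bool


# Hypothetical example schema: "myfield" has DocValues but is not stored.
schema = {
    "id": FieldInfo(stored=True, doc_values=True),
    "title": FieldInfo(stored=True, doc_values=False),
    "myfield": FieldInfo(stored=False, doc_values=True),
}


def expand(fl: List[str], schema: Dict[str, FieldInfo]) -> List[str]:
    """Expand '*' to every schema field; keep explicit names that exist."""
    names: List[str] = []
    for f in fl:
        for name in (schema if f == "*" else [f]):
            if name in schema and name not in names:
                names.append(name)
    return names


def plan_option1(fl: List[str], schema: Dict[str, FieldInfo]) -> Dict[str, str]:
    """Option 1: fl=* gives stored fields only; explicit names may use DocValues."""
    explicit = {f for f in fl if f != "*"}
    plan = {}
    for name in expand(fl, schema):
        info = schema[name]
        if info.stored:
            plan[name] = "stored"
        elif info.doc_values and name in explicit:
            plan[name] = "docvalues"
    return plan


def plan_option2(fl: List[str], schema: Dict[str, FieldInfo],
                 resolvedv: bool = False) -> Dict[str, str]:
    """Option 2: non-stored DocValues fields are returned only if resolvedv is set."""
    plan = {}
    for name in expand(fl, schema):
        info = schema[name]
        if info.stored:
            plan[name] = "stored"
        elif info.doc_values and resolvedv:
            plan[name] = "docvalues"
    return plan
```

With this schema, `plan_option1(["*"], schema)` leaves out `myfield`, while `plan_option1(["myfield"], schema)` resolves it from DocValues; `plan_option2(["*"], schema, resolvedv=True)` includes it alongside the stored fields.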
