This thread kind of got off on a tangent about Solr specifics -- if you skip down, it's really a question about the underlying performance concerns of using docValues vs. using stored fields.

: 1. _version_ never needs to be searchable, thus, indexed=false makes sense.

Unless I'm wrong, the version field is involved in "search" contexts because of optimistic concurrency: for an "update doc=1 if version=42" request, a search is done against the version field under the covers. But since this is a fairly constrained filter, indexed=false might still be fine as long as docValues=true, because the search can be done via a DocValues-based filter.

: 4. Given the above, is using docValues=true for _version_ a good idea?

: My take is a simple “no”. Since docValues is, in essence, column
: oriented storage (and can be seen, I think, as an alternate index
: format), what benefit is to be gained for the _version_ field. The

To be clear -- Solr already has code that depends on having "Doc Values" on the version field to deal with the max version value in segments (see VersionInfo.getVersionFromIndex and VersionInfo.getMaxVersionFromIndex) -- but as with any field, that doesn't mean you must have 'docValues="true"' in your schema; the UninvertingReader can be used instead, as long as the field is indexed.

But that's really not what Ishan is asking about. We know it's possible to use docValues=true && indexed=false on the version field -- SOLR-6337 is open to decide if that makes sense in the sample configs.

Ishan's question is really about stored=false.

The key bit of context for Ishan's question is updateable docValues (SOLR-5944) and if/how they might be usable in Solr for the version field -- but one key aspect of doing that would be ensuring that we can *return* the correct version value to the user (for optimistic concurrency). Currently that's done with stored fields, but that wouldn't be feasible if we go down the route of updateable docValues, which means we would have to "return" the version field from the docValues.

That's where Ishan's question about docValues, performance, and disk seeks comes from...
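
For concreteness, the configuration being discussed (docValues only, neither indexed nor stored) would look roughly like the field definition below. This is only a sketch of a schema.xml fragment: the type name "long" is an assumption about what the schema's long field type is called, and whether this combination should go into the sample configs is exactly what is still undecided.

```xml
<!-- Sketch only: _version_ kept solely in docValues.
     type="long" is an assumed fieldType name; adjust to your schema. -->
<field name="_version_" type="long"
       indexed="false" stored="false" docValues="true"/>
```

With a definition like this, both the optimistic concurrency check and the value returned to the user would have to come from the docValues.
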
What are the downsides of saying "instead of using both docValues and stored fields for this single-valued int per doc, we're only going to use docValues, and when doing pagination we will return the current value of the field to the user from the docValues"? What kind of performance impact comes up in that case when you have 100 docs per page?

-Hoss
http://www.lucidworks.com/
