This thread kind of got off onto a tangent about Solr specifics -- if you
skip down, it's really a question about the underlying performance concerns
of using docValues vs using stored fields.

: 1.      _version_ never needs to be searchable, thus, indexed=false makes
: sense.

Unless I'm wrong, the version field is involved in "search" contexts
because of optimistic concurrency -- in order to do an "update doc=1 if
version=42", a search is done under the covers against the version
field -- but since this is a fairly constrained filter, indexed=false
might still be fine as long as docValues=true, because the search can be
done via a DocValues-based filter.
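
To make the "update doc=1 if version=42" idea concrete, here is a toy
sketch in Python. It is not Solr code: ToyIndex, update_if_version and
VersionConflict are made-up names, and the version counter is just an
assumed monotonically increasing number. It only illustrates that the
conditional update needs a lookup of the current version per doc.

```python
class VersionConflict(Exception):
    """Raised when the caller's expected version does not match."""


class ToyIndex:
    """Hypothetical in-memory stand-in for an index; not Solr code."""

    def __init__(self):
        self.docs = {}      # doc id -> field dict
        self.versions = {}  # doc id -> current version (one value per doc)
        self.clock = 0      # assumed monotonically increasing version source

    def _next_version(self):
        self.clock += 1
        return self.clock

    def add(self, doc_id, fields):
        self.docs[doc_id] = dict(fields)
        self.versions[doc_id] = self._next_version()
        return self.versions[doc_id]

    def update_if_version(self, doc_id, expected, fields):
        # The "search" step: look up the current version for this doc.
        current = self.versions.get(doc_id)
        if current != expected:
            raise VersionConflict(
                f"doc {doc_id}: expected {expected}, found {current}"
            )
        self.docs[doc_id] = dict(fields)
        self.versions[doc_id] = self._next_version()
        return self.versions[doc_id]
```

A second update that reuses a stale version raises VersionConflict, which
is the behavior the client relies on.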

: 4.      Given the above, is using docValues=true for _version_ a good idea?

: My take is a simple “no”.  Since docValues is, in essence, column 
: oriented storage (and can be seen, I think, as an alternate index 
: format), what benefit is to be gained for the _version_ field.  The 

To be clear -- Solr already has code that depends on having "Doc Values"
on the version field to deal with the max version value in segments (see
VersionInfo.getVersionFromIndex and VersionInfo.getMaxVersionFromIndex) --
but as with any field, that doesn't mean you must have 'docValues="true"'
in your schema; instead the UninvertingReader can be used as long as the
field is indexed.
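
Roughly, "uninverting" means rebuilding a per-document column from the
term -> docs postings of an indexed field. A toy Python sketch of that
idea follows; uninvert and max_version are hypothetical helpers for a
single-valued field, not the Lucene or Solr implementation.

```python
def uninvert(postings, max_doc):
    """Build a per-document column from term -> doc id postings.

    Assumes a single-valued field; docs without a value get None.
    """
    column = [None] * max_doc
    for term, doc_ids in postings.items():
        for doc_id in doc_ids:
            column[doc_id] = term
    return column


def max_version(column):
    """Largest version in one segment's column, or None if it is empty."""
    values = [v for v in column if v is not None]
    return max(values) if values else None
```

With real docValues the column already exists on disk, so the first step
is skipped and only the scan for the max remains.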

But that's really not what Ishan is asking about.  

We know it's possible to use docValues=true && indexed=false on the 
version field -- SOLR-6337 is open to decide if that makes sense in the 
sample configs.  Ishan's question is really about stored=false.

The key bit of context for Ishan's question is updateable docValues
(SOLR-5944) and if/how it might be usable in Solr for the version field --
but one key aspect of doing that would be ensuring that we can *return*
the correct version value to the user (for optimistic concurrency).
Currently that's done with stored fields, but that wouldn't be feasible if
we go down the route of updateable docValues, which means we would have to
"return" the version field from the docValues.

That's where Ishan's question about docValues and performance and disk
seeks comes from...

What are the downsides of saying "instead of using docValues and stored
fields for this single-valued int per doc, we're only going to use
docValues, and when doing pagination we will return the current value of
the field to the user from the docValues"?  What kind of performance
impacts come up in that case when you have 100 docs per page?
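
For anyone who wants to picture the access pattern being asked about,
here is a toy Python sketch of filling in the version for one page of
results from per-segment columns. ToySegment and versions_for_page are
made-up names, and the model assumes each segment holds one value per doc
addressed by (global doc id - doc_base). It says nothing about the actual
seek or decoding cost, which is the open question.

```python
from bisect import bisect_right


class ToySegment:
    """Hypothetical segment: a doc_base plus a per-doc version column."""

    def __init__(self, doc_base, versions):
        self.doc_base = doc_base
        self.versions = versions


def versions_for_page(segments, page_doc_ids):
    """Return {global doc id: version} for one page of results.

    segments must be sorted by doc_base. Doc ids are visited in
    increasing order so each segment's column is read front to back.
    """
    bases = [s.doc_base for s in segments]
    out = {}
    for doc_id in sorted(page_doc_ids):
        # Find the last segment whose doc_base is <= doc_id.
        seg = segments[bisect_right(bases, doc_id) - 1]
        out[doc_id] = seg.versions[doc_id - seg.doc_base]
    return out
```

For a 100-doc page that is 100 single-value lookups, potentially spread
across every segment in the index.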


-Hoss
http://www.lucidworks.com/