Re: How can Docvalues so efficient

2016-06-05 Thread Adrien Grand
On Wed, 1 Jun 2016 at 05:32, Ting Yao wrote: > In my opinion, the field data (we call it uninverted index data) can be stored on disk in doc id order. When we need most fields' values at a time, is this the more efficient way when the field data is not very big? And if it is store

Re: How can Docvalues so efficient

2016-05-31 Thread Ting Yao
Sorry for the delay, and thank you for your answers. Can I understand it like this: because the doc values are stored on disk, when the Searcher needs a few values of a field, it reads the disk to get them. Lucene stores the start position of *every* field. So when it reads the disk, it can find the start

Re: How can Docvalues so efficient

2016-05-30 Thread Adrien Grand
When executing queries, Lucene has an abstraction called Scorer, which is responsible for returning matching documents in doc id order. Since doc values are stored on disk in doc id order, reads are sequential. There is an adversarial case when few documents match, since you might need to jump over la
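To illustrate the point about doc id order, here is a minimal sketch in Python. It is not Lucene's actual file format or API: the fixed 8-byte width and the names `write_column` and `read_values` are assumptions made only for this example. It shows that when matching doc ids arrive in ascending order, the offsets read from a column laid out in doc id order only ever move forward.

```python
import struct

WIDTH = 8  # bytes per value; a fixed width is an assumption of this sketch


def write_column(values):
    """Pack one numeric value per document, laid out in doc id order."""
    return b"".join(struct.pack("<q", v) for v in values)


def read_values(column, matching_doc_ids):
    """Yield (doc_id, value, offset) for doc ids supplied in ascending order."""
    last_offset = -1
    for doc_id in matching_doc_ids:
        offset = doc_id * WIDTH
        # Ascending doc ids imply forward-only offsets, i.e. a sequential scan.
        if offset <= last_offset:
            raise ValueError("doc ids must be strictly ascending")
        last_offset = offset
        (value,) = struct.unpack_from("<q", column, offset)
        yield doc_id, value, offset


# Hypothetical column for six documents; docs 1, 2 and 5 match the query.
column = write_column([10, 20, 30, 40, 50, 60])
hits = list(read_values(column, [1, 2, 5]))
```

The gap between doc 2 and doc 5 is the kind of jump the message mentions for queries that match few documents: the read still moves forward, but it skips over data in between.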

Re: How can Docvalues so efficient

2016-05-30 Thread Ting Yao
Thank you very much for answering me. But could you explain how Lucene reads the doc values files sequentially? 2016-05-30 18:15 GMT+08:00 Adrien Grand: > Doc values do indeed need to be read from disk. However, the fact that Lucene reads the doc values files sequentially (disks perform better at s

Re: How can Docvalues so efficient

2016-05-30 Thread Adrien Grand
Doc values do indeed need to be read from disk. However, Lucene reads the doc values files sequentially (disks perform better at sequential access than random access), and the filesystem cache keeps hot regions of the doc values files in memory, which usually helps keep performance close

How can Docvalues so efficient

2016-05-30 Thread Ting Yao
Hi all, I have been reading the Lucene source code recently, and we also use Elasticsearch as our search engine. As far as I know, Elasticsearch performance is pretty good, and Elasticsearch is based on Lucene. So I am wondering how it can search words so fast when the field data (uninve