Hi Adrien,

Thanks for clarifying. I have some questions about sorting:

> While you can do that, I don't recommend it. For example, if you have
> 5 fields, loading all fields from stored fields requires at most 1
> disk seek while loading all fields from doc values requires at least 5
> disk seeks for disk-based doc values.

1> I am assuming the 5 fields mentioned are sortable fields on which sorting is done.
In my understanding, loading stored fields takes 1 disk seek to find the file pointer and 1 disk seek to read all the fields.
Since a separate file is maintained for each doc values field, we get 5 disk seeks + 1 disk seek for the file pointer.
If we have only one sortable field, which would be better? I guess there is no difference.
Also, I vaguely remember that there is some performance loss when sorting on strings in Lucene 4.0.
Will the decision then change for a String field, or depend on the field type?

2> Also, in my understanding, if we need to use parser-based queries on doc values, we need a stored field on the document with the same name and value as the document's doc value.
Even term queries won't work. Am I right here?

Thanks,
Arun

On 28-May-2013, at 8:31 PM, Adrien Grand <jpou...@gmail.com> wrote:

> On Tue, May 28, 2013 at 4:48 PM, Arun Kumar K <arunk...@gmail.com> wrote:
>> Hi Guys,
>
> Hi,
>
>> I have been trying to understand DocValues and get some hands on and have
>> observed few things.
>>
>> I have added LongDocValuesField to the documents like:
>> doc.add(new LongDocValuesField("id",1));
>>
>> 1> In 4.0 i saw that there are two versions for docvalues,
>> RAM Resident(using Sources.getSOurces()) & On
>> Disk(Sources.getDirectSources()).
>>
>> But in 4.2 i get LongDocValues using
>> "context.reader().getNumericDocValues(field) ". Which type is this ?
>> If this RAM based then is there any Disk-Based equivalent ?
>
> Indeed, doc values have changed a lot between 4.1 and 4.2. The way doc
> values are stored now depends on the DocValuesFormat. For example, the
> default format (Lucene42DocValuesFormat) today stores data in memory
> while we also have DiskDocValuesFormat (in lucene/codecs) which stores
> data on disk.
>
>> 2> Can DocValuesField be used for search ? I couldn't. Did i miss something?
>> "searcher.search(parser.parse("docvaluedfield:value"),100)"
>
> Yes and no.
> The query parser can't deal with it, but for example, you
> could use FieldCacheRangeFilter to build a range query (potentially
> matching a single value) on top of doc values. (When a field has doc
> values, Fieldcache will automatically use them instead of uninverting
> the field). While this will likely be slower for thin ranges, this
> should be very fast (probably even faster than a range query based on
> the terms dictionary) for large ranges that match many documents.
>
>> I am able to use for sorting.
>> If possible i want to avoid having a stored field in index with same
>> "name" & "Value" of DocValueField of same
>> document and perform search.
>
> While you can do that, I don't recommend it. For example, if you have
> 5 fields, loading all fields from stored fields requires at most 1
> disk seek while loading all fields from doc values requires at least 5
> disk seeks for disk-based doc values.
>
>> 3> I have a reader opened on DirectoryReader with the docBaseInParent value
>> as 0 (first documents internal ID).
>> Even when i delete the first added document (with internal docID = 0)
>> using some query the docBaseInParent is not
>> updated to 1(next documents internal ID). I have committed writer,
>> forceMergeDeletes but it's the same.
>> I have also seen getLiveDocs().
>>
>> Just curious to know the reasons for not updating the docBase ?
>
> Everything in Lucene is based on the fact that segments are immutable
> up to deletes. Starting to mutate internal data such as
> docBaseInParent would make the design much more complicated (hence
> harder to reason about, to optimize, etc.).
>
> --
> Adrien
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
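
To illustrate the point in the thread that segments are immutable up to deletes, here is a minimal toy sketch in plain Java. It is not Lucene code: ToySegment and all of its methods are hypothetical names made up for this illustration. It only assumes the behaviour described above, namely that a delete clears a bit in a live-docs set while the doc base and the document slots stay untouched.

```java
import java.util.BitSet;

// Toy model only: these are NOT Lucene classes. It mimics the idea that a
// segment's docBase and maxDoc are fixed and a delete only clears a bit.
final class ToySegment {
    private final int docBase;     // ID of the segment's first doc in the parent reader; never changes
    private final int maxDoc;      // number of doc slots, deleted ones included; never changes
    private final BitSet liveDocs; // set bit = document is still live

    ToySegment(int docBase, int maxDoc) {
        this.docBase = docBase;
        this.maxDoc = maxDoc;
        this.liveDocs = new BitSet(maxDoc);
        this.liveDocs.set(0, maxDoc); // every document starts out live
    }

    // Deleting marks the document as not live; nothing is renumbered.
    void delete(int localDocId) {
        if (localDocId < 0 || localDocId >= maxDoc) {
            throw new IndexOutOfBoundsException("docId out of range: " + localDocId);
        }
        liveDocs.clear(localDocId);
    }

    int docBase() { return docBase; }
    int maxDoc() { return maxDoc; }
    int numDocs() { return liveDocs.cardinality(); } // live documents only
    boolean isLive(int localDocId) { return liveDocs.get(localDocId); }

    public static void main(String[] args) {
        ToySegment seg = new ToySegment(0, 3);
        seg.delete(0); // delete the first document
        // prints: docBase=0 maxDoc=3 numDocs=2 live(0)=false
        System.out.println("docBase=" + seg.docBase() + " maxDoc=" + seg.maxDoc()
                + " numDocs=" + seg.numDocs() + " live(0)=" + seg.isLive(0));
    }
}
```

In this model, deleting document 0 lowers the live count but leaves docBase at 0, which mirrors the observation in question 3: consumers skip deleted documents by checking the live-docs bits rather than by expecting IDs to shift.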