On Tue, May 28, 2013 at 4:48 PM, Arun Kumar K <arunk...@gmail.com> wrote:
> Hi Guys,

Hi,

> I have been trying to understand DocValues and get some hands-on
> experience, and have observed a few things.
>
> I have added LongDocValuesField to the documents like:
> doc.add(new LongDocValuesField("id",1));
>
> 1> In 4.0 I saw that there are two versions of doc values:
>      RAM-resident (using Sources.getSources()) and on-disk
> (Sources.getDirectSources()).
>
>      But in 4.2 I get LongDocValues using
> "context.reader().getNumericDocValues(field)". Which type is this?
>      If this is RAM-based, is there any disk-based equivalent?

Indeed, doc values have changed a lot between 4.1 and 4.2. The way doc
values are stored now depends on the DocValuesFormat, and
getNumericDocValues returns whatever that format provides. For example,
the default format (Lucene42DocValuesFormat) currently stores data in
memory, while DiskDocValuesFormat (in lucene/codecs) stores data on
disk.

> 2> Can DocValuesField be used for search? I couldn't. Did I miss something?
>      "searcher.search(parser.parse("docvaluedfield:value"),100)"

Yes and no. The query parser can't deal with it, but you could, for
example, use FieldCacheRangeFilter to build a range query (potentially
matching a single value) on top of doc values. (When a field has doc
values, FieldCache automatically uses them instead of uninverting the
field.) While this will likely be slower for narrow ranges, it should
be very fast (probably even faster than a range query based on the
terms dictionary) for large ranges that match many documents.
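
To illustrate why range width matters so little for this approach, here
is a minimal sketch in plain Java. It is not Lucene code: the class
DocValuesRangeSketch and the method matchRange are hypothetical, and the
long[] merely stands in for a numeric doc values column with one value
per docID. It assumes the filter has to test each candidate document's
value against the bounds, so the cost depends on how many documents are
checked rather than on how many match.

```java
import java.util.BitSet;

// Plain-Java model, not Lucene code. "values" stands in for a numeric
// doc values column: one long per document, indexed by docID.
final class DocValuesRangeSketch {

    /** Returns the docIDs whose value lies in [min, max], both inclusive. */
    static BitSet matchRange(long[] values, long min, long max) {
        BitSet matches = new BitSet(values.length);
        // Every document is visited once, no matter how many match.
        for (int docId = 0; docId < values.length; docId++) {
            long v = values[docId];
            if (v >= min && v <= max) {
                matches.set(docId);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        long[] ids = {1L, 7L, 3L, 9L, 5L};
        System.out.println(matchRange(ids, 3L, 3L)); // prints {2}
        System.out.println(matchRange(ids, 2L, 8L)); // prints {1, 2, 4}
    }
}
```

Both calls do the same amount of work. A terms-dictionary range query,
by contrast, does work roughly proportional to the number of terms and
postings inside the range, which is why it tends to win for narrow
ranges and lose for wide ones.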

>      I am able to use it for sorting.
>      If possible, I want to avoid having a stored field in the index with
> the same "name" & "value" as the DocValuesField of the same
>      document, and still perform search.

While you can do that, I don't recommend it. For example, if you have
5 fields, loading all fields from stored fields requires at most 1
disk seek while loading all fields from doc values requires at least 5
disk seeks for disk-based doc values.

> 3> I have a reader opened on DirectoryReader with the docBaseInParent value
> as 0 (the first document's internal ID).
>      Even when I delete the first added document (with internal docID = 0)
> using some query, the docBaseInParent is not
>      updated to 1 (the next document's internal ID). I have committed the
> writer and called forceMergeDeletes, but it's the same.
>      I have also seen getLiveDocs().
>
>     Just curious to know the reasons for not updating the docBase?

Everything in Lucene is based on the fact that segments are immutable
except for deletes, which are only recorded in the live docs. Mutating
internal data such as docBaseInParent would make the design much more
complicated (and hence harder to reason about, to optimize, etc.).
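
To make this concrete, here is a toy model in plain Java. It is not
Lucene code, and the names SegmentSketch, delete and docBases are made
up for illustration. It assumes what composite readers do in practice:
a segment's docBase is the sum of maxDoc over the segments before it,
and a delete only clears a bit in the live docs.

```java
import java.util.BitSet;

// Toy model, not Lucene code: class, field and method names are hypothetical.
final class SegmentSketch {
    final int maxDoc;       // number of docIDs assigned in this segment; never changes
    final BitSet liveDocs;  // the only mutable part: a cleared bit marks a deleted doc

    SegmentSketch(int maxDoc) {
        this.maxDoc = maxDoc;
        this.liveDocs = new BitSet(maxDoc);
        this.liveDocs.set(0, maxDoc); // all documents start out live
    }

    /** Marks a document as deleted; docIDs and maxDoc are untouched. */
    void delete(int docId) {
        liveDocs.clear(docId);
    }

    /** Number of live (non-deleted) documents. */
    int numDocs() {
        return liveDocs.cardinality();
    }

    /** docBase of each segment = sum of maxDoc of the segments before it. */
    static int[] docBases(SegmentSketch[] segments) {
        int[] bases = new int[segments.length];
        int base = 0;
        for (int i = 0; i < segments.length; i++) {
            bases[i] = base;
            base += segments[i].maxDoc;
        }
        return bases;
    }

    public static void main(String[] args) {
        SegmentSketch[] segs = { new SegmentSketch(3), new SegmentSketch(2) };
        segs[0].delete(0); // delete the very first document
        int[] bases = docBases(segs);
        System.out.println(bases[0] + " " + bases[1]); // prints 0 3
        System.out.println(segs[0].numDocs());         // prints 2
    }
}
```

In this model the first segment's docBase is 0 by construction, since no
segments precede it, and deleting docID 0 changes only the live docs.
DocIDs are reassigned only when a merge writes a new segment.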

--
Adrien
