Re: problem found with DiskDocValuesFormat

2013-10-22 Thread Duke DAI
Thanks, Mike. I finally figured out the root cause. I use threads from Thread-Pool-1 to probe indexes in parallel across multiple collections, but consume the documents on threads from Thread-Pool-2, and I hold the same DocValue object reference to get values. After paying attention to the thread switch, the pr
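One way to read the root cause described above is that a value source obtained on a probing thread was later used on a consuming thread. The sketch below uses plain Java only and is not Lucene code: `HandoffSketch`, `ValueSource`, `valueFor` and `probeThenConsume` are hypothetical names, and the owner-thread check is an illustrative guard that Lucene itself does not perform. It shows the safer handoff, where only the doc id crosses between pools and the consuming thread obtains its own value source.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration of handing work between two thread pools; not Lucene's API.
public class HandoffSketch {
    // Stand-in for a thread-bound value source; remembers which thread created it.
    static final class ValueSource {
        private final Thread owner = Thread.currentThread();

        String valueFor(int docId) {
            // Illustrative guard only: fail loudly if used on another thread.
            if (Thread.currentThread() != owner) {
                throw new IllegalStateException("used on a different thread than it was obtained on");
            }
            return "value-" + docId;
        }
    }

    // Probe on one pool, consume on another; only the doc id crosses the boundary.
    static String probeThenConsume(int docId) throws Exception {
        ExecutorService probePool = Executors.newSingleThreadExecutor();
        ExecutorService consumePool = Executors.newSingleThreadExecutor();
        try {
            // Stand-in for running a search on the probing pool.
            Future<Integer> hit = probePool.submit(() -> docId);
            int id = hit.get();
            // The consuming thread creates its own ValueSource instead of reusing
            // one that was obtained on the probing thread.
            Future<String> value = consumePool.submit(() -> new ValueSource().valueFor(id));
            return value.get();
        } finally {
            probePool.shutdown();
            consumePool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(probeThenConsume(7));
    }
}
```

The design choice here is to pass plain data (the doc id) between pools and to acquire anything stateful on the thread that will actually use it.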

Re: problem found with DiskDocValuesFormat

2013-10-21 Thread Michael McCandless
It's perfectly fine, and recommended, to reuse a thread across different queries (ie, use a thread pool in your app, up above Lucene). The ThreadLocals used in SegmentCoreReaders should not interfere or cause problems with that: they can easily be re-used across queries. Maybe you can boil down t

Re: problem found with DiskDocValuesFormat

2013-10-21 Thread Duke DAI
Hi Mike, In my scenario, a query thread from a ThreadPool is used to execute each query, so threads have to be reused to handle various queries. Since SegmentCoreReaders uses a ThreadLocal to hold a per-thread instance, I think some private variables must belong to the given thread (file offset? I di

Re: problem found with DiskDocValuesFormat

2013-10-21 Thread Michael McCandless
Can you describe what problem you are actually hitting? The purpose of docValuesLocal is to hold the per-Thread instance of each doc values, and re-use it when that thread comes back again asking for the same doc values. Mike McCandless http://blog.mikemccandless.com On Mon, Oct 21, 2013 at 6:
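The per-thread reuse described above can be sketched in plain Java without any Lucene classes. This is only an illustration of the ThreadLocal pattern, not the actual SegmentCoreReaders code: `PerThreadCacheSketch`, `FieldCursor` and `get` are hypothetical names chosen for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of a per-thread cache; not Lucene's implementation.
public class PerThreadCacheSketch {
    // Stand-in for a stateful per-field reader (for example, one that tracks a file position).
    static final class FieldCursor {
        final String field;
        final Thread owner = Thread.currentThread();

        FieldCursor(String field) {
            this.field = field;
        }
    }

    // Each thread sees its own map, created lazily on that thread's first access.
    private final ThreadLocal<Map<String, FieldCursor>> local =
        ThreadLocal.withInitial(HashMap::new);

    // Returns the calling thread's cursor for the field, creating it once per thread.
    FieldCursor get(String field) {
        return local.get().computeIfAbsent(field, FieldCursor::new);
    }

    public static void main(String[] args) throws InterruptedException {
        PerThreadCacheSketch cache = new PerThreadCacheSketch();
        FieldCursor first = cache.get("id");
        FieldCursor again = cache.get("id");
        System.out.println("same thread reuses instance: " + (first == again));

        FieldCursor[] other = new FieldCursor[1];
        Thread t = new Thread(() -> other[0] = cache.get("id"));
        t.start();
        t.join();
        System.out.println("other thread gets its own: " + (other[0] != first));
    }
}
```

A pooled thread that comes back for a later query gets the same instance again, which is why reusing threads across queries is compatible with this pattern.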

Re: problem found with DiskDocValuesFormat

2013-10-21 Thread Duke DAI
Hi guys, It seems I have the same problem with Lucene45DocValuesFormat, but no problem with MemoryDocValuesFormat. The problem I encountered with Lucene 4.4 is with DiskDocValuesFormat, not with Lucene42DocValuesFormat. I dug into it a little and found the superficial cause. In SegmentCoreReaders, there is a

Re: problem found with DiskDocValuesFormat

2013-08-22 Thread Sean Bridges
Thanks for the answers, and thanks for the changes to load doc values to disk, it will be nice to use a supported codec. Upgrading our indexes is not an option, as they are very large. Sean On Wed, Aug 21, 2013 at 11:15 PM, Robert Muir wrote: > On Thu, Aug 22, 2013 at 1:48 AM, Sean Bridges >

Re: problem found with DiskDocValuesFormat

2013-08-21 Thread Robert Muir
On Thu, Aug 22, 2013 at 1:48 AM, Sean Bridges wrote:
> Is there a supported DocValuesFormat that doesn't load all the values into
> ram?
Not with any current release, but in lucene 4.5 if all goes well, the official implementation will work that way (I spent essentially the last entire week on th

Re: problem found with DiskDocValuesFormat

2013-08-21 Thread Sean Bridges
Is there a supported DocValuesFormat that doesn't load all the values into ram? Our use case is that we have 16 byte ids for all our documents. We used to store the ids in stored fields, and look up the stored field for each search hit. We got much better performance when we switched to stori

Re: problem found with DiskDocValuesFormat

2013-08-21 Thread Robert Muir
On Wed, Aug 21, 2013 at 11:30 AM, Sean Bridges wrote:
> What is the recommended way to use DiskDocValuesFormat in production if we
> can't reindex when we upgrade?
I'm not going to recommend using any experimental codecs in production, but... 1. with 4.3 jar file: IWC.setCodec(Codec.getDefault()

Re: problem found with DiskDocValuesFormat

2013-08-21 Thread Sean Bridges
What is the recommended way to use DiskDocValuesFormat in production if we can't reindex when we upgrade? Will the 4.4 version of DDVF be backwards compatible, or should we make our own copy of DDVF and give it a different codec name to protect ourselves against incompatible changes? Thanks, Sea

Re: problem found with DiskDocValuesFormat

2013-08-13 Thread Duke DAI
Hi Mike, Thanks for your quick response. All data was newly indexed, so compatibility is not the culprit. Could it be a multi-threading issue? I share IndexReaders between different IndexSearchers. I have no evidence for this guess, because I have many multi-threaded test cases and they passed, but t

Re: problem found with DiskDocValuesFormat

2013-08-13 Thread Michael McCandless
DiskDVFormat does not have index back compatibility between minor releases; maybe that's what you are seeing? So, you must fully re-index any DiskDVFormat field after upgrading ... Only the default formats support index back compatibility between releases. Mike McCandless http://blog.mik

problem found with DiskDocValuesFormat

2013-08-13 Thread Duke DAI
Hi experts, I'm upgrading to Lucene 4.4 and trying to use DocValues instead of stored fields for performance reasons. Because the index size is unknown (it depends on the customer), I will use DiskDocValuesFormat, especially for some binary fields. So I wrote my customized Codec: final Codec codec = n