Re: Stored fields: decompression slows down in my scenario ... any idea for a workaround?

2013-06-24 Thread Mathias Lux
Hi! Thanks!! I'll try the DocValues for sure, and of course the smaller chunk size. Just to add up on the number of bytes stored: it's for instance 72 bytes for CEDD, ~96 for JCD, 64 bytes for OpponentHistogram, etc. and there is 0 wrote: > Uwe, > I think Mathias was talking about the case with ma

Re: Stored fields: decompression slows down in my scenario ... any idea for a workaround?

2013-06-24 Thread Adrien Grand
Hi, On Sun, Jun 23, 2013 at 9:08 PM, Savia Beson wrote: > I think Mathias was talking about the case with many smallish fields that all > get read per document. DV approach would mean seeking N times, while stored > fields, only once? Or you meant he should encode all his fields into single
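The "encode all his fields into single" idea raised in the quoted question can be sketched as follows. This is only an illustration using the JDK, not code from the thread: the class name FeaturePacker and the length-prefixed layout are assumptions, and the resulting byte array is what one would then hand to a single binary field per document.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Lucene): packs several small feature
// vectors into one byte[] so that a document needs only one binary value.
final class FeaturePacker {

    // Layout (an assumption of this sketch): [int length][bytes] per feature.
    static byte[] pack(List<byte[]> features) {
        int size = 0;
        for (byte[] f : features) {
            size += Integer.BYTES + f.length;
        }
        ByteBuffer buf = ByteBuffer.allocate(size);
        for (byte[] f : features) {
            buf.putInt(f.length);
            buf.put(f);
        }
        return buf.array();
    }

    // Reverses pack(): reads length-prefixed features until the buffer is empty.
    static List<byte[]> unpack(byte[] packed) {
        ByteBuffer buf = ByteBuffer.wrap(packed);
        List<byte[]> out = new ArrayList<>();
        while (buf.hasRemaining()) {
            byte[] f = new byte[buf.getInt()];
            buf.get(f);
            out.add(f);
        }
        return out;
    }
}
```

With the feature sizes mentioned earlier in the thread (72, ~96 and 64 bytes), the packed value costs 12 extra bytes for the three length prefixes. If every document stores the same features in the same order, fixed offsets could replace the prefixes.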

Re: how to fetch qyery feild + other feild related results in Lucene3.6

2013-06-24 Thread Roberto Ragusa
On 06/24/2013 08:38 AM, neeraj shah wrote: > My hit size is 127674 and even if i comment the remedy fetching code ( the > second search in for loop) still its taking very long time. > This is the code which im using without Remedy fetching code : > > >for(int k=0;k Docume

Re: how to fetch qyery feild + other feild related results in Lucene3.6

2013-06-24 Thread neeraj shah
so how can i solve this and reduce time? On Mon, Jun 24, 2013 at 2:21 PM, Roberto Ragusa wrote: > On 06/24/2013 08:38 AM, neeraj shah wrote: > > My hit size is 127674 and even if i comment the remedy fetching code ( > the > > second search in for loop) still its taking very long time. > > This i

Re: how to fetch qyery feild + other feild related results in Lucene3.6

2013-06-24 Thread neeraj shah
this is the way im indexing the file: FileInputStream fr = new FileInputStream(file); BufferedInputStream bfr = new BufferedInputStream(fr); DataInputStream dbfr = new DataInputStream(bfr); while(dbfr.available()!=0){ String line = dbfr.readLine(); if(line!=null){
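A side note on the reading loop quoted above: DataInputStream.readLine() is deprecated in the JDK because it does not convert bytes to characters properly, and available() is not a reliable end-of-file test. A minimal sketch of the usual replacement is given below. The class name LineReaderSketch, the UTF-8 charset and the use of java.nio.file (Java 7 or later) are assumptions, and the indexing calls from the original message are left out.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reads a text file line by line with BufferedReader
// instead of DataInputStream.readLine().
final class LineReaderSketch {

    static List<String> readLines(Path file) throws IOException {
        List<String> lines = new ArrayList<>();
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            // readLine() returns null at end of file, so no available() check is needed
            while ((line = reader.readLine()) != null) {
                lines.add(line); // in the original loop each line is turned into a document here
            }
        }
        return lines;
    }
}
```

This only tidies up the file reading. It is not claimed to address the slow search over 127674 hits discussed elsewhere in the thread.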

DocIDBitSets & Grouping

2013-06-24 Thread Arun Kumar K
Hi Guys, I am using Lucene 4.2. 1> For my use case i am doing a search say name:xyz* and then i have a need to do a grouping with (from query same as name:xyz* + Filter + GroupSort) maybe in same/different thread. From my understanding the second internal search will be faster but i have good

RE: DocIDBitSets & Grouping

2013-06-24 Thread Uwe Schindler
Hi, > With prior warming i find that (a) & (b) take almost same time. I knew that > only when we reuse the Filter we get its benefits. > (c) takes around 30-40ms less time. > > Can we conclude from this that method (c) is better ? > Is my choice Bitset implementation appropriate ? Use FixedBitS
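The general idea behind keeping doc IDs in a fixed-size bit set and reusing them can be sketched with the JDK alone. Note the loud assumption: java.util.BitSet stands in here for Lucene's own bit set classes, the class and method names are hypothetical, and nothing below is taken from the methods (a), (b) and (c) discussed in the thread.

```java
import java.util.BitSet;

// Hypothetical sketch: one bit per document ID, filled once and reused.
// java.util.BitSet is a stand-in for a Lucene bit set, not the Lucene API.
final class DocIdBitSetSketch {

    // Marks every matching doc ID; maxDoc is the number of documents in the index.
    static BitSet collect(int maxDoc, int[] matchingDocIds) {
        BitSet bits = new BitSet(maxDoc);
        for (int docId : matchingDocIds) {
            bits.set(docId);
        }
        return bits;
    }

    // Intersects the cached hits with another filter without modifying the cache,
    // so the cached set can be reused by a later grouping pass.
    static BitSet restrict(BitSet cached, BitSet filter) {
        BitSet result = (BitSet) cached.clone();
        result.and(filter);
        return result;
    }
}
```

The benefit only shows up when the cached set is actually reused, which matches the remark in the quoted text that a Filter pays off only on reuse.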

Re: DocIDBitSets & Grouping

2013-06-24 Thread Arun Kumar K
Thanks Uwe ! For part (1) of my query are there any smart ways ? Arun On Mon, Jun 24, 2013 at 4:29 PM, Uwe Schindler wrote: > Hi, > > > > With prior warming i find that (a) & (b) take almost same time. I knew > that > > only when we reuse the Filter we get its benefits. > > (c) takes around 30

Re: Stored fields: decompression slows down in my scenario ... any idea for a workaround?

2013-06-24 Thread Mathias Lux
Hi! I'm basically in the midst of experiments. The idea with the BinaryDocValuesField worked great, it's blazing fast ;) Reading 49,904 documents, each with a 64 byte value got down to 33 ms from 774 (with StoredField). Also writing is much faster: 9ms vs. 22ms. Still, I've read that all the Binar

Re: Stored fields: decompression slows down in my scenario ... any idea for a workaround?

2013-06-24 Thread Adrien Grand
Hi, On Mon, Jun 24, 2013 at 2:47 PM, Mathias Lux wrote: > Still, I've read that all the BinaryDocValues go directly to memory. > Am I right with this? It is true that the current default implementation stores them in memory. However, disk doc values formats can be configured on a per-field basis

Re: Stored fields: decompression slows down in my scenario ... any idea for a workaround?

2013-06-24 Thread Mathias Lux
Hi! Thanks again for all the help. Seems like the field compression allows a huge step forward for my case. Here's some benchmarking for those of you interested: a document is * a StringField giving the actual image path * a single 64 byte feature (global OpponentHistogram) number of documents