Re: DocIDBitSets & Grouping

2013-06-24 Thread Arun Kumar K
Thanks Uwe! For part (1) of my query, are there any smart ways? Arun On Mon, Jun 24, 2013 at 4:29 PM, Uwe Schindler wrote: > Hi, > > > > With prior warming I find that (a) & (b) take almost the same time. I knew > that > > only when we reuse the Filter do we get its benefits. > > (c) takes around 30

DocIDBitSets & Grouping

2013-06-24 Thread Arun Kumar K
Hi Guys, I am using Lucene 4.2. 1> For my use case I am doing a search, say name:xyz*, and then I need to do a grouping (with the same query as name:xyz* + Filter + GroupSort), maybe in the same or a different thread. From my understanding the second internal search will be faster, but I have good

Re: FieldCache & DocValues Filter

2013-06-06 Thread Arun Kumar K
Hi, Thanks Robert! This info is exactly what I need. Just to be clear: if the field is a DocValues field, the FieldCacheTermsFilter will use the existing DocValues field. For normal fields, the filter will create DocValues for that field using the FieldCache. Arun On Thu, Jun 6, 2013

FieldCache & DocValues Filter

2013-06-06 Thread Arun Kumar K
Hi Guys, I was trying to improve the filtering mechanism for my use case. When I use the existing filters like FieldCacheTermsFilter and TermsFilter, I see that the first filtering takes considerable time, maybe for building the FieldCache. Subsequent filters are fast enough. Currently, I am using CachingW

Re: Lucene 4.2 Doc Vals

2013-06-04 Thread Arun Kumar K
inefficient for random lookup. You should do a binary > search to find the right leaf. CompositeReader and ReaderUtil have utility > methods to do this. > > Uwe > > > > Arun Kumar K wrote: > >Hi Guys, > > > >I am trying to get hands on Lucene 4.2 Doc Va
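
The binary search Uwe describes can be sketched in plain Java. This is a minimal illustration, not Lucene's own code: the class name LeafLookup, the method names, and the docBases array (each leaf's starting global doc ID, in ascending order) are hypothetical names introduced here. In Lucene 4.x the equivalent lookup is, as far as I know, provided by ReaderUtil.subIndex, which should be preferred in real code.

```java
public final class LeafLookup {
    private LeafLookup() {}

    /**
     * Returns the index of the leaf that contains globalDocId.
     * docBases[i] is the first global doc ID of leaf i (hypothetical input,
     * ascending, with docBases[0] == 0). Picks the last leaf whose base is
     * <= globalDocId, so empty leaves sharing a base are skipped.
     */
    public static int leafIndex(int globalDocId, int[] docBases) {
        int lo = 0;
        int hi = docBases.length - 1;
        while (lo < hi) {
            int mid = (lo + hi + 1) >>> 1; // round up so the loop always makes progress
            if (docBases[mid] <= globalDocId) {
                lo = mid;       // leaf mid starts at or before the doc: answer is mid or later
            } else {
                hi = mid - 1;   // leaf mid starts after the doc: answer is earlier
            }
        }
        return lo;
    }

    /** Converts a global doc ID into the doc ID local to its leaf. */
    public static int localDocId(int globalDocId, int[] docBases) {
        return globalDocId - docBases[leafIndex(globalDocId, docBases)];
    }
}
```

With leaves starting at 0, 100 and 250, global doc 120 falls in leaf 1 with local doc ID 20. Per-segment DocValues are addressed by that local ID, not by the global one.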

Lucene 4.2 Doc Vals

2013-06-04 Thread Arun Kumar K
Hi Guys, I am trying to get hands-on with Lucene 4.2 Doc Values (RAM-based, which is the default). I have a 1GB index with 54 documents. When retrieving the DocVals for matched docs I am able to retrieve values only up to some limit, around 45000 docvals. for (AtomicReaderContext context : reader

Re: Lucene 4.2 DocValues

2013-05-28 Thread Arun Kumar K
Adrien, Thanks for spending time to explain things to me clearly. I understand things correctly now. Thanks, Arun On 29-May-2013, at 2:13 AM, Adrien Grand wrote: > On Tue, May 28, 2013 at 8:55 PM, Arun Kumar K wrote: >> Thanks for clarifying the things. >> I have some d

Re: Lucene 4.2 DocValues

2013-05-28 Thread Arun Kumar K
I right here? Thanks, Arun On 28-May-2013, at 8:31 PM, Adrien Grand wrote: > On Tue, May 28, 2013 at 4:48 PM, Arun Kumar K wrote: >> Hi Guys, > > Hi, > >> I have been trying to understand DocValues and get some hands on and have >> observed few things.

Lucene 4.2 DocValues

2013-05-28 Thread Arun Kumar K
Hi Guys, I have been trying to understand DocValues and get some hands-on experience, and have observed a few things. I have added a LongDocValuesField to the documents like: doc.add(new LongDocValuesField("id",1)); 1> In 4.0 I saw that there are two versions for docvalues, RAM Resident(using Sources.getSO

Re: WildCardQuery: TooManyClauses Exception

2013-04-18 Thread Arun Kumar K
> - > Uwe Schindler > H.-H.-Meier-Allee 63, D-28213 Bremen > http://www.thetaphi.de > eMail: u...@thetaphi.de > > > > -----Original Message- > > From: Arun Kumar K [mailto:arunk...@gmail.com] > > Sent: Thursday, April 18, 2013 12:41 PM > > To: java-

WildCardQuery: TooManyClauses Exception

2013-04-18 Thread Arun Kumar K
Hi Guys, I am using the following queries: 1>WildCardQuery 2>BooleanQuery having a WildCardQuery and TermQuery. The WildCardQuery is field:* or, say, field:ab*. From the Lucene FAQs and earlier discussions about TooManyClausesException I see that WildCardQuery gets expanded before doing the search. For that I was

Re: "4.1 consuming more memory than 3.0.2 while Indexing"

2013-04-01 Thread Arun Kumar K
ram you are setting on the > index writer config? > > also how many threads are you using for indexing? > > simon > > On Mon, Apr 1, 2013 at 2:21 PM, Arun Kumar K wrote: > > Hi Adrien, > > > > I have seen memory usage using linux command top for RES memory &

Re: "4.1 consuming more memory than 3.0.2 while Indexing"

2013-04-01 Thread Arun Kumar K
M, Adrien Grand wrote: > On Mon, Apr 1, 2013 at 1:56 PM, Arun Kumar K wrote: > > Hi Guys, > > Hi, > > > I have been finding out the heap space requirement for indexing and > > searching with 3.0.2 vs 4.1 (with BlockPostings Format). > > > > I have a 2GB inde

"4.1 consuming more memory than 3.0.2 while Indexing"

2013-04-01 Thread Arun Kumar K
Hi Guys, I have been measuring the heap space requirement for indexing and searching with 3.0.2 vs 4.1 (with BlockPostings Format). I have a 2GB index with 1 million docs with around 42 fields, 40 of them being random strings. I have seen that memory for search has reduced by 5X with 4.1 (w

Re: Wild Card Query Performance

2013-03-29 Thread Arun Kumar K
ngeQuery(20130101, 20130131)? Another approach for improving > prefix queries is indexing additional terms: If you are always searching > for a 2-char prefix for "ab*", then simply index an additional term in a > separate field with 2 chars (e.g., "ab") in your documents
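
The prefix-term suggestion above can be illustrated with a small plain-Java sketch. The class and method names (PrefixTerms, prefixTerm) are hypothetical and nothing here is a Lucene API: it only shows how the extra fixed-length term would be derived from a field value at index time. Storing it in a separate field and searching it with an exact term query instead of "ab*" is assumed to happen in the indexing and search code.

```java
import java.util.Optional;

public final class PrefixTerms {
    private PrefixTerms() {}

    /**
     * Derives the fixed-length prefix to index as an additional term.
     * Returns empty when the value is shorter than the prefix length,
     * since such a value could never match a prefix query of that length.
     */
    public static Optional<String> prefixTerm(String value, int prefixLength) {
        if (prefixLength <= 0) {
            throw new IllegalArgumentException("prefixLength must be positive");
        }
        if (value == null || value.length() < prefixLength) {
            return Optional.empty();
        }
        return Optional.of(value.substring(0, prefixLength));
    }
}
```

For the value "abcdef" and a prefix length of 2 this yields the term "ab", so a search for the 2-char prefix becomes a single exact-term lookup rather than an expansion over every term starting with "ab".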

Re: Wild Card Query Performance

2013-03-29 Thread Arun Kumar K
dler > H.-H.-Meier-Allee 63, D-28213 Bremen > http://www.thetaphi.de > eMail: u...@thetaphi.de > > > > -Original Message- > > From: Arun Kumar K [mailto:arunk...@gmail.com] > > Sent: Friday, March 29, 2013 10:38 AM > > To: java-user > > Subject: Wi

Wild Card Query Performance

2013-03-29 Thread Arun Kumar K
Hi Guys, I have been testing the search time improvement from Lucene 3.0.2 to Lucene 4.0 for Wildcard Queries (with at least, say, 2 chars, e.g. ar*). For a 2GB index with 400 docs, the following observations were made: Around 3X improvement with and without STRING sort on a sortable