Combining results of multiple indexes

2008-12-16 Thread Preetham Kajekar
Hi, I am new to Lucene. I am not using it as a pure text indexer. I am trying to index a Java object which has about 10 fields (like id, time, srcIp, dstIp) - most of them being numerical values. In order to speed up indexing, I figured that having two separate indexers, each of them indexing d

Re: Combining results of multiple indexes

2008-12-17 Thread Preetham Kajekar
Hi Grant, Thanks for your response. Replies inline. Grant Ingersoll wrote: On Dec 17, 2008, at 12:57 AM, Preetham Kajekar wrote: Hi, I am new to Lucene. I am not using it as a pure text indexer. I am trying to index a Java object which has about 10 fields (like id, time, srcIp, dstIp) - most

Re: Combining results of multiple indexes

2008-12-17 Thread Preetham Kajekar
} Thanks for the support. ~preetham Best Erick On Wed, Dec 17, 2008 at 9:40 AM, Preetham Kajekar wrote: Hi Grant, Thanks for your response. Replies inline. Grant Ingersoll wrote: On Dec 17, 2008, at 12:57 AM, Preetham Kajekar wrote: Hi, I am new to Lucene. I am no

Re: Combining results of multiple indexes

2008-12-17 Thread Preetham Kajekar
h to make that reasonable. Combining indexes may take a while though. Best Erick On Wed, Dec 17, 2008 at 10:46 AM, Preetham Kajekar wrote: Hi Erick, Thanks for the response. Replies inline. Erick Erickson wrote: The very first question is always "are you opening a new searcher ea

Re: Combining results of multiple indexes

2008-12-18 Thread Preetham Kajekar
All Fields (9) using 1 IndexWriter, 2 Threads - 29,000 objects per sec. All Fields (9) using 2 IndexWriters, 2 Threads - 55,000 objects per sec. So, it looks like I will have to figure out how to combine results of multiple indexes. Thanks, ~preetham Preetham Kajekar wrote: Thanks Erick and Michael. I will

Re: Combining results of multiple indexes

2008-12-18 Thread Preetham Kajekar
dozen lines (and that only if you are merging 6 or so indexes) See IndexWriter.addIndexes or IndexWriter.addIndexesNoOptimize Best Erick On Thu, Dec 18, 2008 at 5:03 AM, Preetham Kajekar wrote: Hi, I tried out a single IndexWriter used by two threads to index different fields. It is

Re: Combining results of multiple indexes

2008-12-18 Thread Preetham Kajekar
something undocumented of Lucene. Thanks, ~preetham Preetham Kajekar wrote: Thanks. Yep, the code is very easy. However, it takes about 3 minutes to complete merging. Looks like I will need to have an out-of-band merging of indexes once they are closed (planning to store about 50 million entries in each index

Re: Combining results of multiple indexes

2009-01-22 Thread Preetham Kajekar
the number of CPUs. So while querying, I will use all these indexes to get matches. What do you think about this? Will querying etc. be considerably slower? Thanks, ~preetham Preetham Kajekar wrote: Hi, I noticed that the doc id is the same. So, if I have a HitCollector, just collect the do

MultiSearcher query with Sort option

2009-04-10 Thread Preetham Kajekar
Hi, I am using a MultiSearcher to search 2 indexes. As part of my query, I am sorting the results based on a field (which is NOT_ANALYZED). However, I seem to be getting hits only from one of the indexes. If I change to Sort.INDEX_ORDER, I seem to be getting results from both. Is this a known p

Re: MultiSearcher query with Sort option

2009-04-10 Thread Preetham Kajekar
- From: Preetham Kajekar [mailto:preet...@cisco.com] Sent: Friday, April 10, 2009 11:27 AM To: java-user@lucene.apache.org Subject: Re: MultiSearcher query with Sort option Hi, I just realized it was a bug in my code. On a related note, is it possible to sort based on reverse index order? Thanks

Re: MultiSearcher query with Sort option

2009-04-10 Thread Preetham Kajekar
is wrong. I always recommend to only use MultiSearcher in distributed or parallel search scenarios, never for just combining two indexes. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Preetham

Re: MultiSearcher query with Sort option

2009-04-10 Thread Preetham Kajekar
Hi, I found the API in another post on the net. new Sort(new SortField(null, SortField.DOC, true)) The trick is to set the field to null. Thanks for the help. Preetham Kajekar wrote: Hi Uwe, Thanks for your response. However, I could not find the API in SortField and Sort to achieve this

Getting Top n term for a given field for a given time period

2009-04-21 Thread Preetham Kajekar
Hi, I have a Lucene index which has 20 million documents. Each document has a timestamp field and a source field. I am interested in finding the top n sources for a given hour (based on the timestamp). I know we can get the top n source fields easily using the IndexReader API, but was wondering
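
The counting step behind a "top n sources for a given hour" question can be sketched in plain Java, independent of Lucene. This is only an illustration under stated assumptions: the class TopSources, the method topN, and the parallel-array input are hypothetical names chosen for the sketch, and timestamps are assumed to be epoch milliseconds. In practice the (timestamp, source) pairs would come from whatever collects the matching documents.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Lucene: counts the "source" values of
// documents whose timestamp falls inside one hour and returns the n most
// frequent sources.
class TopSources {

    static final long HOUR_MILLIS = 60L * 60L * 1000L;

    // timestamps[i] and sources[i] describe the same document (assumption).
    // hourStart is the inclusive start of the hour, in epoch milliseconds.
    static List<String> topN(long[] timestamps, String[] sources, long hourStart, int n) {
        // Count occurrences of each source inside [hourStart, hourStart + 1h).
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i < timestamps.length; i++) {
            long t = timestamps[i];
            if (t >= hourStart && t < hourStart + HOUR_MILLIS) {
                Integer old = counts.get(sources[i]);
                counts.put(sources[i], old == null ? 1 : old + 1);
            }
        }

        // Order by count (highest first); break ties by source name so the
        // result is deterministic.
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = b.getValue().compareTo(a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });

        // Keep at most n source names.
        List<String> top = new ArrayList<>();
        for (int i = 0; i < entries.size() && i < n; i++) {
            top.add(entries.get(i).getKey());
        }
        return top;
    }
}
```

Sorting every distinct source is fine when the number of distinct sources is modest; with very many distinct values a bounded priority queue of size n would avoid the full sort.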

Query rewriting/optimization

2009-05-21 Thread Preetham Kajekar
Hi, I am wondering if Lucene internally rewrites/optimizes Query. I am programmatically generating Query based on various user options, and quite often I have BooleanQueries wrapped inside BooleanQueries etc. Like, ((Src:Testing Dst:Test) (Src:Test2 Port:http)). In this case, would Lucene optim

Re: Query rewriting/optimization

2009-05-21 Thread Preetham Kajekar
hould test adding it as a clause on BooleanQuery instead of passing in the "Filter" arg to search (we are considering doing that internally). If you do some testing and learn anything interesting, please post back! Mike On Thu, May 21, 2009 at 1:06 PM, Preetham Kajekar wrote: Hi, I am

Re: Most frequently indexed term

2009-05-26 Thread Preetham Kajekar
Have a look at http://stackoverflow.com/questions/195434/how-can-i-get-top-terms-for-a-subset-of-documents-in-a-lucene-index (I have not tried the above out) Ganesh wrote: Hello All, I need to build some stats. I need to know Top 5 frequently indexed term in a date range (In a day or a Month)

Re: Top N Phrases in subset of documents

2009-05-27 Thread Preetham Kajekar
http://stackoverflow.com/questions/195434/how-can-i-get-top-terms-for-a-subset-of-documents-in-a-lucene-index tomm...@aim.com wrote: Hi All, I need to determine top words/phrases in my documents, and currently using the ShingleAnalyzerWrapper for indexing. Through Luke it seems the top terms

RE: Combining results of multiple indexes

2008-12-22 Thread Preetham Kajekar (preetham)
x and then skipped it in the other, all subsequent document IDs would not match. The fact that your IDs are the same is more than undocumented, it is coincidental. Best Erick On Thu, Dec 18, 2008 at 11:46 AM, Preetham Kajekar wrote: > Hi, > I noticed that the doc id is the same. So
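
Since matching internal document IDs across two indexes is described above as coincidental, one alternative is to combine results on an application-level key instead. The following is a minimal plain-Java sketch, not code from the thread: it assumes each indexed object carries its own unique id (the original post mentions an id field) and that the matching ids have already been read from each index. The names IdJoin and inBoth are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Lucene: combines hits from two indexes by
// an application-level id stored with each document, rather than by Lucene's
// internal document number.
class IdJoin {

    // Returns the ids present in both hit lists, in the order of the first
    // list. Assumes ids are unique within each list.
    static List<Long> inBoth(List<Long> idsFromIndexA, List<Long> idsFromIndexB) {
        Set<Long> seenInB = new HashSet<>(idsFromIndexB);
        List<Long> both = new ArrayList<>();
        for (Long id : idsFromIndexA) {
            if (seenInB.contains(id)) {
                both.add(id);
            }
        }
        return both;
    }
}
```

This corresponds to an AND across the two indexes; an OR would be the union of the two id lists instead.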