many index reader problem

2012-07-15 Thread 齐保元
Hi, buddy: I have a problem concerning index readers: there are many small index/searcher instances in my application, which are held in a map. When a new index request or search request comes in, I process it and return the result. The problem is, when the number of small index beco

RE: In memory Lucene configuration

2012-07-15 Thread Doron Yaacoby
I haven't tried that yet, but it's an option. The reason I'm waiting on this is that I am expecting many concurrent requests to my application anyway, so having multiple search threads per request might not be the best idea in production. -Original Message- From: Vitaly Funstein [mailto

RE: In memory Lucene configuration

2012-07-15 Thread Doron Yaacoby
Another interesting fact I just found out. Up until now I measured query execution time via my application. Meaning, the application would log each query it sends to Lucene and the time it takes to run it. The nature of my application is that there will be a variable number of lucene queries per

Re: In memory Lucene configuration

2012-07-15 Thread Vitaly Funstein
Have you tried sharding your data? Since you have a fast multi-core box, why not split your indices N-ways, say the smaller one into 4, and the larger into 8. Then you can have a pool of dedicated search threads, executing the same query against separate physical indices within each "logical" one i
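
The sharding suggestion above can be sketched roughly as follows. This is only an illustration under stated assumptions: each physical shard is modeled as a plain function from a query to a list of comparable hits, rather than as a real Lucene searcher, and the names `ShardedSearch` and `searchAll` are hypothetical and do not come from the thread or from Lucene.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper: fans one query out to every shard of a "logical" index.
final class ShardedSearch {
    private ShardedSearch() {}

    /**
     * Runs the same query against each shard on the given pool, then merges
     * the per-shard hits and keeps the topN largest ones.
     * A "shard" here is an assumption: any function from query to hits.
     */
    static <Q, H extends Comparable<H>> List<H> searchAll(
            ExecutorService pool, List<Function<Q, List<H>>> shards, Q query, int topN) {
        // Submit one task per physical shard so they run concurrently.
        List<Future<List<H>>> pending = new ArrayList<>();
        for (Function<Q, List<H>> shard : shards) {
            Callable<List<H>> task = () -> shard.apply(query);
            pending.add(pool.submit(task));
        }
        // Collect the partial results as they complete.
        List<H> merged = new ArrayList<>();
        for (Future<List<H>> future : pending) {
            try {
                merged.addAll(future.get());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("search interrupted", e);
            } catch (ExecutionException e) {
                throw new IllegalStateException("shard search failed", e.getCause());
            }
        }
        // Best hits first, then cut to the requested size.
        Collections.sort(merged, Collections.reverseOrder());
        return new ArrayList<>(merged.subList(0, Math.min(topN, merged.size())));
    }
}
```

With a shared fixed-size pool (for example `Executors.newFixedThreadPool(n)`), the number of search threads stays bounded even when many requests arrive at once, which may matter given the concurrency concern raised elsewhere in this thread.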

Re: about some date store

2012-07-15 Thread sam
I had done that. I used document.add(new field("content",content,field.store.yes,field.analyzer.yes)); and I used a while loop to set the content, like while((str=reader.readline)!=null), but when I call document.get("content"), I can only get the first line. -- View this message in context: http:/
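
One possible cause, assuming a separate "content" field is added for each line read: Lucene's `Document.get(name)` returns only the first value added under that name, while `Document.getValues(name)` returns all of them. A minimal sketch of the alternative, which collects every line first and then stores a single value, is shown below. The class and method names (`ContentReader`, `readAll`) are hypothetical, and the Lucene call appears only as a comment because the poster's exact field setup is not visible in the message.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper: gathers the whole input before it is stored in a field.
final class ContentReader {
    private ContentReader() {}

    /** Reads every line from the reader and joins them with '\n'. */
    static String readAll(BufferedReader reader) {
        StringBuilder content = new StringBuilder();
        try {
            String line;
            // Append each line instead of overwriting a single variable.
            while ((line = reader.readLine()) != null) {
                content.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return content.toString();
    }
}

// Assumed usage: add ONE stored field holding the complete text, e.g.
//   String content = ContentReader.readAll(reader);
//   ...then pass `content` to a single stored "content" field on the document.
```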

Re: from docID to terms enumerator in O(1) ?

2012-07-15 Thread Giovanni Gherdovich
2012/7/15 Uwe Schindler : > Enable term vectors while indexing and use the TermVector API. > Thank you very much Uwe! I just got back to chapter 2 of "Lucene in Action", where it says "If it’s indexed, the field may also optionally store term vectors, which are collectively a miniature inverted

RE: from docID to terms enumerator in O(1) ?

2012-07-15 Thread Uwe Schindler
Enable term vectors while indexing and use the TermVector API. - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Giovanni Gherdovich [mailto:g.gherdov...@gmail.com] > Sent: Sunday, July 15, 2012 5:57 PM >

from docID to terms enumerator in O(1) ?

2012-07-15 Thread Giovanni Gherdovich
Hi all, I'd like to know if I can get the list of indexed terms in a document from its document ID in constant time (say, in a time independent of the size of the index). The reason why I ask might be relevant (you could suggest a totally different way for me to achieve my goal). I want to present t

RE: In memory Lucene configuration

2012-07-15 Thread Doron Yaacoby
Thanks for the quick input! I ran a few more tests with your suggested configuration (-Xmx1G -Xms1G with MMapDirectory). The third time I ran the same test I finally got an improvement - an average of ~30ms per query, although it's still not as fast as I need it to be. The test contains abou

Re: In memory Lucene configuration

2012-07-15 Thread Simon Willnauer
hey there, On Sun, Jul 15, 2012 at 10:41 AM, Doron Yaacoby wrote: > Hi, I have the following situation: > > I have two pretty large indices. One consists of about 1 billion documents > (takes ~6GB on disk) and the other has about 2 billion documents (~10GB on > disk). The documents are very sho

In memory Lucene configuration

2012-07-15 Thread Doron Yaacoby
Hi, I have the following situation: I have two pretty large indices. One consists of about 1 billion documents (takes ~6GB on disk) and the other has about 2 billion documents (~10GB on disk). The documents are very short (4-5 terms each in the text field, and one numeric field with a long valu