Re: Use multiple lucene indices

2011-12-07 Thread Francisco A. Lozano
I have that use case too: lots of indexes, and each request is handled by exactly one well-known index. It's working very well for us (but our indexes are *small*: 1k-10k entries). What we do is open/close the index reader/writer each time it's needed, and reuse it if two requests need to access the
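
The open-on-demand, share-while-in-use approach described above could be sketched roughly as below. The message is cut off, so this assumes "reuse" means that two overlapping requests for the same index share one open handle, and that the handle is closed when the last request finishes. `IndexHandle` and `RefCountedIndexCache` are hypothetical names made up for illustration; `IndexHandle` is a stand-in, not a Lucene class, and a real version would wrap whatever reader/writer objects the application opens.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical stand-in for an opened index (reader/writer); not a Lucene type. */
final class IndexHandle implements AutoCloseable {
    final String path;
    boolean open = true;

    IndexHandle(String path) { this.path = path; }

    @Override
    public void close() { open = false; }
}

/** Opens an index on first use, shares it between overlapping requests, closes it after the last one. */
final class RefCountedIndexCache {
    private static final class Entry {
        final IndexHandle handle;
        int users; // number of requests currently using this handle

        Entry(IndexHandle handle) { this.handle = handle; }
    }

    private final Map<String, Entry> openIndexes = new HashMap<>();
    private final Function<String, IndexHandle> opener;

    RefCountedIndexCache(Function<String, IndexHandle> opener) {
        this.opener = opener;
    }

    /** Returns the already-open handle for this path, or opens a new one. */
    synchronized IndexHandle acquire(String path) {
        Entry entry = openIndexes.get(path);
        if (entry == null) {
            entry = new Entry(opener.apply(path));
            openIndexes.put(path, entry);
        }
        entry.users++;
        return entry.handle;
    }

    /** Marks one request as finished; closes the handle when nobody is using it any more. */
    synchronized void release(String path) {
        Entry entry = openIndexes.get(path);
        if (entry == null) {
            throw new IllegalStateException("index not open: " + path);
        }
        if (--entry.users == 0) {
            openIndexes.remove(path);
            entry.handle.close();
        }
    }

    /** Number of indexes currently held open. */
    synchronized int openCount() {
        return openIndexes.size();
    }
}
```

A request would call `acquire(path)` before searching and `release(path)` in a `finally` block afterwards, so the number of open indexes never exceeds the number of indexes with requests in flight.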

Re: Use multiple lucene indices

2011-12-07 Thread Rui Wang
Hi Danil, Thank you for answering once again. You are right that we always know the file we are searching; the file location is stored in a database. Having done some testing, it seems to me that using one index per file yields reasonable performance, just as you suggested. For a 500K docs/index, I

Re: Use multiple lucene indices

2011-12-06 Thread Danil ŢORIN
10B documents is a lot of data. Index/file won't scale: you will not be able to open all the indexes at the same time (file handle limits, memory limits, etc.), and if you search through them sequentially, it will take a lot of time. Unless in your use case you always know the file you are sear

Re: Use multiple lucene indices

2011-12-06 Thread Rui Wang
Hi Danil, Thank you for your suggestions. We will have approximately half a million documents per file, so using your calculation, 20,000 files * 500,000 = 10,000,000,000. And we are likely to get more files in the future, so a scalable solution is most desirable. The document IDs are not uniq

Re: Use multiple lucene indices

2011-12-06 Thread Danil ŢORIN
How many documents are there in the system? Approximate it by: 20,000 files * avg(docs/file). From my understanding, your queries will just be lookups for a document ID (Q: are those IDs unique between files, or do you need to filter by filename?). If that will be the only use case, then maybe you should

Re: Use multiple lucene indices

2011-12-06 Thread Rui Wang
Hi Guys, Thank you very much for your answers. I will do some profiling on memory usage, but is there any documentation on how Lucene uses/allocates the memory? Best wishes, Rui Wang On 6 Dec 2011, at 06:11, KARTHIK SHIVAKUMAR wrote: > hi > >>> would the memory usage go through the roof?

Re: Use multiple lucene indices

2011-12-05 Thread KARTHIK SHIVAKUMAR
hi >> would the memory usage go through the roof? Yup. My past experience got me into a pickle there... with regards karthik On Mon, Dec 5, 2011 at 11:28 PM, Rui Wang wrote: > Hi All, > > We are planning to use lucene in our project, but not entirely sure about > some of the design decis

Use multiple lucene indices

2011-12-05 Thread Rui Wang
Hi All, We are planning to use Lucene in our project, but are not entirely sure about some of the design decisions we made. Below are the details; any comments/suggestions are more than welcome. The requirements of the project are below: 1. We have tens of thousands of files, their size rangi