I am sorry, it seems that I was not clear with what my problem is. I will try to describe it again.
My data is divided into 40 categories and at one time only one category can be searched. The GUI for the system will ask the user to select the category from a drop-down. Currently, I have a separate index for every category. The index sizes varies - one category index is 10MB and another is 700MB. Other index-sizes are somewhere in between. I was wondering if it will be better to just have 1 large index with all the 40 indices combined. I do not need to do dual-queries and my total index size (if I create a single index) is about 3.4GB. It will increase to maximum of 5-6 GB. I am running this on a dedicated machine with 8GB RAM. Unfortunately I do not have enough hardware to run both in parallel and test properly. Have just one server which is being used by live users. So it would be great if you could tell me whether I should stick with my 40 indices or combine them into 1 index. What are the pros and cons of each approach ? Thanks, Nikhil ----- Original Message ---- From: Grant Ingersoll <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Thursday, 20 September, 2007 7:57:21 PM Subject: Re: Multiple Indices vs Single Index If I understand correctly, you want to do a two stage retrieval right? That is, look up in the initial index (3.4 GB) and then do a second search on the sub index? Presumably, you have to manage the Searchers, etc. for each of the sub-indexes as well as the big index. This means you have to go through the hits from the first search, then route, etc. correct? Have you tried creating one single index with all the (stored) fields, etc? Worst case scenario, assuming 1GB per index, is you would have a 40GB index, but my guess is index compression will reduce it more. Since you are less than that anyway, have you tried just the straightforward solution? Or do you have other requirements that force the sub-index solution? Also, I am not sure it will work, but it seems worth a try. Of course, this also depends on how much you expect your indexes to grow. Also, what was inconclusive about your tests? Maybe you can describe more what you have tried to date? Cheers, Grant On Sep 20, 2007, at 3:50 AM, Nikhil Chhaochharia wrote: > Hi, > > I have about 40 indices which range in size from 10MB to 700MB. > There are quite a few stored fields. To get an idea of the > document size, I have about 400k documents in the 700MB index. > > Depending on the query, I choose the index which needs to be > searched. Each query hits only one index. I was wondering if > creating a single index where every document will have the > indexname as a field will be more efficient. I created such an > index and it was 3.4 GB in size. My initial performance tests with > it are not conclusive. > > Also, what are the other points to be addressed while deciding > between 1 index and 40 indices. > > I have 8GB RAM on the machine. > > > Thanks, > Nikhil > > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > -------------------------- Grant Ingersoll http://lucene.grantingersoll.com Lucene Helpful Hints: http://wiki.apache.org/lucene-java/BasicsOfPerformance http://wiki.apache.org/lucene-java/LuceneFAQ --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]