I guess it would be quite different for different apps. For me, I do index updates on a single machine: each incoming document is indexed into exactly one chunk, chosen by some rule that ensures an even distribution. I then copy all the updated indexes to other machines for searching, and each machine reopens the updated index.
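The "some rule" above is not spelled out. One possible rule, shown here only as a sketch, is to hash a stable document id and take it modulo the number of chunks. The class name `ChunkRouter`, the method `chunkFor`, and the choice of CRC32 as the hash are all assumptions for illustration, not the setup actually used.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

/** Hypothetical helper: picks the chunk (sub-index) a document should be indexed into. */
public class ChunkRouter {
    private final int numChunks;

    public ChunkRouter(int numChunks) {
        if (numChunks <= 0) {
            throw new IllegalArgumentException("numChunks must be positive");
        }
        this.numChunks = numChunks;
    }

    /** Maps a document id to a chunk number in [0, numChunks). Same id, same chunk. */
    public int chunkFor(String docId) {
        CRC32 crc = new CRC32();
        crc.update(docId.getBytes(StandardCharsets.UTF_8));
        // getValue() is a non-negative long, so the remainder is non-negative too.
        return (int) (crc.getValue() % numChunks);
    }

    public static void main(String[] args) {
        // Assumed example: spread 10,000 made-up ids over 4 chunks and print the counts.
        ChunkRouter router = new ChunkRouter(4);
        int[] counts = new int[4];
        for (int i = 0; i < 10000; i++) {
            counts[router.chunkFor("doc-" + i)]++;
        }
        for (int c = 0; c < counts.length; c++) {
            System.out.println("chunk " + c + ": " + counts[c] + " docs");
        }
    }
}
```

Because the same id always maps to the same chunk, an update to an existing document goes to the chunk that already holds it. The trade-off is that changing the number of chunks remaps most documents, so a rule like this suits a fixed chunk count.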
For searching you can look at RemoteSearchable + ParallelMultiSearcher. But if you need redundancy, failover, etc., you will probably need to do it yourself.

Cedric

On Feb 11, 2008 11:14 AM, Briggs <[EMAIL PROTECTED]> wrote:
> So, I have a question about 'splitting indexes'. I see people say
> this all over, but how have people been handling this. I'm going to
> start a new thread, and there probably was one back in the day, but I
> am going to fire it up again. But, how did you do it?
>
> On Feb 10, 2008 9:18 PM, Cedric Ho <[EMAIL PROTECTED]> wrote:
> > Is it a single index? My index is also in the 200G range, but I never
> > managed to get a single index of size > 20G and still get acceptable
> > performance (in both searching and updating).
> > So I split my indexes into chunks of < 10G
> >
> > I am curious as to how you manage such a single large index.
> >
> > Cedric
> >
> > On Feb 8, 2008 11:51 PM, <[EMAIL PROTECTED]> wrote:
> > > Hi,
> > >
> > > I have a large index which is around 275GB. As I search different parts
> > > of the index, the memory footprint grows with large byte arrays being
> > > stored. They never seem to get unloaded or GC'ed. Is there any way to
> > > control this behavior so that I can periodically unload cached
> > > information?
> > >
> > > The nature of the data being indexed doesn't allow me to reduce the
> > > number of terms per field, although I might be able to reduce the number
> > > of overall fields (I have some which aren't currently being searched
> > > by).
> > >
> > > I've just begun investigating and profiling the problem, so I don't have
> > > a lot of details at this time. Any support would be extremely welcome.
> > >
> > > Thanks,
> > >
> > > Marc Dumontier
> > > Manager, Software Development
> > > Thomson Scientific (Canada)
> > > 1 Yonge Street, Suite 1801
> > > Toronto, Ontario M5E 1W7
> > >
> > > Direct +1 416 214 3448
> > > Mobile +1 416 454 3147
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > For additional commands, e-mail: [EMAIL PROTECTED]
>
> --
> "Conscious decisions by conscious minds are what make reality real"