Thanks for the reply. I'm indexing emails, and the fields are the common email attributes: subject, content, attachment, message size, date, sender, recipients, etc. The index is a few GB. Is there a good practice for keeping the index file size at a certain level? When I run a search on the date field that should retrieve a lot of records, it usually throws the exception. I will look at MultiSearcher. Do you think splitting the index based on the date field is a good choice? My feeling is that it would take a lot of coding to create and manage many indexes partitioned by date.

Thanks,
Jeff
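For what it's worth, the search side of a date-based split can stay quite small. Here is a minimal sketch of searching several date-partitioned indexes through MultiSearcher, assuming a 1.4-era Lucene API; the per-month index paths and the "date" field indexed as sortable yyyyMMdd keyword strings are hypothetical, not anything confirmed in this thread:

import org.apache.lucene.index.Term;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.MultiSearcher;
import org.apache.lucene.search.RangeQuery;
import org.apache.lucene.search.Searchable;

public class DatePartitionedSearch {
    public static void main(String[] args) throws Exception {
        // Hypothetical layout: one sub-index per month under /indexes.
        Searchable[] monthly = new Searchable[] {
            new IndexSearcher("/indexes/2005-10"),
            new IndexSearcher("/indexes/2005-11"),
            new IndexSearcher("/indexes/2005-12"),
        };
        MultiSearcher searcher = new MultiSearcher(monthly);

        // Range over the "date" field, assumed indexed as yyyyMMdd keywords
        // so that lexicographic order matches chronological order.
        RangeQuery query = new RangeQuery(
            new Term("date", "20051201"),
            new Term("date", "20051216"),
            true); // inclusive bounds

        Hits hits = searcher.search(query);
        System.out.println("hits: " + hits.length());
        searcher.close();
    }
}

A side benefit: since a date-range query can only match in the months it overlaps, you could open IndexSearchers for just those sub-indexes instead of all of them, which keeps the working set small.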
-----Original Message-----
From: Dan Funk
Sent: Fri 12/16/2005 7:12 PM
To: java-user@lucene.apache.org
Cc:
Subject: Re: best strategy to deal with large index file

Are there specific queries that cause the out-of-memory problem, or will any query do it? How large is the index?

MultiSearcher allows you to search over multiple indexes and is well supported throughout the API. How you split your indexes depends on what you want to achieve.

There are many people here on the list developing indexes for large data sets. Please be a little more specific about what you are indexing and how you are searching.

On 12/16/05, Jeff Liang <[EMAIL PROTECTED]> wrote:
>
> Hi all,
>
> My index file is huge because of a large data set. When I search, I
> sometimes get an OutOfMemory exception. I don't know what usually causes
> it: is it during the search because the index file is too big, or because
> there are too many hits? The exception happens when I call
> IndexSearcher.search().
> It's also bad for backup, because I can't do an incremental backup after
> adding new documents; I only have one big index file.
>
> What's the best strategy to deal with a large index file? What's a good
> way to split the index file?
> I start the JVM with 800MB.
>
> Thanks,
>
> Jeff
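On the writing side, routing each message to a per-month index is also little code, and it helps with the backup concern quoted above: once a month's partition stops receiving documents, its directory no longer changes, so incremental backup reduces to copying the current month. A sketch under the same assumptions (1.4-era Lucene API; the paths, field names, and helper method are hypothetical):

import java.io.File;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;

public class MonthlyMailIndexer {

    // Hypothetical helper: writes one email into the index for its month,
    // e.g. /indexes/2005-12 for a message dated 2005-12-16.
    static void indexEmail(String subject, String content,
                           String yyyyMM, String yyyyMMdd) throws Exception {
        File dir = new File("/indexes/" + yyyyMM);
        boolean create = !dir.exists();           // create the partition on first use
        IndexWriter writer =
            new IndexWriter(dir.getPath(), new StandardAnalyzer(), create);
        Document doc = new Document();
        doc.add(Field.Keyword("date", yyyyMMdd)); // untokenized, sorts as a string
        doc.add(Field.Text("subject", subject));  // tokenized full-text fields
        doc.add(Field.Text("content", content));
        writer.addDocument(doc);
        writer.close(); // for brevity; real code would batch and close once
    }
}

In practice you would keep one IndexWriter open per partition and add documents in batches rather than opening and closing it per message, as this sketch does for simplicity.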