So you're saying searches are taking 10 seconds on a 5 GB index? If so, that seems ungodly slow. If you're on *nix, have you watched your iostat statistics? Maybe something is hammering your hard drives. Something seems amiss.
What Lucene methods were pointed to as hotspots by YourKit?

-M

On Tue, Feb 26, 2008 at 2:13 PM, Jamie <[EMAIL PROTECTED]> wrote:
> Hi Michael
>
> Perhaps this will help. We are using Lucene to index emails and provide
> a search interface to search through those emails. Many of our customers
> have 3-5 TB or more of email data. The index size tends to be around 5
> GB per million messages. On a 3 GHz Intel Core Duo with a standard 7200 RPM
> drive, it takes approx. 10 seconds to search across a million emails. We
> need sub-second search times, especially since, as time progresses, some
> of our archives are expected to reach 10-20 TB of data. In future, we
> will be recommending the use of SSD drives, but I'd like to know if there
> are any other strategies that can be pursued. One such strategy is to
> automatically create a new index after the index gets to a certain size.
> Then, when a search is conducted, based on date, search only those
> indexes that fall between specified dates. I've run my code through the
> YourKit profiler. The time appears to be consumed by Lucene itself and
> not by my code.
>
> Any other ideas?
>
> Michael Stoppelman wrote:
> > On Tue, Feb 26, 2008 at 10:18 AM, Jamie <[EMAIL PROTECTED]> wrote:
> >
> >> Hi
> >>
> >> I am looking for a way to improve the search performance of my
> >> application. I've followed every suggestion in the Lucene Wiki but the
> >> search is still too slow with large indexes. I was wondering whether
> >>
> >
> > Did you optimize your index yet? That gave me a 2x bump.
> >
> > Have you put timers around parts of your code? Maybe it's something
> > unrelated to Lucene.
> > You should probably give more details on your setup if you want more
> > helpful advice.
> >
> >> there was a way to restrict a search to a specific time period and in
> >> doing so sacrifice the quality of search results? Any other suggestions
> >> on how to improve search performance?
> >>
> >> Much appreciated
> >>
> >> Jamie
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: [EMAIL PROTECTED]
> >> For additional commands, e-mail: [EMAIL PROTECTED]
> >
> --
> Stimulus Software - MailArchiva
> Email Archiving And Compliance
> USA Tel: +1-713-366-8072 ext 3
> UK Tel: +44-20-80991035 ext 3
> Email: [EMAIL PROTECTED]
> Web: http://www.mailarchiva.com
>
> To receive MailArchiva Enterprise Edition product announcements, send a
> message to: <[EMAIL PROTECTED]>
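
For what it's worth, the date-partitioning strategy described in the quoted message (roll over to a new index once it reaches a certain size, then search only the indexes whose dates fall within the query range) might be sketched roughly as below. This is only an illustration under assumptions: the class name, the Partition type, and the directory names are all hypothetical, and no Lucene calls are made. It shows just the step of choosing which index directories to open.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: picks which date-partitioned indexes a query must touch. */
public class IndexPartitionSelector {

    /** One index directory covering a closed date range (assumed layout, not from the thread). */
    public static final class Partition {
        public final String path;
        public final LocalDate first; // date of the earliest message in this index
        public final LocalDate last;  // date of the latest message in this index

        public Partition(String path, LocalDate first, LocalDate last) {
            if (first.isAfter(last)) {
                throw new IllegalArgumentException("first must not be after last");
            }
            this.path = path;
            this.first = first;
            this.last = last;
        }
    }

    /** Returns the partitions whose date range overlaps the closed interval [from, to]. */
    public static List<Partition> select(List<Partition> all, LocalDate from, LocalDate to) {
        List<Partition> hits = new ArrayList<>();
        for (Partition p : all) {
            // Two closed intervals overlap unless one ends before the other begins.
            boolean overlaps = !p.last.isBefore(from) && !p.first.isAfter(to);
            if (overlaps) {
                hits.add(p);
            }
        }
        return hits;
    }
}
```

Each selected path would then be opened and searched on its own, with the results merged. Lucene releases of that era shipped a MultiSearcher class that could be used for the merging step, though whether it suits this workload is untested here.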