So you're saying searches are taking 10 seconds on a 5G index? If so that
seems ungodly slow.
If you're on *nix, have you watched your iostat statistics? Maybe something
is hammering your hds.
Something seems amiss.

What lucene methods were pointed to as hotspots by YourKit?

-M

On Tue, Feb 26, 2008 at 2:13 PM, Jamie <[EMAIL PROTECTED]> wrote:

> Hi Michael
>
> Perhaps this will help. We are using Lucene to index emails and provide
> a search interface to search through those emails. Many of our customers
> have 3-5 TB's or more of email data. The index size tends to be around 5
> GB per million messages. On a 3 GHZ intel core duo with standard 7200 mb
> drive, it takes approx. 10 seconds to search across a million emails. We
> need sub second search times, especially since, as time progresses, some
> of our archives are expected to reach 10-20 TB of data. In future, we
> will be recommending the use of SSD drives, but I'd like to know if they
> are any other strategies can pursued. One such strategy is to
> automatically create a new index after the index gets to a certain size.
> Then, when a search is conducted, based on date, search only those
> indexes that fall between specified dates. I've run my code through the
> YourKit profiler. The time appears to be consumed by Lucene itself and
> not by my code.
>
> Any other ideas?
>
>
> Michael Stoppelman wrote:
> > On Tue, Feb 26, 2008 at 10:18 AM, Jamie <[EMAIL PROTECTED]> wrote:
> >
> >
> >> Hi
> >>
> >> I am looking for a way to improve the search performance of my
> >> application. I've followed every suggestion in the Lucene Wiki but the
> >> search is still too slow with large indexes. I was wondering whether
> >>
> >
> >
> > Did you optimize your index yet? That gave me a 2x bump.
> >
> > Have you put timers around parts of your code? Maybe it's something
> > unrelated to lucene.
> > You should probably give more details on your setup if you want more
> helpful
> > advice.
> >
> >
> >
> >> there was a way to restrict a search to a specific time period and in
> >> doing so sacrifice the quality of search results? Any other suggestions
> >> on how to improve search performance?
> >>
> >> Much appreciate
> >>
> >> Jamie
> >>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: [EMAIL PROTECTED]
> >> For additional commands, e-mail: [EMAIL PROTECTED]
> >>
> >>
> >>
> >
> >
>
>
> --
> Stimulus Software - MailArchiva
> Email Archiving And Compliance
> USA Tel: +1-713-366-8072 ext 3
> UK Tel: +44-20-80991035 ext 3
> Email: [EMAIL PROTECTED]
> Web: http://www.mailarchiva.com
>
> To receive MailArchiva Enterprise Edition product announcements, send a
> message to: <[EMAIL PROTECTED]>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
>

Reply via email to