Hi Guerre,

The reason you haven't received any answers yet is because this is pretty 
impossible the answer.....and so I'll try to answer your questions now, at 3:40 
AM. ;)

----- Original Message ----
From: Guerre Bear <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Wednesday, October 18, 2006 3:14:00 PM
Subject: Scalability Questions

Hello All,

Lucene looks very interesting to me.  I was wondering if any of you could
comment on a few questions:

1) Assuming I use a typical server such as a dual-core dual-processor Dell
2950, about how many files can Lucene index and still have a sub-two-second
search speed for a simple search string such as "invoice 2005 mitsubishi"?
For the sake of argument, I figure that a typical file will have about 30KB
of text in it.

Impossible to answer.  It would depend whether your index is in RAM (cached) or 
not, how many queries are hitting it symultaneously, how complex the queries 
are, how deep in search results you are going, etc.

2) How many of these servers would it take to manage an index of one billion
such files?

Hm, 9 zeros.... my semi-wild guess is that this would require 100+ search 
servers.

3) Are there any HOWTO's on constructing a large Lucene search cluster?

Nope.  People have done it, I've done some of it, but there is no good and 
publicly available information.  This would make a good case study for Lucene 
in Action 2, but my guess is that no company would want to share their secrets.

4) Roughly how large is the index file in comparison to the size of the
input files?

It depends on whether you store fields or just index them, plus there is also a 
compression (gzip -9 equivalent) option.

5) How does Lucene's search performance/scalability compare to some of the
expensive commercial search products such as Fast?  (www.fastsearch.com)

I can't go into details here, but I have information from a few _very_ reliable 
sources that you should think 2-3-4-5.... times before going with FAST.  Sorry 
I can't be more specific.

Otis





---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to