I'm aware of a single index with the following characteristics:

Index size: 33.2 GB
Documents: 263 million
Searchable fields: 7
Query response times: under 1 second for a single-term search;
anything from 5-20 seconds for more complex searches (e.g. fuzzy matching on
multiple fields)

This is an extreme example. We normally use a distributed architecture with
multiple indexes/servers to deal with data on this scale (as Andrzej also
suggests in his mail).
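
To make the "multiple indexes/servers" idea concrete, here is a rough sketch in
plain Java of the three pieces such a setup usually needs: deciding how many
partitions to use, routing a document to a partition, and merging per-partition
results into one ranked list. None of this is Lucene API. The class
ShardSketch, the Hit type and all method names are hypothetical, and the
10 million documents-per-partition figure is simply the rule of thumb from
Andrzej's mail, not a hard limit. It also assumes scores from different
partitions are directly comparable, which is not automatically true when each
index has its own term statistics.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class ShardSketch {

    /** One scored hit from one partition (hypothetical type, not a Lucene class). */
    public static final class Hit {
        public final int shard;
        public final int docId;
        public final float score;

        public Hit(int shard, int docId, float score) {
            this.shard = shard;
            this.docId = docId;
            this.score = score;
        }
    }

    /** Ceiling division: partitions needed so none holds more than maxDocsPerShard. */
    public static int shardsNeeded(long totalDocs, long maxDocsPerShard) {
        return (int) ((totalDocs + maxDocsPerShard - 1) / maxDocsPerShard);
    }

    /** Routes a document key to a partition; floorMod keeps the result non-negative. */
    public static int shardFor(String key, int shardCount) {
        return Math.floorMod(key.hashCode(), shardCount);
    }

    /** Concatenates per-partition hits and keeps the k highest-scoring ones. */
    public static List<Hit> mergeTopK(List<List<Hit>> perShard, int k) {
        List<Hit> all = new ArrayList<>();
        for (List<Hit> hits : perShard) {
            all.addAll(hits);
        }
        // Sort by descending score (assumes scores are comparable across partitions).
        all.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return new ArrayList<>(all.subList(0, Math.min(k, all.size())));
    }

    public static void main(String[] args) {
        // 263 million documents at roughly 10 million per partition.
        System.out.println(shardsNeeded(263_000_000L, 10_000_000L)); // 27

        List<List<Hit>> perShard = new ArrayList<>();
        perShard.add(List.of(new Hit(0, 11, 0.9f), new Hit(0, 12, 0.4f)));
        perShard.add(List.of(new Hit(1, 7, 0.7f)));
        for (Hit h : mergeTopK(perShard, 2)) {
            System.out.println("shard=" + h.shard + " doc=" + h.docId + " score=" + h.score);
        }
    }
}
```

In a real deployment each partition would be queried in parallel on its own
server and only the top few hits per partition would be sent back for merging,
so the merge step stays cheap even when the total collection is very large.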

Cheers,
Mark


----- Original Message ----
From: Andrzej Bialecki <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Friday, 26 January, 2007 8:32:01 AM
Subject: Re: How many documents in the biggest Lucene index to date?

Bill Taylor wrote:
> I have used Lucene to index a small collection - only a few hundred 
> documents.  I have a potential client who wants to index a collection 
> which will start at about a million documents and could easily grow to 
> two million.
>
> Has anyone used Lucene with an index that large?

I'm working on a regular basis with indexes containing several million 
documents. Somewhere around 10 million documents, depending on the hardware 
and the type of queries, you will need to consider splitting the index 
into parts distributed over several machines - but below that size you 
should be able to get sub-second responses to typical queries.

-- 
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]






                