Hi Joel,

A couple of quick points.
1. The metric is for indexing only.
2. It is 1000 docs/minute (sorry for the earlier 1000/sec goof-up).
3. Regarding search/query, it depends on many parameters (similarity, proximity, synonym lookup, etc.).

Sincerely,
Sithu D Sudarsan

-----Original Message-----
From: Sudarsan, Sithu D. [mailto:sithu.sudar...@fda.hhs.gov]
Sent: Thursday, September 24, 2009 1:11 PM
To: java-user@lucene.apache.org
Subject: RE: metrics for index ~100M docs

Hi Joel,

With approx. 100K doc size, on a dual quad-core machine (3.0 GHz), Windows platform, we average 1000 docs/sec. This includes text extraction from PDF docs.

Hope this helps.

Sincerely,
Sithu D Sudarsan

-----Original Message-----
From: Joel Halbert [mailto:j...@su3analytics.com]
Sent: Thursday, September 24, 2009 11:17 AM
To: Lucene Users
Subject: metrics for index ~100M docs

Hi,

Does anyone know of any recent metrics & stats on building out an index of ~100M documents (each doc approx 5k)? I'm looking for approx stats on time to build, time to query and infrastructure requirements (number of machines & spec) to reasonably support an index of such a size.

Thanks,
Joel

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
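
For a rough sense of what the corrected figure means for the original question, here is a back-of-the-envelope sketch in Python. It assumes the 1000 docs/minute rate holds steadily for the whole build. The function name `estimate_build_days` and the `machines` parameter are hypothetical, and linear scaling across machines is an assumption, not something reported in this thread. Note also that the rate was measured on approx. 100K documents with PDF text extraction, while the question concerns documents of approx. 5k, so the figure may not carry over directly.

```python
def estimate_build_days(total_docs: int, docs_per_minute: float, machines: int = 1) -> float:
    """Estimate days to index total_docs at a steady per-machine rate.

    Assumes throughput stays constant and (hypothetically) scales
    linearly with the number of machines.
    """
    if total_docs < 0:
        raise ValueError("total_docs must be non-negative")
    if docs_per_minute <= 0 or machines <= 0:
        raise ValueError("docs_per_minute and machines must be positive")
    minutes = total_docs / (docs_per_minute * machines)
    return minutes / (60 * 24)  # 1440 minutes per day


if __name__ == "__main__":
    # ~100M docs at the corrected rate of 1000 docs/minute, single machine.
    print(f"{estimate_build_days(100_000_000, 1000):.1f} days")  # 69.4 days
```

At 1000 docs/minute, 100M documents take 100,000 minutes, or roughly 69 days on one machine, which is why the docs/sec versus docs/minute correction matters so much here.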