It's really hard to say anything meaningful here. How many fields? What kind of sorting do you intend to do? How complex are the queries you expect?
And even if you have meaningful answers to the above, the answer is still "it depends" (tm). You could also go to Solr (which is built on Lucene) to handle distributed searching and a host of other infrastructure issues. There are certainly Lucene installations out there that are much larger than the one you're considering, if that helps.

But you can create a small test app *very* quickly that'll help you answer this for your local set of conditions, which might be a good place to start.

Don't forget the "powered by" section of the Wiki for some ideas: http://wiki.apache.org/lucene-java/PoweredBy

Best
Erick

On Thu, Sep 24, 2009 at 11:17 AM, Joel Halbert <j...@su3analytics.com> wrote:

> Hi,
>
> Does anyone know of any recent metrics & stats on building out an index
> of ~100mm documents (each doc approx 5k). I'm looking for approx stats
> on time to build, time to query and infrastructure requirements (number
> of machines & spec) to reasonably support an index of such a size.
>
> Thanks,
> Joel
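
For the small test app suggested in the reply, a back-of-envelope calculation can help frame what to measure. The sketch below is only that: it uses the figures from the question (~100mm documents at roughly 5k each, taken here as 5,000 bytes) and extrapolates build time linearly from a timed sample run. The class name, method names, and the sample numbers in `main` are hypothetical placeholders, not measurements, and real indexing time rarely scales perfectly linearly, so treat the result as a rough starting estimate.

```java
import java.util.Locale;

public class IndexSizingEstimate {

    // Raw (unindexed) corpus size in decimal gigabytes (1 GB = 1e9 bytes).
    static double rawCorpusGb(long docCount, long avgDocBytes) {
        return (double) docCount * avgDocBytes / 1e9;
    }

    // Assumption: throughput measured on a sample stays constant for the
    // full corpus. This is a simplification, not a property of Lucene.
    static double estimatedBuildHours(long docCount, long sampleDocs, double sampleSeconds) {
        double docsPerSecond = sampleDocs / sampleSeconds;
        return docCount / docsPerSecond / 3600.0;
    }

    public static void main(String[] args) {
        long docCount = 100_000_000L; // from the question: ~100mm documents
        long avgDocBytes = 5_000L;    // from the question: ~5k per doc (assumed bytes)

        // Hypothetical placeholders: replace with numbers from your own test run.
        long sampleDocs = 1_000_000L;
        double sampleSeconds = 600.0;

        System.out.printf(Locale.ROOT, "Raw corpus: %.0f GB%n",
                rawCorpusGb(docCount, avgDocBytes));
        System.out.printf(Locale.ROOT, "Estimated build time: %.1f hours%n",
                estimatedBuildHours(docCount, sampleDocs, sampleSeconds));
    }
}
```

With the question's figures the raw corpus comes to about 500 GB. The build-time figure is only as good as the sample timing you feed in, so it is worth timing a sample that uses your real fields and analyzers.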