Thank you, Karl!

--- On Fri, 4/9/10, Karl Wettin <karl.wet...@gmail.com> wrote:

> From: Karl Wettin <karl.wet...@gmail.com>
> Subject: Re: Lucene Partition Size
> To: java-user@lucene.apache.org
> Date: Friday, April 9, 2010, 9:39 AM
>
> It's hard for me to say why this is slow.
>
> Here are a few more questions whose answers might provide
> further clues:
>
> What were the reasons that led you to partition the index this way?
> What does the searcher implementation look like?
> What would a typical query sent to that searcher look like?
>
> I would start by loading the full 400Gb as a single index on
> local disc.
>
> karl
>
> 8 Apr 2010 at 22:07, Ivan Provalov wrote:
>
> > Karl,
> >
> > We have not done the same scale local-disk test. Our network
> > parameters are
> >
> > - Network speed: 1gb
> > - 3 partitions per volume
> > - The volumes are accessed via NFS to EMC Celerra devices. (NFS 3)
> > - The drives are 300 gb fiber attached with 10,000 rpm.
> >
> > Thanks,
> >
> > Ivan
> >
> > --- On Thu, 4/8/10, Karl Wettin <karl.wet...@gmail.com> wrote:
> >
> >> From: Karl Wettin <karl.wet...@gmail.com>
> >> Subject: Re: Lucene Partition Size
> >> To: java-user@lucene.apache.org
> >> Date: Thursday, April 8, 2010, 2:44 PM
> >>
> >> 8 Apr 2010 at 20:05, Ivan Provalov wrote:
> >>
> >>> We are using Lucene for searching of 200+ mln documents
> >>> (periodical publications). Is there any limitation on the
> >>> size of the Lucene index (file size, number of docs, etc...)?
> >>
> >> The only such limitation in Lucene I'm aware of is
> >> Integer.MAX_VALUE documents. This might also be true for the
> >> number of terms.
> >>
> >>> We are partitioning the indexes at about 10 mln documents per
> >>> partition (each partition is on a separate box, some mounted
> >>> volumes are shared among three-four partitions). We have index
> >>> size around 10% of the content size. The total content is
> >>> around 4Tb and the index around 400Gb. This content is planned
> >>> to be split into 20 partitions (10mln docs, 200Gb content size,
> >>> 20Gb index size). We are using a memory mapped index directory
> >>> implementation. Our testing is done with 600 concurrent users.
> >>>
> >>> We are seeing consistently high response times from the
> >>> partitions (4-5 seconds). Is there a number of documents per
> >>> partition limitation in Lucene for this particular scenario?
> >>
> >> I'm not sure if I got this right but it sounds like your index
> >> is mounted over network? Can you tell us some more details
> >> about that? What speeds do you see if you put the index on
> >> local disc?
> >>
> >> karl

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
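
As a sanity check on the partition sizing quoted in this thread, the arithmetic can be sketched in plain Java. The class and method names below are hypothetical, decimal units (1 Tb = 1000 Gb) are an assumption, and the only limit checked is the Integer.MAX_VALUE document ceiling Karl mentions. Nothing here calls Lucene itself.

```java
public class PartitionPlan {
    // Figures quoted in the thread; decimal units are an assumption.
    static final long TOTAL_DOCS = 200_000_000L;         // "200+ mln documents"
    static final long DOCS_PER_PARTITION = 10_000_000L;  // "about 10 mln documents per partition"
    static final long TOTAL_CONTENT_GB = 4_000L;         // "around 4Tb"
    static final long TOTAL_INDEX_GB = 400L;             // "around 400Gb"

    // Number of partitions needed, rounding up.
    static long partitions() {
        return (TOTAL_DOCS + DOCS_PER_PARTITION - 1) / DOCS_PER_PARTITION;
    }

    // Content size per partition in Gb, assuming an even split.
    static long contentGbPerPartition() {
        return TOTAL_CONTENT_GB / partitions();
    }

    // Index size per partition in Gb, assuming an even split.
    static long indexGbPerPartition() {
        return TOTAL_INDEX_GB / partitions();
    }

    // True if a document count stays within the Integer.MAX_VALUE ceiling.
    static boolean fitsDocLimit(long docs) {
        return docs <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        System.out.println("partitions: " + partitions());
        System.out.println("content Gb per partition: " + contentGbPerPartition());
        System.out.println("index Gb per partition: " + indexGbPerPartition());
        System.out.println("one partition fits doc limit: " + fitsDocLimit(DOCS_PER_PARTITION));
        System.out.println("single full index fits doc limit: " + fitsDocLimit(TOTAL_DOCS));
    }
}
```

Under these assumptions both a 10 mln document partition and a single 200 mln document index stay well below the document-count ceiling, which is consistent with Karl's suggestion to try the full 400Gb as one index on local disc before looking for a per-partition limit.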