I think Jason meant "15-20GB segments"? Otis -- Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch
________________________________ From: Jason Rutherglen <jason.rutherg...@gmail.com> To: java-user@lucene.apache.org Sent: Wed, January 13, 2010 5:54:38 PM Subject: Re: Max Segmentation Size when Optimizing Index Yes... You could hack LogMergePolicy to do something else. I use optimise(numsegments:5) regularly on 80GB indexes, that if optimized to 1 segment, would thrash the IO excessively. This works fine because 15-20GB indexes are plenty large and fast. On Wed, Jan 13, 2010 at 2:44 PM, Trin Chavalittumrong <mrt...@gmail.com> wrote: > Seems like optimize() only cares about final number of segments rather than > the size of the segment. Is it so? > > On Wed, Jan 13, 2010 at 2:35 PM, Jason Rutherglen < > jason.rutherg...@gmail.com> wrote: > >> There's a different method in LogMergePolicy that performs the >> optimize... Right, so normal merging uses the findMerges method, then >> there's a findMergeOptimize (method names could be inaccurate). >> >> On Wed, Jan 13, 2010 at 2:29 PM, Trin Chavalittumrong <mrt...@gmail.com> >> wrote: >> > Do you mean MergePolicy is only used during index time and will be >> ignored >> > by by the Optimize() process? >> > >> > >> > On Wed, Jan 13, 2010 at 1:57 PM, Jason Rutherglen < >> > jason.rutherg...@gmail.com> wrote: >> > >> >> Oh ok, you're asking about optimizing... I think that's a different >> >> algorithm inside LogMergePolicy. I think it ignores the maxMergeMB >> >> param. >> >> >> >> On Wed, Jan 13, 2010 at 1:49 PM, Trin Chavalittumrong <mrt...@gmail.com >> > >> >> wrote: >> >> > Thanks, Jason. >> >> > >> >> > Is my understanding correct that >> >> LogByteSizeMergePolicy.setMaxMergeMB(100) >> >> > will prevent >> >> > merging of two segments that is larger than 100 Mb each at the >> optimizing >> >> > time? >> >> > >> >> > If so, why do think would I still see segment that is larger than 200 >> MB? >> >> > >> >> > >> >> > >> >> > On Wed, Jan 13, 2010 at 1:43 PM, Jason Rutherglen < >> >> > jason.rutherg...@gmail.com> wrote: >> >> > >> >> >> Hi Trin, >> >> >> >> >> >> There was recently a discussion about this, the max size is >> >> >> for the before merge segments, rather than the resultant merged >> >> >> segment (if that makes sense). It'd be great if we had a merge >> >> >> policy that limited the resultant merged segment, though that'd >> >> >> by a rough approximation at best. >> >> >> >> >> >> Jason >> >> >> >> >> >> On Wed, Jan 13, 2010 at 1:36 PM, Trin Chavalittumrong < >> mrt...@gmail.com >> >> > >> >> >> wrote: >> >> >> > Hi, >> >> >> > >> >> >> > >> >> >> > >> >> >> > I am trying to optimize the index which would merge different >> segment >> >> >> > together. Let say the index folder is 1Gb in total, I need each >> >> >> segmentation >> >> >> > to be no larger than 200Mb. I tried to use *LogByteSizeMergePolicy >> >> *and >> >> >> > setMaxMergeMB(100) to ensure no segment after merging would be >> 200Mb. >> >> >> > However, I still see segment that are larger than 200Mb. I did call >> >> >> > IndexWriter.optimize(20) to make sure there are enough number >> >> >> segmentation >> >> >> > to allow each segment to be under 200Mb. >> >> >> > >> >> >> > >> >> >> > >> >> >> > Can someone let me know if I am using this right? Or any suggestion >> on >> >> >> how >> >> >> > to tackle this would be helpful. >> >> >> > >> >> >> > >> >> >> > >> >> >> > Thanks, >> >> >> > >> >> >> > Trin >> >> >> > >> >> >> >> >> >> --------------------------------------------------------------------- >> >> >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >> >> >> For additional commands, e-mail: java-user-h...@lucene.apache.org >> >> >> >> >> >> >> >> > >> >> >> >> --------------------------------------------------------------------- >> >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >> >> For additional commands, e-mail: java-user-h...@lucene.apache.org >> >> >> >> >> > >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >> For additional commands, e-mail: java-user-h...@lucene.apache.org >> >> > --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org