Well ... yes and no?

Yes, the Log*MergePolicy will still at certain times merge the index all the way down to one segment. If mergeFactor is 10 then this will happen every "power of 10" flushed segments. Ie, after 10 flushes a merge will merge them down to 1 segment, then after 100 flushes as well, 1000 flushes, etc. These are done in the background with ConcurrentMergeScheduler. I don't really like this quality of the merge policy ... it's sort of a "pay it forward" approach. You are paying in advance for the expectation that the index is going to keep getting larger.

However, this merging does respect maxMergeMB, so if you've set that, then it will not merge down to 1 segment once you have segments over that size. So in this aspect it's different from a real call to optimize.

Mike

Mark Miller wrote:

Thanks a lot Mike...one more question:

I remember reading that a regular addDocument call could basically trigger an optimize on a given call. Is this true? Maybe not true anymore?

It doesnt sound right to me, but I do remember reading about it. This was pre background merging when it was mentioned the addDocument call could take a long time to return if basically the equiv of an optmize was triggered.

Could you clear this up for me?

Thanks
Mark

Michael McCandless wrote:

Yes this should reduce transient (while merging) disk usage. However, optimize disregards this parameter, so it will still use the same disk space. However, if you call optimize(N) then that should use less space since it does not merge all the way down to 1 segment.

Note that the limit applies to segments-to-be-merged not to the final merged segment size. Ie, any segment > maxMergeMB will never be merged, but at any given time you can easily have segments quite a bit larger than maxMergeMB.

Mike

Mark Miller wrote:

If I use LogByteSizeMergePolicy#setMaxMergeMB, can I clamp down on the space needed for optimize/merge? My thought is, if a segment is maxed out, it will never need to be copied for a merge right? So you could significantly reduce merge/optimize space requirments (now at like 2x-4x if readers can still open)?

- Mark

-------------------------------------------------------------------- -
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to