On 28.8.15 2:41 , Thomas Mueller wrote:
Hi,

I'm not an expert on compaction, but the differences I see from the current
approach are:

* No compaction maps. No memory problem. No persistent compaction map. As
far as I understand, you can currently have _multiple_ compaction maps at
the same time. I think that persisting the compaction map is problematic
for code complexity and performance reasons, especially with large
repositories. As for performance, it depends a lot on how the compaction
maps are stored (randomized access patterns will hurt performance a lot).
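For readers not familiar with the data structure being discussed: conceptually, a compaction map redirects references from pre-compaction records to their compacted copies. The sketch below uses made-up types (RecordId, put, resolve are illustrative, not Oak's real API) just to show why its memory footprint grows with the number of compacted records and why persisting it implies random-access lookups:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hedged sketch with hypothetical types, not Oak's actual implementation:
// a compaction map records, for each compacted node record, where its new
// copy lives, so references into old segments can be redirected.
public class CompactionMapSketch {

    // A record id: which segment it lives in, and the offset within it.
    record RecordId(UUID segmentId, int offset) {}

    private final Map<RecordId, RecordId> oldToNew = new HashMap<>();

    // Remember that 'before' was rewritten as 'after' during compaction.
    void put(RecordId before, RecordId after) {
        oldToNew.put(before, after);
    }

    // Resolve a reference: follow the mapping if the record was compacted,
    // otherwise the reference is returned unchanged.
    RecordId resolve(RecordId id) {
        return oldToNew.getOrDefault(id, id);
    }

    // One entry per compacted record: memory grows with repository size.
    int size() {
        return oldToNew.size();
    }
}
```

Each `resolve` call is a point lookup keyed by an effectively random id, which is why a persisted map with randomized access patterns would hurt performance, as noted above.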

Good point. This circles back to what Chetan had in mind originally, i.e. doing the mapping via the path. Only that with your approach we would do it on top of the NodeStore as opposed to within the NodeStore.
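To make the "on top of the NodeStore" idea concrete: compaction then amounts to walking the head revision by path and deep-copying it into a fresh store, so no new node can ever reference a pre-compaction record. The sketch below uses a deliberately simplified Node type (not Oak's NodeStore/NodeState API) to illustrate the shape of such a copy:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hedged sketch with made-up types, not Oak's NodeStore API: compaction as
// a full recursive copy of the content tree into new storage.
public class CopyCompactionSketch {

    // A simplified content node: properties plus named children.
    static class Node {
        final Map<String, String> properties = new LinkedHashMap<>();
        final Map<String, Node> children = new LinkedHashMap<>();
    }

    // Deep-copy the tree rooted at 'source'. The copy shares nothing with
    // the source, so the old store can be dropped wholesale afterwards.
    static Node compact(Node source) {
        Node copy = new Node();
        copy.properties.putAll(source.properties);
        for (Map.Entry<String, Node> e : source.children.entrySet()) {
            copy.children.put(e.getKey(), compact(e.getValue()));
        }
        return copy;
    }
}
```

Once readers are switched over to the copy, the old store holds no reachable records, which is what makes the "no compaction map" property possible in the first place.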

Memory references: with a restart, old memory references are gone, so the
old segment store can be removed fully, without risk. Right now, at least
with the version of Oak I have tested (1.2.3, I think), running online
compaction multiple times, each time with a restart, did not shrink the
repository (the size is 3 times the size of a fully compacted repo, with
very few writes). Without a restart, access to very old objects can result
in an easy-to-understand exception message.

AFAIK your tests re. restart have been done from within an OSGi container (AEM), restarting the repository bundle. I could easily imagine this not being sufficient to get rid of all the in-memory references. There is likely tons of stuff still lurking around even after a bundle restart. Your approach would most likely suffer from the same problem.

Overall it seems like a promising approach though. As you pointed out, it would be more modular and would help us get rid of the compaction map. So I'm all +1 for trying it out.

Michael



Regards,
Thomas




On 28/08/15 13:57, "Michael Dürig" <[email protected]> wrote:


AFAIU this is pretty much what we are now doing under the hood. That is,
your proposal would make the compaction step more explicit and visible
above the node store API.
An advantage of your approach is preventing mixed segments altogether
(i.e. compacted segments still referring to uncompacted ones). This is
something we had problems with in the past, which I however believe we
have solved by now.
However, your approach most likely suffers from the same problems we
currently have re. contention, performance, in-memory references, ...

Michael



On 28.8.15 10:05 , Thomas Mueller wrote:
Hi,

I thought about SegmentStore compaction and made a few slides:

http://www.slideshare.net/ThomasMueller12/multi-store-compaction

Feedback is welcome! The idea is at quite an early stage, so if you
don't understand or agree with some items, I'm to blame.

Regards,
Thomas

