*"I am wondering if your multiple-entrylogs approach is making things complicated. I have been thinking there can be a simpler approach achieving the same goal: for example, having a ledger storage comprised of N interleaved/sorted ledger storages, which they share same LedgerCache, but having different memtables (for sortedledgerstore) and different entry log files. "*
First of all, I'm not convinced that dealing with MultipleEntryLogs at the EntryLogger level is more complicated. As far as I understand, composition (containing N) at the LedgerStorage level is even more complicated than composition at the EntryLogger level:

- The crux of the complication in the approach described in https://issues.apache.org/jira/browse/BOOKKEEPER-1041 is maintaining state information (the mapping of ledgerid to slotid) and, in the event of a ledgerdir becoming full, updating that mapping to an available slot. The same amount of complication applies to composition at the LedgerStorage level: when addEntry arrives at ComposedLedgerStorage, there has to be logic deciding which LedgerStorage it should go to. So composing at the LedgerStorage level will be at least as complicated as composing at the EntryLogger level.

- With the ComposedLedgerStorage approach (Composite Pattern), where ComposedLedgerStorage manages N InterleavedLedgerStorage/SortedLedgerStorage instances, there is going to be a considerable change in the way resources and state information are handled. Each Interleaved/SortedLedgerStorage will have its own EntryLogger (to serve our MultipleEntryLogs feature), but everything else needs to be shared, mainly 'LedgerCache', 'gcThread' and 'activeLedgers'. So there is going to be major churn in the code to move the operations dealing with shared resources and state into ComposedLedgerStorage and leave the rest in InterleavedLedgerStorage. The amount of change required in test code is even greater. We have to do all this while providing backward compatibility (single EntryLog). This raises the question of where the composition should happen. As far as I can tell, it should happen at the lowest possible level rather than at a higher level, which needs extra band-aid effort to handle the common resources/state separately from the multiplexable resources.

- With composition at the EntryLogger level, the getEntry/readEntry path is mostly unchanged in my current implementation. With composition at the LedgerStorage level, the multiplexing/redirection has to happen even for readEntry/getEntry.

- More importantly, LedgerStorage is not the only consumer of EntryLogger; GarbageCollectorThread is one too. It calls quite a few EntryLogger methods (entryLogger.addEntry, entryLogger.flush, entryLogger.removeEntryLog, entryLogger.scanEntryLog, entryLogger.getLeastUnflushedLogId, entryLogger.logExists, entryLogger.getEntryLogMetadata). This is going to be an issue with ComposedLedgerStorage, because there each EntryLogger is responsible for just one EntryLog. For GarbageCollectorThread to work correctly with multiple EntryLogs, composition is required there as well, which is a double whammy when it comes to implementation.

- I remember Sijie mentioning that there is a WI to improve the GarbageCollectorThread compaction logic, to segregate the entries of ledgers based on lifespan while compacting entrylogs. With a correct implementation of MultipleEntryLogs, that logic/implementation shouldn't be affected, but with ComposedLedgerStorage it will have the same issues as I mentioned in the point above.

That being said, one difference ComposedLedgerStorage could make is that each SortedLedgerStorage would have its own EntryMemTable. With the current implementation of MultipleEntryLogs, there is going to be just one EntryMemTable. Now the question is: how beneficial is it to have multiple EntryMemTables? Andrey mentioned that the current EntryMemTable's snapshot flush processes entries sequentially, but this can be parallelized by using a ThreadPoolExecutor to process the entries, and Sijie confirmed that synchronization is not needed for InterleavedLedgerStorage.processEntry and InterleavedLedgerStorage.addEntry. The size of the EntryMemTable is already configurable.
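
To make the parallel-flush idea a bit more concrete, here is a rough sketch under stated assumptions; it is not the actual EntryMemTable code. `SnapshotEntry`, `EntrySink` and `ParallelSnapshotFlusher` are hypothetical names, with `EntrySink` standing in for a thread-safe `processEntry`. Grouping the snapshot by ledger, so that entries of one ledger are still handed over in order, is my own assumption about what a reasonable flush would want.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical stand-in for one memtable entry; not a BookKeeper class. */
final class SnapshotEntry {
    final long ledgerId;
    final long entryId;

    SnapshotEntry(long ledgerId, long entryId) {
        this.ledgerId = ledgerId;
        this.entryId = entryId;
    }
}

/** Hypothetical callback standing in for a thread-safe processEntry. */
interface EntrySink {
    void process(SnapshotEntry entry) throws Exception;
}

final class ParallelSnapshotFlusher {
    private final ExecutorService pool;

    ParallelSnapshotFlusher(int threads) {
        this.pool = Executors.newFixedThreadPool(threads);
    }

    /**
     * Flushes one snapshot. Entries are grouped by ledger and each group is
     * submitted as one task, so different ledgers are processed concurrently
     * while the order of entries within a ledger is preserved.
     */
    void flush(List<SnapshotEntry> snapshot, EntrySink sink)
            throws InterruptedException, ExecutionException {
        Map<Long, List<SnapshotEntry>> byLedger = new LinkedHashMap<>();
        for (SnapshotEntry e : snapshot) {
            byLedger.computeIfAbsent(e.ledgerId, k -> new ArrayList<>()).add(e);
        }

        List<Future<?>> pending = new ArrayList<>();
        for (List<SnapshotEntry> group : byLedger.values()) {
            // Returning a value makes this lambda a Callable, so the sink
            // is allowed to throw checked exceptions.
            pending.add(pool.submit(() -> {
                for (SnapshotEntry e : group) {
                    sink.process(e);
                }
                return null;
            }));
        }

        // Wait for every group; get() rethrows a task failure wrapped in
        // an ExecutionException.
        for (Future<?> f : pending) {
            f.get();
        }
    }

    void shutdown() {
        pool.shutdown();
    }
}
```

The pool size would play the role that the number of memtables plays in the ComposedLedgerStorage proposal, which is why a single EntryMemTable does not have to mean a sequential flush.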

Also, with MultipleEntryLogs there is not much need for SortedLedgerStorage, because entries of different ledgers are inherently segregated and there won't be much interleaving even if we use InterleavedLedgerStorage. In addition, InterleavedLedgerStorage helps avoid the unpredictable tail latency caused by SortedLedgerStorage's EntryMemTable flush.

@sijie The code is almost ready for a pull request; I just need to do some final touches. In the meantime you may check the code at https://github.com/reddycharan/bookkeeper/tree/multipleentrylogs . It is rebased onto the current code, so it should be good.

Thanks,
Charan

On Mon, Jul 17, 2017 at 2:25 PM, Venkateswara Rao Jujjuri <jujj...@gmail.com> wrote:

> On Fri, Jul 14, 2017 at 6:00 PM, Sijie Guo <guosi...@gmail.com> wrote:
>
>> On Sat, Jul 15, 2017 at 8:06 AM, Charan Reddy G <reddychara...@gmail.com> wrote:
>>
>>> Hey,
>>>
>>> In InterleavedLedgerStorage, since the initial version of it
>>> (https://github.com/apache/bookkeeper/commit/4a94ce1d8184f5f38def015d80777a8113b96690
>>> and https://github.com/apache/bookkeeper/commit/d175ada58dcaf78f0a70b0ebebf489255ae67b5f),
>>> addEntry and processEntry methods are synchronized. If it is synchronized then I don't
>>> get what is the point in having 'writeThreadPool' in
>>> BookieRequestProcessor, if anyhow they are going to be executed
>>> sequentially because of synchronized addEntry method in
>>> InterleavedLedgerStorage.
>>>
>>
>> When InterleavedLedgerStore is used in the context of SortedLedgerStore,
>> the addEntry and processEntry are only called when flushing happened. The
>> flushing happens in background thread, which is effectively running
>> sequentially. But adding to the memtable happens concurrently.
>>
>> The reason of having 'writeThreadPool' is more on separating writes and
>> reads into different thread pools, so writes will not be affected by reads.
>> In the context of SortedLedgerStore, the 'writeThreadPool' adds the
>> concurrency.
>>
>>> If we look at the implementation of addEntry and processEntry method,
>>> 'somethingWritten' can be made threadsafe by using AtomicBoolean,
>>> ledgerCache.updateLastAddConfirmed and entryLogger.addEntry methods are
>>> inherently threadsafe.
>>>
>>> I'm not sure about semantics of ledgerCache.putEntryOffset method here.
>>> I'm not confident enough to say if LedgerCacheImpl and IndexInMemPageMgr
>>> (and probably IndexPersistenceMgr) are thread-safe classes.
>>>
>>
>> LedgerCacheImpl and IndexInMemPageMgr are thread-safe classes. You can
>> confirm this from SortedLedgerStorage.
>>
>>> As far as I understood, if ledgerCache.putEntryOffset is thread safe,
>>> then I don't see the need of synchronization for those methods. In any case,
>>> if they are not thread-safe can you please say why it is not thread-safe
>>> and how we can do more granular synchronization at LedgerCacheImpl level,
>>> so that we can remove the need of synchronization at
>>> InterleavedLedgerStorage level.
>>>
>>
>> I don't see any reason why we can't remove the synchronization.
>>
>>> I'm currently working on Multiple Entrylogs -
>>> https://issues.apache.org/jira/browse/BOOKKEEPER-1041.
>>>
>>
>> I am wondering if your multiple-entrylogs approach is making things
>> complicated. I have been thinking there can be a simpler approach achieving
>> the same goal: for example, having a ledger storage comprised of N
>> interleaved/sorted ledger storages, which they share same LedgerCache, but
>> having different memtables (for sortedledgerstore) and different entry log
>> files.
>>
>
> This is a cleaner approach. @charan can you comment?
>
> JV
>
>>> To reap the benefits of multipleentrylogs feature from performance
>>> perspective, this synchronization should be taken care or at least bring it
>>> down to more granular synchronization (if possible).
>>>
>>>     @Override
>>>     synchronized public long addEntry(ByteBuffer entry) throws IOException {
>>>         long ledgerId = entry.getLong();
>>>         long entryId = entry.getLong();
>>>         long lac = entry.getLong();
>>>         entry.rewind();
>>>         processEntry(ledgerId, entryId, entry);
>>>         ledgerCache.updateLastAddConfirmed(ledgerId, lac);
>>>         return entryId;
>>>     }
>>>
>>>     synchronized protected void processEntry(long ledgerId, long entryId,
>>>             ByteBuffer entry, boolean rollLog) throws IOException {
>>>         somethingWritten = true;
>>>         long pos = entryLogger.addEntry(ledgerId, entry, rollLog);
>>>         ledgerCache.putEntryOffset(ledgerId, entryId, pos);
>>>     }
>>>
>>> Thanks,
>>> Charan
>>>
>
> --
> Jvrao
> ---
> First they ignore you, then they laugh at you, then they fight you, then
> you win. - Mahatma Gandhi
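
P.S. For reference, the unsynchronized variant discussed in the quoted thread could look roughly like the sketch below. It is only an illustration: `EntryLog`, `OffsetIndex` and `UnsynchronizedLedgerStorage` are hypothetical, cut-down stand-ins for EntryLogger, LedgerCache and InterleavedLedgerStorage, and both collaborators are assumed to be thread-safe, as discussed in the thread. Passing `true` for rollLog in addEntry is also an assumption.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical minimal view of the entry logger; assumed thread-safe. */
interface EntryLog {
    long addEntry(long ledgerId, ByteBuffer entry, boolean rollLog) throws IOException;
}

/** Hypothetical minimal view of the ledger cache; assumed thread-safe. */
interface OffsetIndex {
    void putEntryOffset(long ledgerId, long entryId, long offset) throws IOException;

    void updateLastAddConfirmed(long ledgerId, long lac);
}

final class UnsynchronizedLedgerStorage {
    // AtomicBoolean replaces the plain boolean flag that the
    // synchronized methods used to protect.
    private final AtomicBoolean somethingWritten = new AtomicBoolean(false);
    private final EntryLog entryLogger;
    private final OffsetIndex ledgerCache;

    UnsynchronizedLedgerStorage(EntryLog entryLogger, OffsetIndex ledgerCache) {
        this.entryLogger = entryLogger;
        this.ledgerCache = ledgerCache;
    }

    /** Same steps as the quoted addEntry, without the method-level lock. */
    public long addEntry(ByteBuffer entry) throws IOException {
        long ledgerId = entry.getLong();
        long entryId = entry.getLong();
        long lac = entry.getLong();
        entry.rewind();
        // Assumption: rollLog defaults to true on this path.
        processEntry(ledgerId, entryId, entry, true);
        ledgerCache.updateLastAddConfirmed(ledgerId, lac);
        return entryId;
    }

    /** Same steps as the quoted processEntry, without the method-level lock. */
    protected void processEntry(long ledgerId, long entryId, ByteBuffer entry, boolean rollLog)
            throws IOException {
        somethingWritten.set(true);
        long pos = entryLogger.addEntry(ledgerId, entry, rollLog);
        ledgerCache.putEntryOffset(ledgerId, entryId, pos);
    }

    boolean isSomethingWritten() {
        return somethingWritten.get();
    }
}
```

Each call works only on its own buffer and local variables, so correctness rests entirely on the thread safety of the two collaborators. Whatever code reads and resets the flag during flush would need the same care.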