Hi Kapil, I haven't seen this particular issue before. Do you have any stack traces of the exception from the OplogCompactor? What sort of operations are you doing? puts?
-Dan

On Tue, Oct 4, 2016 at 1:16 AM, Kapil Goyal <goy...@vmware.com> wrote:

> Hi All,
>
> We have been testing a single cache node with a lot of data recently and
> frequently run into this error:
>
> [info 2016/09/29 06:16:06.823 UTC <OplogCompactor nsxDiskStore for oplog
> oplog#6> tid=0x19] OplogCompactor for nsxDiskStore compaction oplog id(s):
> oplog#6
> [info 2016/09/29 06:16:08.232 UTC <OplogCompactor nsxDiskStore for oplog
> oplog#6> tid=0x19] compaction did 6,310 creates and updates in 1,408 ms
> [info 2016/09/29 06:16:08.248 UTC <Oplog Delete Task> tid=0x19] Deleted
> oplog#6 crf for disk store nsxDiskStore.
> [info 2016/09/29 06:16:08.256 UTC <Oplog Delete Task> tid=0x19] Deleted
> oplog#6 krf for disk store nsxDiskStore.
> [info 2016/09/29 06:16:08.256 UTC <Oplog Delete Task> tid=0x19] Deleted
> oplog#6 drf for disk store nsxDiskStore.
> [info 2016/09/29 06:17:03.887 UTC <Event Processor for
> GatewaySender_AsyncEventQueue_txLogEventQueue> tid=0x19] Created oplog#8 drf
> for disk store nsxDiskStore.
> [info 2016/09/29 06:17:03.911 UTC <Event Processor for
> GatewaySender_AsyncEventQueue_txLogEventQueue> tid=0x19] Created oplog#8 crf
> for disk store nsxDiskStore.
> [info 2016/09/29 06:17:04.031 UTC <Idle OplogCompactor> tid=0x19] Created
> oplog#7 krf for disk store nsxDiskStore.
> [info 2016/09/29 06:17:04.314 UTC <OplogCompactor nsxDiskStore for oplog
> oplog#7> tid=0x19] OplogCompactor for nsxDiskStore compaction oplog id(s):
> oplog#7
> [error 2016/09/29 06:17:16.075 UTC <OplogCompactor nsxDiskStore for oplog
> oplog#7> tid=0x19] A DiskAccessException has occurred while writing to the
> disk for disk store nsxDiskStore. The cache will be closed.
> com.gemstone.gemfire.cache.DiskAccessException: For DiskStore:
> nsxDiskStore: Failed writing key to
> "/common/nsxapi/data/self/BACKUPnsxDiskStore_7", caused by
> java.io.IOException: Stream Closed
>     at com.gemstone.gemfire.internal.cache.Oplog.flushAll(Oplog.java:5235)
>
> From the logs it appears there may be a race between threads "Idle
> OplogCompactor" and "OplogCompactor nsxDiskStore for oplog oplog#7". I see
> that both are doing operations related to oplog#7. The former logs creation
> of a KRF file, while the latter is trying to access either the DRF or the
> CRF file. Now, is it possible that "Idle OplogCompactor" closed the DRF/CRF
> files for oplog#7 as part of creating the KRF for the same? This is what
> GemFire docs say about it:
>
> "After the oplog is closed, GemFire also attempts to create a krf file,
> which contains the key names as well as the offset for the value within the
> crf file."
>
> Based on the above, it's possible that oplog#7 was already closed and its
> KRF was already created, when the compactor tried to access the files.
>
> Have any of you run into this error before? Any suggestions?
>
> Thanks
> Kapil
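
For anyone following the thread, here is a minimal sketch of how a "Stream Closed" IOException can come about in plain Java: a write is attempted on a file handle that has already been closed. This is not GemFire code and does not show that the Oplog hits this exact path. The class name StreamClosedDemo, the method writeAfterClose and the temp file name are all made up for illustration, and the close() call simply stands in for "some other thread closed the file first".

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

public class StreamClosedDemo {

    // Hypothetical helper: writes to a file, closes it, then writes again.
    // Returns the message of the IOException raised by the second write,
    // or null if that write unexpectedly succeeds.
    static String writeAfterClose() throws IOException {
        File tmp = File.createTempFile("oplog-demo", ".crf");
        tmp.deleteOnExit();

        RandomAccessFile raf = new RandomAccessFile(tmp, "rw");
        raf.write(new byte[] {1, 2, 3});

        // Stands in for a second thread closing the file underneath the writer.
        raf.close();

        try {
            raf.write(new byte[] {4, 5, 6});
            return null;
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("write after close failed with: " + writeAfterClose());
    }
}
```

If the race described above is real, the stack trace should show the write or flush running on the compactor thread while the close happened on a different thread, so a thread dump or finer-grained logging around the time of the error would help confirm or rule it out.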