It is highly unusual to use Geode with just a single cache node. A big part of the value of an in-memory data grid is that it can provide fault tolerance and high availability for your data. Please consider running at least 3 nodes in your tests, since that is the smallest configuration Geode would likely be used with in the real world.
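
For reference, a three-server test cluster with a redundant, persistent region could be set up from gfsh roughly as sketched below. The member names, ports, disk-store name, directory, and region name are all placeholders made up for illustration. Running three servers on one host is only a testing convenience, and the exact options are worth checking against `help create region` in your version.

```
gfsh>start locator --name=locator1
gfsh>start server --name=server1 --server-port=40404
gfsh>start server --name=server2 --server-port=40405
gfsh>start server --name=server3 --server-port=40406
gfsh>create disk-store --name=testDiskStore --dir=testDiskStoreDir
gfsh>create region --name=testRegion --type=PARTITION_REDUNDANT_PERSISTENT --disk-store=testDiskStore --enable-synchronous-disk=true
```

With a region type like this, each entry is held by more than one server, so powering off a single VM should leave a live copy elsewhere in the cluster while the crashed member recovers from its own disk store.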
--
Mike Stolz
Principal Engineer, GemFire Product Manager
Mobile: 631-835-4771

On Sat, Oct 15, 2016 at 4:59 AM, Kapil Goyal <goy...@vmware.com> wrote:

> Thanks Anthony.
>
> We have already enabled synchronous disk writes to minimize data loss in
> the event of crash.
>
> From: Anthony Baker <aba...@pivotal.io>
> Reply-To: <user@geode.incubator.apache.org>
> Date: Thursday, October 13, 2016 at 8:31 PM
> To: <user@geode.incubator.apache.org>
> Subject: Re: GemFire persisted data corruption - how to debug?
>
> Hi Kapil,
>
> Geode (by default) writes data synchronously to other cluster members. If
> a node crashes like in your test, the update is preserved by the cluster
> even in the absence of persistence. Synchronous disk writes can be turned
> on (see [1]) but many users prefer to avoid the fsync performance penalty.
>
> Anthony
>
> [1] https://cwiki.apache.org/confluence/display/GEODE/Native+Disk+Persistence
>
> On Oct 13, 2016, at 6:46 PM, Kapil Goyal <goy...@vmware.com> wrote:
>
> Hi Folks,
>
> I am doing some crash testing with a single cache node of GemFire, where I
> power off the VM where cache is running and then bring it back up. Upon
> restart, GemFire refuses to come up with this error:
>
> Caused by: java.lang.NullPointerException
>   at com.gemstone.gemfire.internal.util.concurrent.CustomEntryConcurrentHashMap.keyHash(CustomEntryConcurrentHashMap.java:228) ~[gemfire-8.2.0.2.jar:?]
>   at com.gemstone.gemfire.internal.cache.AbstractRegionEntry$HashRegionEntryCreator.keyHashCode(AbstractRegionEntry.java:934) ~[gemfire-8.2.0.2.jar:?]
>   at com.gemstone.gemfire.internal.util.concurrent.CustomEntryConcurrentHashMap.get(CustomEntryConcurrentHashMap.java:1447) ~[gemfire-8.2.0.2.jar:?]
>   at com.gemstone.gemfire.internal.cache.AbstractRegionMap.getEntry(AbstractRegionMap.java:368) ~[gemfire-8.2.0.2.jar:?]
>   at com.gemstone.gemfire.internal.cache.AbstractLRURegionMap.getEntry(AbstractLRURegionMap.java:47) ~[gemfire-8.2.0.2.jar:?]
>   at com.gemstone.gemfire.internal.cache.PlaceHolderDiskRegion.getDiskEntry(PlaceHolderDiskRegion.java:93) ~[gemfire-8.2.0.2.jar:?]
>   at com.gemstone.gemfire.internal.cache.Oplog.readModifyEntry(Oplog.java:2779) ~[gemfire-8.2.0.2.jar:?]
>   at com.gemstone.gemfire.internal.cache.Oplog.readCrf(Oplog.java:1957) ~[gemfire-8.2.0.2.jar:?]
>   at com.gemstone.gemfire.internal.cache.Oplog.recoverCrf(Oplog.java:2270) ~[gemfire-8.2.0.2.jar:?]
>   at com.gemstone.gemfire.internal.cache.PersistentOplogSet.recoverOplogs(PersistentOplogSet.java:459) ~[gemfire-8.2.0.2.jar:?]
>   at com.gemstone.gemfire.internal.cache.PersistentOplogSet.recoverRegionsThatAreReady(PersistentOplogSet.java:367) ~[gemfire-8.2.0.2.jar:?]
>   at com.gemstone.gemfire.internal.cache.DiskStoreImpl.recoverRegionsThatAreReady(DiskStoreImpl.java:2065) ~[gemfire-8.2.0.2.jar:?]
>   at com.gemstone.gemfire.internal.cache.DiskStoreImpl.initializeIfNeeded(DiskStoreImpl.java:2052) ~[gemfire-8.2.0.2.jar:?]
>   at com.gemstone.gemfire.internal.cache.DiskStoreImpl.doInitialRecovery(DiskStoreImpl.java:2057) ~[gemfire-8.2.0.2.jar:?]
>   at com.gemstone.gemfire.internal.cache.DiskStoreFactoryImpl.create(DiskStoreFactoryImpl.java:135) ~[gemfire-8.2.0.2.jar:?]
>   at com.gemstone.gemfire.internal.cache.xmlcache.CacheCreation.createDiskStore(CacheCreation.java:650) ~[gemfire-8.2.0.2.jar:?]
>   at com.gemstone.gemfire.internal.cache.xmlcache.CacheCreation.create(CacheCreation.java:425) ~[gemfire-8.2.0.2.jar:?]
>   at com.gemstone.gemfire.internal.cache.xmlcache.CacheXmlParser.create(CacheXmlParser.java:331) ~[gemfire-8.2.0.2.jar:?]
>   at com.gemstone.gemfire.internal.cache.GemFireCacheImpl.loadCacheXml(GemFireCacheImpl.java:4248) ~[gemfire-8.2.0.2.jar:?]
>   at org.springframework.data.gemfire.CacheFactoryBean.init(CacheFactoryBean.java:306) ~[spring-data-gemfire-1.5.2.RELEASE.jar:1.5.2.RELEASE]
>   at org.springframework.data.gemfire.CacheFactoryBean.getObject(CacheFactoryBean.java:455) ~[spring-data-gemfire-1.5.2.RELEASE.jar:1.5.2.RELEASE]
>
> It hints at GemFire data on disk being corrupted, so I used 'gfsh' to
> verify:
>
> gfsh>validate offline-disk-store --name=nsxDiskStore --disk-dirs=/common/nsxapi/data/self
>
> Validating nsxDiskStore
> /nsx_sys/ArrayListIDPriorityModel: entryCount=0
> /nsx_sys/Crl: entryCount=0
> /nsx_sys/Certificate: entryCount=1
> ……
> Error in validating disk store nsxDiskStore is : null
>
> This confirms that the disk-store is corrupted, but doesn't give any more
> information to debug this further. How do I go about debugging this? Have
> you seen this before and are there any fixes/workarounds available?
>
> Thanks
> Kapil