Hi, I was experimenting with GridGain (GG) Ultimate Edition to take snapshots and hit the error below, after which the cluster stops. Note that this works in the free Ignite version, where we do not see the "Too many open files" error. Is this a bug, or are we missing some configuration?
version: gridgain-8.8.19

/bin./snapshot-utility.sh snapshot -type=full

[21:03:51,693][SEVERE][db-snapshot-executor-stripe-0-#35][] Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=CRITICAL_ERROR, err=class o.a.i.i.processors.cache.persistence.StorageException: Failed to initialize partition file: /home/usr/tools/gridgain-ultimate-8.8.19/work/db/node00-c221fe71-5d29-4cd7-ab0f-9fa8240711b2/cache-name****/part-88.bin]
class org.apache.ignite.internal.processors.cache.persistence.StorageException: Failed to initialize partition file: /home/usr/tools/gridgain-ultimate-8.8.19/work/db/node00-c221fe71-5d29-4cd7-ab0f-9fa8240711b2/cache-name****/part-88.bin
    at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.init(FilePageStore.java:519)
    at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.read(FilePageStore.java:405)
    at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageReadWriteManagerImpl.read(PageReadWriteManagerImpl.java:68)
    at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.read(FilePageStoreManager.java:577)
    at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:911)
    at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:730)
    at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:711)
    at org.gridgain.grid.internal.processors.cache.database.snapshot.SnapshotCreateFuture.completeSavingAllocatedIndex(SnapshotCreateFuture.java:1304)
    at org.gridgain.grid.internal.processors.cache.database.snapshot.SnapshotCreateFuture.completeSnapshotCreation(SnapshotCreateFuture.java:1486)
    at org.gridgain.grid.internal.processors.cache.database.snapshot.SnapshotCreateFuture.doFinalStage(SnapshotCreateFuture.java:1171)
    at org.gridgain.grid.internal.processors.cache.database.snapshot.SnapshotOperationFuture.completeStagesLocally(SnapshotOperationFuture.java:2352)
    at org.gridgain.grid.internal.processors.cache.database.snapshot.SnapshotOperationFuture$10.run(SnapshotOperationFuture.java:2286)
    at org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:567)
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
    at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.nio.file.FileSystemException: /home/usr/tools/gridgain-ultimate-8.8.19/work/db/node00-c221fe71-5d29-4cd7-ab0f-9fa8240711b2/cache-name****/part-88.bin: Too many open files
    at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:100)
    at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
    at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116)
    at java.base/sun.nio.fs.UnixFileSystemProvider.newAsynchronousFileChannel(UnixFileSystemProvider.java:201)
    at java.base/java.nio.channels.AsynchronousFileChannel.open(AsynchronousFileChannel.java:253)
    at java.base/java.nio.channels.AsynchronousFileChannel.open(AsynchronousFileChannel.java:311)
    at org.apache.ignite.internal.processors.cache.persistence.file.AsyncFileIO.<init>(AsyncFileIO.java:65)
    at org.apache.ignite.internal.processors.cache.persistence.file.AsyncFileIOFactory.create(AsyncFileIOFactory.java:43)
    at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.init(FilePageStore.java:491)
    ... 14 more
[21:03:51,695][SEVERE][db-snapshot-executor-stripe-0-#35][FailureProcessor] No deadlocked threads detected.
[21:03:51,767][SEVERE][db-snapshot-executor-stripe-0-#35][FailureProcessor] Thread dump at 2022/06/06 21:03:51 IST
Thread [name="main", id=1, state=WAITING, blockCnt=4, waitCnt=4169]
    Lock [object=java.util.concurrent.CountDownLatch$Sync@5b60e356, ownerName=null, ownerId=-1]
        at java.base@11.0.14.1/jdk.internal.misc.Unsafe.park(Native Method)
        at java.base@11.0.14.1/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
        at java.base@11.0.14.1/java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:885)
        at java.base@11.0.14.1/java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1039)
        at java.base@11.0.14.1/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1345)
        at java.base@11.0.14.1/java.util.concurrent.CountDownLatch.await(CountDownLatch.java:232)
        at app//o.a.i.startup.cmdline.CommandLineStartup.main(CommandLineStartup.java:391)

Config:

    <bean class="org.apache.ignite.configuration.IgniteConfiguration">
        <property name="peerClassLoadingEnabled" value="true"/>
        <property name="deploymentMode" value="CONTINUOUS"/>
        <property name="dataStorageConfiguration">
            <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
                <property name="defaultDataRegionConfiguration">
                    <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
                        <property name="persistenceEnabled" value="true"/>
                    </bean>
                </property>
            </bean>
        </property>
        <property name="pluginConfigurations">
            <bean class="org.gridgain.grid.configuration.GridGainConfiguration">
                <property name="snapshotConfiguration">
                    <bean class="org.gridgain.grid.configuration.SnapshotConfiguration">
                        <property name="snapshotsPath" value="/home/ignitesnapshots/"/>
                    </bean>
                </property>
            </bean>
        </property>
    </bean>
</beans>
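
Since the root cause in the trace is "Too many open files", it may help to compare the file descriptor count and limit of a JVM started by the same user and shell as the node. The following is only a minimal sketch, not part of GridGain: it assumes a HotSpot/OpenJDK JVM on Unix, where the platform MXBean implements com.sun.management.UnixOperatingSystemMXBean. The class name FdUsage and the method fdUsage are hypothetical names chosen for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical helper (not a GridGain API): reports file descriptor usage of the JVM it runs in.
public class FdUsage {
    /** Returns {open, max} descriptor counts, or null if this JVM does not expose them. */
    static long[] fdUsage() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        // Assumption: Unix-like OS with a JDK that provides the com.sun.management extension.
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                (com.sun.management.UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null;
    }

    public static void main(String[] args) {
        long[] usage = fdUsage();
        if (usage == null)
            System.out.println("File descriptor counts are not available on this JVM");
        else
            System.out.printf("open=%d max=%d%n", usage[0], usage[1]);
    }
}
```

This reports values for the JVM it runs in, so the maximum only reflects the node's limit if it is launched under the same user and limits as the node. To see the node's actual open count, the same calls would have to run inside the node process.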