Hi,

Thanks for your reply. The current limits are listed below; the open-files row is marked with asterisks. As suggested in your previous reply, I will change the limits and try again.
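To compare the open-files limit against what the node actually holds, something like the following could be used. This is only a rough sketch, assuming a Linux host and read access to `/proc/PID`; the function names (`parse_open_files_limit`, `count_open_fds`, `headroom`, `report`) are made up for illustration and are not part of Ignite or GridGain.

```python
import os
from typing import Optional, Tuple

ROW_NAME = "Max open files"


def parse_open_files_limit(limits_text: str) -> Optional[Tuple[str, str]]:
    """Return (soft, hard) from the 'Max open files' row, or None if the row is missing."""
    for line in limits_text.splitlines():
        # Tolerate leading asterisks or spaces, e.g. a row highlighted in an email.
        cleaned = line.strip().lstrip("*").strip()
        if cleaned.startswith(ROW_NAME):
            fields = cleaned[len(ROW_NAME):].split()
            if len(fields) >= 2:
                return fields[0], fields[1]
    return None


def count_open_fds(pid: int) -> int:
    """Count descriptors currently held by a process (Linux only, needs read access)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def headroom(soft_limit: str, open_fds: int) -> Optional[int]:
    """Descriptors left before the soft limit is hit; None when the limit is 'unlimited'."""
    if soft_limit == "unlimited":
        return None
    return int(soft_limit) - open_fds


def report(pid: int) -> str:
    """Build a one-line summary for the given PID (hypothetical helper)."""
    with open(f"/proc/{pid}/limits") as fh:
        limits = parse_open_files_limit(fh.read())
    if limits is None:
        return f"pid {pid}: no '{ROW_NAME}' row found"
    soft, hard = limits
    used = count_open_fds(pid)
    return f"pid {pid}: {used} open, soft={soft}, hard={hard}, headroom={headroom(soft, used)}"
```

Calling `report(pid)` with the node's PID before and during a snapshot should show how close the process gets to the soft limit.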
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        unlimited            unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             63306                63306                processes
*Max open files           1024                 4096                 files*
Max locked memory         65536                65536                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       63306                63306                signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us

On Tue, Jun 7, 2022 at 1:38 PM Gianluca Bonetti <gianluca.bone...@gmail.com> wrote:

> Hello
>
> What is returned by this command?
>
> # cat /proc/PID/limits
>
> Cheers
> Gianluca
>
> On Tue, 7 Jun 2022 at 07:35, Surinder Mehra <redni...@gmail.com> wrote:
>
>> Hi,
>> I was going through this post on stackoverflow which is about the same
>> issue. The fact that the snapshot works for Apache Ignite but not in the
>> Ultimate Edition indicates there is some bug in the latter. Could you
>> please confirm? We have around 15 caches with 2 backups. I changed backups
>> to zero but still see this issue. Could you please advise further?
>>
>> https://stackoverflow.com/questions/72041292/is-there-a-fix-for-too-many-open-files-error-in-gridgain-cluster
>>
>> On Mon, Jun 6, 2022 at 9:13 PM Surinder Mehra <redni...@gmail.com> wrote:
>>
>>> Hi,
>>> I was experimenting with the GG Ultimate Edition to take snapshots and
>>> encountered the error below, after which the cluster stops. Please note
>>> that this works in the free Ignite version, where we don't see the "too
>>> many open files" error. Is this a bug, or are we missing some configuration?
>>>
>>> version: gridgain-8.8.19
>>>
>>> ./bin/snapshot-utility.sh snapshot -type=full
>>>
>>> [21:03:51,693][SEVERE][db-snapshot-executor-stripe-0-#35][] Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=CRITICAL_ERROR, err=class o.a.i.i.processors.cache.persistence.StorageException: Failed to initialize partition file: /home/usr/tools/gridgain-ultimate-8.8.19/work/db/node00-c221fe71-5d29-4cd7-ab0f-9fa8240711b2/cache-name****/part-88.bin]
>>> class org.apache.ignite.internal.processors.cache.persistence.StorageException: Failed to initialize partition file: /home/usr/tools/gridgain-ultimate-8.8.19/work/db/node00-c221fe71-5d29-4cd7-ab0f-9fa8240711b2/cache-name****/part-88.bin
>>>     at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.init(FilePageStore.java:519)
>>>     at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.read(FilePageStore.java:405)
>>>     at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageReadWriteManagerImpl.read(PageReadWriteManagerImpl.java:68)
>>>     at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.read(FilePageStoreManager.java:577)
>>>     at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:911)
>>>     at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:730)
>>>     at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:711)
>>>     at org.gridgain.grid.internal.processors.cache.database.snapshot.SnapshotCreateFuture.completeSavingAllocatedIndex(SnapshotCreateFuture.java:1304)
>>>     at org.gridgain.grid.internal.processors.cache.database.snapshot.SnapshotCreateFuture.completeSnapshotCreation(SnapshotCreateFuture.java:1486)
>>>     at org.gridgain.grid.internal.processors.cache.database.snapshot.SnapshotCreateFuture.doFinalStage(SnapshotCreateFuture.java:1171)
>>>     at org.gridgain.grid.internal.processors.cache.database.snapshot.SnapshotOperationFuture.completeStagesLocally(SnapshotOperationFuture.java:2352)
>>>     at org.gridgain.grid.internal.processors.cache.database.snapshot.SnapshotOperationFuture$10.run(SnapshotOperationFuture.java:2286)
>>>     at org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:567)
>>>     at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
>>>     at java.base/java.lang.Thread.run(Thread.java:829)
>>> Caused by: java.nio.file.FileSystemException: /home/usr/tools/gridgain-ultimate-8.8.19/work/db/node00-c221fe71-5d29-4cd7-ab0f-9fa8240711b2/cache-name****/part-88.bin: Too many open files
>>>     at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:100)
>>>     at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
>>>     at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116)
>>>     at java.base/sun.nio.fs.UnixFileSystemProvider.newAsynchronousFileChannel(UnixFileSystemProvider.java:201)
>>>     at java.base/java.nio.channels.AsynchronousFileChannel.open(AsynchronousFileChannel.java:253)
>>>     at java.base/java.nio.channels.AsynchronousFileChannel.open(AsynchronousFileChannel.java:311)
>>>     at org.apache.ignite.internal.processors.cache.persistence.file.AsyncFileIO.<init>(AsyncFileIO.java:65)
>>>     at org.apache.ignite.internal.processors.cache.persistence.file.AsyncFileIOFactory.create(AsyncFileIOFactory.java:43)
>>>     at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.init(FilePageStore.java:491)
>>>     ... 14 more
>>> [21:03:51,695][SEVERE][db-snapshot-executor-stripe-0-#35][FailureProcessor] No deadlocked threads detected.
>>> [21:03:51,767][SEVERE][db-snapshot-executor-stripe-0-#35][FailureProcessor] Thread dump at 2022/06/06 21:03:51 IST
>>> Thread [name="main", id=1, state=WAITING, blockCnt=4, waitCnt=4169]
>>>     Lock [object=java.util.concurrent.CountDownLatch$Sync@5b60e356, ownerName=null, ownerId=-1]
>>>         at java.base@11.0.14.1/jdk.internal.misc.Unsafe.park(Native Method)
>>>         at java.base@11.0.14.1/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
>>>         at java.base@11.0.14.1/java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:885)
>>>         at java.base@11.0.14.1/java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1039)
>>>         at java.base@11.0.14.1/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1345)
>>>         at java.base@11.0.14.1/java.util.concurrent.CountDownLatch.await(CountDownLatch.java:232)
>>>         at app//o.a.i.startup.cmdline.CommandLineStartup.main(CommandLineStartup.java:391)
>>>
>>> Config:
>>>
>>> <bean class="org.apache.ignite.configuration.IgniteConfiguration">
>>>     <property name="peerClassLoadingEnabled" value="true"/>
>>>     <property name="deploymentMode" value="CONTINUOUS"/>
>>>     <property name="dataStorageConfiguration">
>>>         <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
>>>             <property name="defaultDataRegionConfiguration">
>>>                 <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
>>>                     <property name="persistenceEnabled" value="true"/>
>>>                 </bean>
>>>             </property>
>>>         </bean>
>>>     </property>
>>>     <property name="pluginConfigurations">
>>>         <bean class="org.gridgain.grid.configuration.GridGainConfiguration">
>>>             <property name="snapshotConfiguration">
>>>                 <bean class="org.gridgain.grid.configuration.SnapshotConfiguration">
>>>                     <property name="snapshotsPath" value="/home/ignitesnapshots/"/>
>>>                 </bean>
>>>             </property>
>>>         </bean>
>>>     </property>
>>> </bean>
>>> </beans>