Hello there

Yes, it's not a bug.
The open files limit is still at its default (soft 1024, hard 4096) and
needs to be raised.
You could add a ulimit call to your start script.
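
Something along these lines near the top of the start script might do. It is
only a sketch: TARGET_NOFILE and raise_nofile_limit are names made up for
illustration, and 4096 is simply the hard limit from your output, not a
recommended value. Whether it is enough depends on how many caches and
partitions the node keeps open. Going above the hard limit needs root.

```shell
#!/bin/sh
# Hypothetical start-script fragment: raise the soft open-files limit
# before the node is launched. TARGET_NOFILE is an illustrative value.
TARGET_NOFILE="${TARGET_NOFILE:-4096}"

raise_nofile_limit() {
    target="$1"
    soft="$(ulimit -Sn)"
    hard="$(ulimit -Hn)"

    # Nothing to do if the soft limit is already high enough.
    if [ "$soft" = "unlimited" ] || [ "$soft" -ge "$target" ]; then
        return 0
    fi

    # An unprivileged process cannot raise its soft limit above the hard one.
    if [ "$hard" != "unlimited" ] && [ "$target" -gt "$hard" ]; then
        echo "requested $target exceeds hard limit $hard; using $hard" >&2
        target="$hard"
    fi

    ulimit -Sn "$target"
}

raise_nofile_limit "$TARGET_NOFILE"
echo "soft open-files limit is now $(ulimit -Sn)"

# Assumption: the node is then started from this same shell so that it
# inherits the limit; replace with your own launch command, e.g.:
# exec ./bin/ignite.sh path/to/your-config.xml
```

The limit only applies to processes started from that shell afterwards, so
the node has to be restarted through the script, and you can confirm the new
value with cat /proc/PID/limits again.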

Most likely it worked for you on Ignite but not on GridGain because you're
running Ignite on another machine (a testing VM?) where you have fewer caches
and hence fewer open files.
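
If you want to check that guess, you could compare the number of open file
descriptors with the limit on both machines. A rough sketch, Linux only since
it reads /proc; fd_usage is a made-up helper name, and you would pass the PID
of the node's JVM instead of $$ (run it as the same user or as root):

```shell
#!/bin/sh
# Print how many file descriptors a process has open, next to its soft
# "Max open files" limit as reported by /proc/PID/limits.
fd_usage() {
    pid="$1"
    # Each entry in /proc/PID/fd is one open descriptor.
    open="$(ls "/proc/$pid/fd" | wc -l)"
    # In the "Max open files" row the 4th field is the soft limit.
    limit="$(awk '/^Max open files/ {print $4}' "/proc/$pid/limits")"
    echo "pid=$pid open_fds=$open soft_limit=$limit"
}

# Demo on the current shell; use the node's PID in practice.
fd_usage "$$"
```

If open_fds on the GridGain node climbs towards soft_limit while a snapshot
runs, that would match the "Too many open files" error.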

Cheers
Gianluca

On Tue, 7 Jun 2022 at 10:35, Surinder Mehra <redni...@gmail.com> wrote:

> Hi,
> Thanks for your reply. The current limits are below, with the open files
> limit highlighted. As suggested in the previous reply, I will change the
> limits and try again.
>
> Limit                     Soft Limit           Hard Limit           Units
> Max cpu time              unlimited            unlimited            seconds
> Max file size             unlimited            unlimited            bytes
> Max data size             unlimited            unlimited            bytes
> Max stack size            8388608              unlimited            bytes
> Max core file size        unlimited            unlimited            bytes
> Max resident set          unlimited            unlimited            bytes
> Max processes             63306                63306                processes
> *Max open files           1024                 4096                 files*
> Max locked memory         65536                65536                bytes
> Max address space         unlimited            unlimited            bytes
> Max file locks            unlimited            unlimited            locks
> Max pending signals       63306                63306                signals
> Max msgqueue size         819200               819200               bytes
> Max nice priority         0                    0
> Max realtime priority     0                    0
> Max realtime timeout      unlimited            unlimited            us
>
> On Tue, Jun 7, 2022 at 1:38 PM Gianluca Bonetti <
> gianluca.bone...@gmail.com> wrote:
>
>> Hello
>>
>> What is returned by this command?
>>
>> # cat /proc/PID/limits
>>
>> Cheers
>> Gianluca
>>
>> On Tue, 7 Jun 2022 at 07:35, Surinder Mehra <redni...@gmail.com> wrote:
>>
>>> Hi,
>>> I was going through this post on Stack Overflow, which is about the same
>>> issue. The fact that snapshots work in Apache Ignite but not in the
>>> Ultimate Edition suggests there is a bug in the latter. Could you please
>>> confirm? We have around 15 caches with 2 backups. I changed backups to
>>> zero but still see this issue. Could you please advise further?
>>>
>>>
>>> https://stackoverflow.com/questions/72041292/is-there-a-fix-for-too-many-open-files-error-in-gridgain-cluster
>>>
>>> On Mon, Jun 6, 2022 at 9:13 PM Surinder Mehra <redni...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>> I was experimenting with taking snapshots in the GridGain Ultimate
>>>> Edition and hit the error below, after which the cluster stops. Please
>>>> note that this works in the free Ignite version, where we don't see the
>>>> "Too many open files" error. Is this a bug, or are we missing some
>>>> configuration?
>>>>
>>>> version:  gridgain-8.8.19
>>>>
>>>> ./bin/snapshot-utility.sh snapshot -type=full
>>>>
>>>> [21:03:51,693][SEVERE][db-snapshot-executor-stripe-0-#35][] Critical
>>>> system error detected. Will be handled accordingly to configured handler
>>>> [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
>>>> super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
>>>> [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
>>>> failureCtx=FailureContext [type=CRITICAL_ERROR, err=class
>>>> o.a.i.i.processors.cache.persistence.StorageException: Failed to initialize
>>>> partition file:
>>>> /home/usr/tools/gridgain-ultimate-8.8.19/work/db/node00-c221fe71-5d29-4cd7-ab0f-9fa8240711b2/cache-name****/part-88.bin]
>>>> class
>>>> org.apache.ignite.internal.processors.cache.persistence.StorageException:
>>>> Failed to initialize partition file:
>>>> /home/usr/tools/gridgain-ultimate-8.8.19/work/db/node00-c221fe71-5d29-4cd7-ab0f-9fa8240711b2/cache-name****/part-88.bin
>>>> at
>>>> org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.init(FilePageStore.java:519)
>>>> at
>>>> org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.read(FilePageStore.java:405)
>>>> at
>>>> org.apache.ignite.internal.processors.cache.persistence.pagemem.PageReadWriteManagerImpl.read(PageReadWriteManagerImpl.java:68)
>>>> at
>>>> org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.read(FilePageStoreManager.java:577)
>>>> at
>>>> org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:911)
>>>> at
>>>> org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:730)
>>>> at
>>>> org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:711)
>>>> at
>>>> org.gridgain.grid.internal.processors.cache.database.snapshot.SnapshotCreateFuture.completeSavingAllocatedIndex(SnapshotCreateFuture.java:1304)
>>>> at
>>>> org.gridgain.grid.internal.processors.cache.database.snapshot.SnapshotCreateFuture.completeSnapshotCreation(SnapshotCreateFuture.java:1486)
>>>> at
>>>> org.gridgain.grid.internal.processors.cache.database.snapshot.SnapshotCreateFuture.doFinalStage(SnapshotCreateFuture.java:1171)
>>>> at
>>>> org.gridgain.grid.internal.processors.cache.database.snapshot.SnapshotOperationFuture.completeStagesLocally(SnapshotOperationFuture.java:2352)
>>>> at
>>>> org.gridgain.grid.internal.processors.cache.database.snapshot.SnapshotOperationFuture$10.run(SnapshotOperationFuture.java:2286)
>>>> at
>>>> org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:567)
>>>> at
>>>> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
>>>> at java.base/java.lang.Thread.run(Thread.java:829)
>>>> Caused by: java.nio.file.FileSystemException:
>>>> /home/usr/tools/gridgain-ultimate-8.8.19/work/db/node00-c221fe71-5d29-4cd7-ab0f-9fa8240711b2/cache-name****/part-88.bin:
>>>> Too many open files
>>>> at
>>>> java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:100)
>>>> at
>>>> java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
>>>> at
>>>> java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116)
>>>> at
>>>> java.base/sun.nio.fs.UnixFileSystemProvider.newAsynchronousFileChannel(UnixFileSystemProvider.java:201)
>>>> at
>>>> java.base/java.nio.channels.AsynchronousFileChannel.open(AsynchronousFileChannel.java:253)
>>>> at
>>>> java.base/java.nio.channels.AsynchronousFileChannel.open(AsynchronousFileChannel.java:311)
>>>> at
>>>> org.apache.ignite.internal.processors.cache.persistence.file.AsyncFileIO.<init>(AsyncFileIO.java:65)
>>>> at
>>>> org.apache.ignite.internal.processors.cache.persistence.file.AsyncFileIOFactory.create(AsyncFileIOFactory.java:43)
>>>> at
>>>> org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.init(FilePageStore.java:491)
>>>> ... 14 more
>>>> [21:03:51,695][SEVERE][db-snapshot-executor-stripe-0-#35][FailureProcessor]
>>>> No deadlocked threads detected.
>>>> [21:03:51,767][SEVERE][db-snapshot-executor-stripe-0-#35][FailureProcessor]
>>>> Thread dump at 2022/06/06 21:03:51 IST
>>>> Thread [name="main", id=1, state=WAITING, blockCnt=4, waitCnt=4169]
>>>>     Lock [object=java.util.concurrent.CountDownLatch$Sync@5b60e356,
>>>> ownerName=null, ownerId=-1]
>>>>         at java.base@11.0.14.1/jdk.internal.misc.Unsafe.park(Native
>>>> Method)
>>>>         at
>>>> java.base@11.0.14.1/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
>>>>         at
>>>> java.base@11.0.14.1/java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:885)
>>>>         at
>>>> java.base@11.0.14.1/java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1039)
>>>>         at
>>>> java.base@11.0.14.1/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1345)
>>>>         at
>>>> java.base@11.0.14.1/java.util.concurrent.CountDownLatch.await(CountDownLatch.java:232)
>>>>         at
>>>> app//o.a.i.startup.cmdline.CommandLineStartup.main(CommandLineStartup.java:391)
>>>>
>>>>
>>>> Config:
>>>>
>>>>   <bean class="org.apache.ignite.configuration.IgniteConfiguration">
>>>>
>>>>     <property name="peerClassLoadingEnabled" value="true"/>
>>>>     <property name="deploymentMode" value="CONTINUOUS"/>
>>>>     <property name="dataStorageConfiguration">
>>>>       <bean
>>>> class="org.apache.ignite.configuration.DataStorageConfiguration">
>>>>         <property name="defaultDataRegionConfiguration">
>>>>           <bean
>>>> class="org.apache.ignite.configuration.DataRegionConfiguration">
>>>>             <property name="persistenceEnabled" value="true"/>
>>>>           </bean>
>>>>         </property>
>>>>       </bean>
>>>>     </property>
>>>>     <property name="pluginConfigurations">
>>>>       <bean
>>>> class="org.gridgain.grid.configuration.GridGainConfiguration">
>>>>         <property name="snapshotConfiguration">
>>>>           <bean
>>>> class="org.gridgain.grid.configuration.SnapshotConfiguration">
>>>>             <property name="snapshotsPath"
>>>> value="/home/ignitesnapshots/"/>
>>>>           </bean>
>>>>         </property>
>>>>       </bean>
>>>>     </property>
>>>>   </bean>
>>>> </beans>
>>>>
>>>
