Re: gridgain ultimate edition snapshot error

Surinder Mehra Tue, 07 Jun 2022 01:35:50 -0700

Hi,
Thanks for your reply. The current limits are listed below, with the
open-files limit highlighted. As suggested in the previous reply, I will
raise the limits and try again.


Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        unlimited            unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             63306                63306                processes
*Max open files           1024                 4096                 files*
Max locked memory         65536                65536                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       63306                63306                signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us
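
To compare the open-files limit in the table above with what the node process actually holds, something like the following might help. It is only a rough sketch, assuming a Linux host with /proc mounted; the function names (parse_open_files_limit, open_fd_count, report) are made up for illustration and are not part of GridGain or Ignite.

```python
import os
import re


def parse_open_files_limit(limits_text):
    """Return (soft, hard) from the 'Max open files' row of /proc/<pid>/limits.

    'unlimited' is returned as None. A leading '*' (mail highlighting) is tolerated.
    """
    for line in limits_text.splitlines():
        m = re.match(r"\s*\*?\s*Max open files\s+(\S+)\s+(\S+)", line)
        if m:
            return tuple(None if v == "unlimited" else int(v) for v in m.groups())
    raise ValueError("no 'Max open files' row found")


def open_fd_count(pid):
    """Count file descriptors currently held by the process (Linux only)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def report(pid):
    """Print current descriptor usage next to the soft and hard limits."""
    with open(f"/proc/{pid}/limits") as f:
        soft, hard = parse_open_files_limit(f.read())
    print(f"pid={pid} open_fds={open_fd_count(pid)} soft={soft} hard={hard}")
```

Calling report(<PID of the node JVM>) before and during a snapshot should show whether the descriptor count approaches the soft limit of 1024. Reading another user's /proc/<pid>/fd may require running as that user or as root.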

On Tue, Jun 7, 2022 at 1:38 PM Gianluca Bonetti <[email protected]>
wrote:

> Hello
>
> What is returned by this command?
>
> # cat /proc/PID/limits
>
> Cheers
> Gianluca
>
> On Tue, 7 Jun 2022 at 07:35, Surinder Mehra <[email protected]> wrote:
>
>> Hi,
>> I was going through this Stack Overflow post, which is about the same
>> issue. The fact that snapshots work in Apache Ignite but not in the
>> Ultimate Edition suggests there is a bug in the latter. Could you please
>> confirm? We have around 15 caches with 2 backups. I changed backups to
>> zero but still see this issue. Could you please advise further?
>>
>>
>> https://stackoverflow.com/questions/72041292/is-there-a-fix-for-too-many-open-files-error-in-gridgain-cluster
>>
>> On Mon, Jun 6, 2022 at 9:13 PM Surinder Mehra <[email protected]> wrote:
>>
>>> Hi,
>>> I was experimenting with the GG Ultimate Edition to take snapshots and
>>> encountered the error below, after which the cluster stops. Please note
>>> that this works in the free Ignite version, where we don't see the
>>> "Too many open files" error. Is this a bug, or are we missing some
>>> configuration?
>>>
>>> version:  gridgain-8.8.19
>>>
>>> ./bin/snapshot-utility.sh snapshot -type=full
>>>
>>> [21:03:51,693][SEVERE][db-snapshot-executor-stripe-0-#35][] Critical
>>> system error detected. Will be handled accordingly to configured handler
>>> [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
>>> super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
>>> [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
>>> failureCtx=FailureContext [type=CRITICAL_ERROR, err=class
>>> o.a.i.i.processors.cache.persistence.StorageException: Failed to initialize
>>> partition file:
>>> /home/usr/tools/gridgain-ultimate-8.8.19/work/db/node00-c221fe71-5d29-4cd7-ab0f-9fa8240711b2/cache-name****/part-88.bin]
>>> class
>>> org.apache.ignite.internal.processors.cache.persistence.StorageException:
>>> Failed to initialize partition file:
>>> /home/usr/tools/gridgain-ultimate-8.8.19/work/db/node00-c221fe71-5d29-4cd7-ab0f-9fa8240711b2/cache-name****/part-88.bin
>>> at
>>> org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.init(FilePageStore.java:519)
>>> at
>>> org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.read(FilePageStore.java:405)
>>> at
>>> org.apache.ignite.internal.processors.cache.persistence.pagemem.PageReadWriteManagerImpl.read(PageReadWriteManagerImpl.java:68)
>>> at
>>> org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.read(FilePageStoreManager.java:577)
>>> at
>>> org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:911)
>>> at
>>> org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:730)
>>> at
>>> org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:711)
>>> at
>>> org.gridgain.grid.internal.processors.cache.database.snapshot.SnapshotCreateFuture.completeSavingAllocatedIndex(SnapshotCreateFuture.java:1304)
>>> at
>>> org.gridgain.grid.internal.processors.cache.database.snapshot.SnapshotCreateFuture.completeSnapshotCreation(SnapshotCreateFuture.java:1486)
>>> at
>>> org.gridgain.grid.internal.processors.cache.database.snapshot.SnapshotCreateFuture.doFinalStage(SnapshotCreateFuture.java:1171)
>>> at
>>> org.gridgain.grid.internal.processors.cache.database.snapshot.SnapshotOperationFuture.completeStagesLocally(SnapshotOperationFuture.java:2352)
>>> at
>>> org.gridgain.grid.internal.processors.cache.database.snapshot.SnapshotOperationFuture$10.run(SnapshotOperationFuture.java:2286)
>>> at
>>> org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:567)
>>> at
>>> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
>>> at java.base/java.lang.Thread.run(Thread.java:829)
>>> Caused by: java.nio.file.FileSystemException:
>>> /home/usr/tools/gridgain-ultimate-8.8.19/work/db/node00-c221fe71-5d29-4cd7-ab0f-9fa8240711b2/cache-name****/part-88.bin:
>>> Too many open files
>>> at
>>> java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:100)
>>> at
>>> java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
>>> at
>>> java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116)
>>> at
>>> java.base/sun.nio.fs.UnixFileSystemProvider.newAsynchronousFileChannel(UnixFileSystemProvider.java:201)
>>> at
>>> java.base/java.nio.channels.AsynchronousFileChannel.open(AsynchronousFileChannel.java:253)
>>> at
>>> java.base/java.nio.channels.AsynchronousFileChannel.open(AsynchronousFileChannel.java:311)
>>> at
>>> org.apache.ignite.internal.processors.cache.persistence.file.AsyncFileIO.<init>(AsyncFileIO.java:65)
>>> at
>>> org.apache.ignite.internal.processors.cache.persistence.file.AsyncFileIOFactory.create(AsyncFileIOFactory.java:43)
>>> at
>>> org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.init(FilePageStore.java:491)
>>> ... 14 more
>>> [21:03:51,695][SEVERE][db-snapshot-executor-stripe-0-#35][FailureProcessor]
>>> No deadlocked threads detected.
>>> [21:03:51,767][SEVERE][db-snapshot-executor-stripe-0-#35][FailureProcessor]
>>> Thread dump at 2022/06/06 21:03:51 IST
>>> Thread [name="main", id=1, state=WAITING, blockCnt=4, waitCnt=4169]
>>>     Lock [object=java.util.concurrent.CountDownLatch$Sync@5b60e356,
>>> ownerName=null, ownerId=-1]
>>>         at [email protected]/jdk.internal.misc.Unsafe.park(Native
>>> Method)
>>>         at
>>> [email protected]/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
>>>         at
>>> [email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:885)
>>>         at
>>> [email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1039)
>>>         at
>>> [email protected]/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1345)
>>>         at
>>> [email protected]/java.util.concurrent.CountDownLatch.await(CountDownLatch.java:232)
>>>         at
>>> app//o.a.i.startup.cmdline.CommandLineStartup.main(CommandLineStartup.java:391)
>>>
>>>
>>> Config:
>>>
>>>   <bean class="org.apache.ignite.configuration.IgniteConfiguration">
>>>
>>>     <property name="peerClassLoadingEnabled" value="true"/>
>>>     <property name="deploymentMode" value="CONTINUOUS"/>
>>>     <property name="dataStorageConfiguration">
>>>       <bean
>>> class="org.apache.ignite.configuration.DataStorageConfiguration">
>>>         <property name="defaultDataRegionConfiguration">
>>>           <bean
>>> class="org.apache.ignite.configuration.DataRegionConfiguration">
>>>             <property name="persistenceEnabled" value="true"/>
>>>           </bean>
>>>         </property>
>>>       </bean>
>>>     </property>
>>>     <property name="pluginConfigurations">
>>>       <bean
>>> class="org.gridgain.grid.configuration.GridGainConfiguration">
>>>         <property name="snapshotConfiguration">
>>>           <bean
>>> class="org.gridgain.grid.configuration.SnapshotConfiguration">
>>>             <property name="snapshotsPath"
>>> value="/home/ignitesnapshots/"/>
>>>           </bean>
>>>         </property>
>>>       </bean>
>>>     </property>
>>>   </bean>
>>> </beans>
>>>
>>
