[
https://issues.apache.org/jira/browse/IGNITE-27409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18046816#comment-18046816
]
Roman Puchkovskiy edited comment on IGNITE-27409 at 12/20/25 1:19 PM:
----------------------------------------------------------------------
There are 2 bugs.
# There is a race in the write intent (WI) replacement code. The existing WI is
read before the WI list lock is taken, so its WI links may be stale. We must
take the lock first, then re-read the WI links from the existing WI and use
them to update the links in the WI list.
# When a WI is replaced, the WI list head is not updated, but it must be if
the replaced WI is the WI list head.
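
The two fixes can be illustrated with a minimal in-memory sketch. This is not the actual aipersist code: the class WiListSketch, its Wi node type, and the use of a ReentrantLock as the "WI list lock" are assumptions made for illustration only. The real storage keeps WI links in data pages, which this model does not attempt to reproduce.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical in-memory model of a WI list; not the real storage code. */
final class WiListSketch {
    /** A write intent with links to its neighbours in the WI list. */
    static final class Wi {
        final String name;
        Wi prev; // guarded by the list lock
        Wi next; // guarded by the list lock

        Wi(String name) {
            this.name = name;
        }
    }

    private final ReentrantLock lock = new ReentrantLock();
    private Wi head; // guarded by the list lock

    /** Adds a WI at the head of the list. */
    void addFirst(Wi wi) {
        lock.lock();
        try {
            wi.prev = null;
            wi.next = head;
            if (head != null) {
                head.prev = wi;
            }
            head = wi;
        } finally {
            lock.unlock();
        }
    }

    /** Replaces {@code existing} with {@code replacement} in the list. */
    void replace(Wi existing, Wi replacement) {
        lock.lock();
        try {
            // Bug 1: links are (re-)read only after the lock is taken,
            // so a concurrent update cannot make them stale.
            Wi prev = existing.prev;
            Wi next = existing.next;

            replacement.prev = prev;
            replacement.next = next;
            if (prev != null) {
                prev.next = replacement;
            }
            if (next != null) {
                next.prev = replacement;
            }

            // Bug 2: if the replaced WI was the head, move the head too.
            if (head == existing) {
                head = replacement;
            }

            existing.prev = null;
            existing.next = null;
        } finally {
            lock.unlock();
        }
    }

    /** Returns WI names from head to tail, for inspection. */
    List<String> names() {
        lock.lock();
        try {
            List<String> result = new ArrayList<>();
            for (Wi wi = head; wi != null; wi = wi.next) {
                result.add(wi.name);
            }
            return result;
        } finally {
            lock.unlock();
        }
    }
}
```

In this sketch, adding "a" and then "b" with addFirst makes "b" the head. Replacing "b" with a new WI "c" leaves the list as c, a, with the head pointing at "c" and the prev link of "a" pointing at "c".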
was (Author: rpuch):
There are 2 bugs.
# There is a race in write intent replacement code. Existing write intent is
read before taking the WI list lock, so its WI links may be stale. We must take
the lock and then re-read the WI links from the existing WI and use them to
update WI links of the WI list
# When replacing a WI, WI list head is updated incorrectly (not updated, but
it must be) if the replaced WI is pointed to by the WI list head
> Concurrent write intent replacement is broken in aipersist
> ----------------------------------------------------------
>
> Key: IGNITE-27409
> URL: https://issues.apache.org/jira/browse/IGNITE-27409
> Project: Ignite
> Issue Type: Bug
> Reporter: Roman Puchkovskiy
> Assignee: Roman Puchkovskiy
> Priority: Blocker
> Labels: ignite-3
> Time Spent: 10m
> Remaining Estimate: 0h
>
> The following happened:
> {noformat}
> [08:52:42][ERROR][Thread-9] Error inserting batch of keys.
> org.apache.ignite.lang.IgniteException: Error while updating WI prev link of
> next WI: [link=281595236276459, rowId=RowId [partitionId=28,
> uuid=0000019b-2b82-f45c-6c88-bd98aae6c4f2], rowIsTombstone=false,
> txId=019b2b82-e492-0000-ac01-4e9600000001, commitZoneId=20,
> commitPartitionId=0, tableId=21, partitionId=28]
> at java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:733)
> ~[?:?]
> at
> org.apache.ignite.internal.util.ExceptionUtils$1.copy(ExceptionUtils.java:952)
> ~[ignite-core-3.2.0-SNAPSHOT.jar:?]
> at
> org.apache.ignite.internal.util.ExceptionUtils$ExceptionFactory.createCopy(ExceptionUtils.java:886)
> ~[ignite-core-3.2.0-SNAPSHOT.jar:?]
> at
> org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCause(ExceptionUtils.java:688)
> ~[ignite-core-3.2.0-SNAPSHOT.jar:?]
> at
> org.apache.ignite.internal.util.ViewUtils.copyExceptionWithCauseIfPossible(ViewUtils.java:91)
> ~[ignite-core-3.2.0-SNAPSHOT.jar:?]
> at
> org.apache.ignite.internal.util.ViewUtils.ensurePublicException(ViewUtils.java:71)
> ~[ignite-core-3.2.0-SNAPSHOT.jar:?]
> at org.apache.ignite.internal.util.ViewUtils.sync(ViewUtils.java:54)
> ~[ignite-core-3.2.0-SNAPSHOT.jar:?]
> at
> org.apache.ignite.internal.client.table.ClientKeyValueBinaryView.putAll(ClientKeyValueBinaryView.java:252)
> ~[ignite-client-3.2.0-SNAPSHOT.jar:?]
> at
> org.apache.ignite.internal.client.table.api.PublicApiClientKeyValueView.lambda$putAll$14(PublicApiClientKeyValueView.java:123)
> ~[ignite-client-3.2.0-SNAPSHOT.jar:?]
> at
> org.apache.ignite.internal.thread.PublicApiThreading.lambda$execUserSyncOperation$1(PublicApiThreading.java:116)
> ~[ignite-core-3.2.0-SNAPSHOT.jar:?]
> at
> org.apache.ignite.internal.thread.PublicApiThreading.executeWithRole(PublicApiThreading.java:144)
> ~[ignite-core-3.2.0-SNAPSHOT.jar:?]
> at
> org.apache.ignite.internal.thread.PublicApiThreading.execUserSyncOperation(PublicApiThreading.java:102)
> ~[ignite-core-3.2.0-SNAPSHOT.jar:?]
> at
> org.apache.ignite.internal.thread.PublicApiThreading.execUserSyncOperation(PublicApiThreading.java:115)
> ~[ignite-core-3.2.0-SNAPSHOT.jar:?]
> at
> org.apache.ignite.internal.client.table.api.PublicApiClientViewBase.executeSyncOp(PublicApiClientViewBase.java:105)
> ~[ignite-client-3.2.0-SNAPSHOT.jar:?]
> at
> org.apache.ignite.internal.client.table.api.PublicApiClientKeyValueView.putAll(PublicApiClientKeyValueView.java:123)
> ~[ignite-client-3.2.0-SNAPSHOT.jar:?]
> at site.ycsb.db.ignite3.IgniteClient.batchInsert(IgniteClient.java:70)
> [ignite3-binding-2025.13.1.jar:?]
> at site.ycsb.DBWrapper.batchInsert(DBWrapper.java:308) [core-2025.13.1.jar:?]
> at site.ycsb.workloads.CoreWorkload.doInsert(CoreWorkload.java:693)
> [core-2025.13.1.jar:?]
> at site.ycsb.ClientThread.run(ClientThread.java:187) [core-2025.13.1.jar:?]
> at java.lang.Thread.run(Thread.java:1583) [?:?]
> Caused by: org.apache.ignite.lang.IgniteException: Error while updating WI
> prev link of next WI: [link=281595236276459, rowId=RowId [partitionId=28,
> uuid=0000019b-2b82-f45c-6c88-bd98aae6c4f2], rowIsTombstone=false,
> txId=019b2b82-e492-0000-ac01-4e9600000001, commitZoneId=20,
> commitPartitionId=0, tableId=21, partitionId=28]
> at java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:733)
> ~[?:?]
> at
> org.apache.ignite.internal.util.ExceptionUtils$1.copy(ExceptionUtils.java:952)
> ~[ignite-core-3.2.0-SNAPSHOT.jar:?]
> at
> org.apache.ignite.internal.util.ExceptionUtils$ExceptionFactory.createCopy(ExceptionUtils.java:886)
> ~[ignite-core-3.2.0-SNAPSHOT.jar:?]
> at
> org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCause(ExceptionUtils.java:688)
> ~[ignite-core-3.2.0-SNAPSHOT.jar:?]
> at
> org.apache.ignite.internal.util.ViewUtils.copyExceptionWithCauseIfPossible(ViewUtils.java:91)
> ~[ignite-core-3.2.0-SNAPSHOT.jar:?]
> at
> org.apache.ignite.internal.util.ViewUtils.ensurePublicException(ViewUtils.java:71)
> ~[ignite-core-3.2.0-SNAPSHOT.jar:?]
> at
> org.apache.ignite.internal.client.TcpClientChannel.lambda$completeAsync$5(TcpClientChannel.java:473)
> ~[ignite-client-3.2.0-SNAPSHOT.jar:?]
> at
> java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1423)
> ~[?:?]
> at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:387) ~[?:?]
> at
> java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1312)
> ~[?:?]
> at java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1843) ~[?:?]
> at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1808) ~[?:?]
> at
> java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)
> ~[?:?]
> Caused by: org.apache.ignite.lang.IgniteException: Error while updating WI
> prev link of next WI: [link=281595236276459, rowId=RowId [partitionId=28,
> uuid=0000019b-2b82-f45c-6c88-bd98aae6c4f2], rowIsTombstone=false,
> txId=019b2b82-e492-0000-ac01-4e9600000001, commitZoneId=20,
> commitPartitionId=0, tableId=21, partitionId=28]
> at java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:733)
> ~[?:?]
> at
> org.apache.ignite.internal.util.ExceptionUtils$1.copy(ExceptionUtils.java:952)
> ~[ignite-core-3.2.0-SNAPSHOT.jar:?]
> at
> org.apache.ignite.internal.util.ExceptionUtils$ExceptionFactory.createCopy(ExceptionUtils.java:886)
> ~[ignite-core-3.2.0-SNAPSHOT.jar:?]
> at
> org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCause(ExceptionUtils.java:688)
> ~[ignite-core-3.2.0-SNAPSHOT.jar:?]
> at
> org.apache.ignite.internal.client.TcpClientChannel.readError(TcpClientChannel.java:669)
> ~[ignite-client-3.2.0-SNAPSHOT.jar:?]
> at
> org.apache.ignite.internal.client.TcpClientChannel.processNextMessage(TcpClientChannel.java:542)
> ~[ignite-client-3.2.0-SNAPSHOT.jar:?]
> at
> org.apache.ignite.internal.client.TcpClientChannel.onMessage(TcpClientChannel.java:311)
> ~[ignite-client-3.2.0-SNAPSHOT.jar:?]
> at
> org.apache.ignite.internal.client.io.netty.NettyClientConnection.onMessage(NettyClientConnection.java:117)
> ~[ignite-client-3.2.0-SNAPSHOT.jar:?]
> at
> org.apache.ignite.internal.client.io.netty.NettyClientMessageHandler.channelRead(NettyClientMessageHandler.java:33)
> ~[ignite-client-3.2.0-SNAPSHOT.jar:?]
> at
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:356)
> ~[netty-transport-4.2.7.Final.jar:4.2.7.Final]
> at
> io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:346)
> ~[netty-codec-base-4.2.7.Final.jar:4.2.7.Final]
> at
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:318)
> ~[netty-codec-base-4.2.7.Final.jar:4.2.7.Final]
> at
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:356)
> ~[netty-transport-4.2.7.Final.jar:4.2.7.Final]
> at
> io.netty.handler.flush.FlushConsolidationHandler.channelRead(FlushConsolidationHandler.java:152)
> ~[netty-handler-4.2.7.Final.jar:4.2.7.Final]
> at
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:354)
> ~[netty-transport-4.2.7.Final.jar:4.2.7.Final]
> at
> io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1429)
> ~[netty-transport-4.2.7.Final.jar:4.2.7.Final]
> at
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:918)
> ~[netty-transport-4.2.7.Final.jar:4.2.7.Final]
> at
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:168)
> ~[netty-transport-4.2.7.Final.jar:4.2.7.Final]
> at
> io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.handle(AbstractNioChannel.java:445)
> ~[netty-transport-4.2.7.Final.jar:4.2.7.Final]
> at
> io.netty.channel.nio.NioIoHandler$DefaultNioRegistration.handle(NioIoHandler.java:388)
> ~[netty-transport-4.2.7.Final.jar:4.2.7.Final]
> at
> io.netty.channel.nio.NioIoHandler.processSelectedKey(NioIoHandler.java:596)
> ~[netty-transport-4.2.7.Final.jar:4.2.7.Final]
> at
> io.netty.channel.nio.NioIoHandler.processSelectedKeysOptimized(NioIoHandler.java:571)
> ~[netty-transport-4.2.7.Final.jar:4.2.7.Final]
> at
> io.netty.channel.nio.NioIoHandler.processSelectedKeys(NioIoHandler.java:512)
> ~[netty-transport-4.2.7.Final.jar:4.2.7.Final]
> at io.netty.channel.nio.NioIoHandler.run(NioIoHandler.java:484)
> ~[netty-transport-4.2.7.Final.jar:4.2.7.Final]
> at
> io.netty.channel.SingleThreadIoEventLoop.runIo(SingleThreadIoEventLoop.java:225)
> ~[netty-transport-4.2.7.Final.jar:4.2.7.Final]
> at
> io.netty.channel.SingleThreadIoEventLoop.run(SingleThreadIoEventLoop.java:196)
> ~[netty-transport-4.2.7.Final.jar:4.2.7.Final]
> at
> io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:1193)
> ~[netty-common-4.2.7.Final.jar:4.2.7.Final]
> at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
> ~[netty-common-4.2.7.Final.jar:4.2.7.Final]
> at
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> ~[netty-common-4.2.7.Final.jar:4.2.7.Final]
> ... 1 more
> Caused by: org.apache.ignite.lang.IgniteException:
> org.apache.ignite.lang.IgniteException: IGN-CMN-65535 Error while updating WI
> prev link of next WI: [link=281595236276459, rowId=RowId [partitionId=28,
> uuid=0000019b-2b82-f45c-6c88-bd98aae6c4f2], rowIsTombstone=false,
> txId=019b2b82-e492-0000-ac01-4e9600000001, commitZoneId=20,
> commitPartitionId=0, tableId=21, partitionId=28] TraceId:8936c0c7
> at
> org.apache.ignite.internal.lang.IgniteExceptionMapperUtil.lambda$mapToPublicException$2(IgniteExceptionMapperUtil.java:88)
> at
> org.apache.ignite.internal.lang.IgniteExceptionMapperUtil.mapCheckingResultIsPublic(IgniteExceptionMapperUtil.java:141)
> at
> org.apache.ignite.internal.lang.IgniteExceptionMapperUtil.mapToPublicException(IgniteExceptionMapperUtil.java:137)
> at
> org.apache.ignite.internal.lang.IgniteExceptionMapperUtil.mapToPublicException(IgniteExceptionMapperUtil.java:88)
> at
> org.apache.ignite.internal.lang.IgniteExceptionMapperUtil.lambda$convertToPublicFuture$3(IgniteExceptionMapperUtil.java:178)
> at
> java.base/java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:934)
> at
> java.base/java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:911)
> at
> java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
> at
> java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2179)
> at
> org.apache.ignite.internal.tx.impl.TransactionInflights$ReadWriteTxContext.completeFinishInProgressFuture(TransactionInflights.java:398)
> at
> org.apache.ignite.internal.tx.impl.TransactionInflights$ReadWriteTxContext.lambda$performFinish$1(TransactionInflights.java:382)
> at
> java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
> at
> java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
> at
> java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
> at
> java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2179)
> at
> org.apache.ignite.internal.replicator.ReplicaService.lambda$sendToReplicaRaw$8(ReplicaService.java:264)
> at
> java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
> at
> java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
> at
> java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
> at
> java.base/java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:614)
> at
> java.base/java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:914)
> at
> java.base/java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:482)
> at
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
> at
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
> at java.base/java.lang.Thread.run(Thread.java:1583)
> Caused by: org.apache.ignite.internal.storage.StorageException: IGN-CMN-65535
> Error while updating WI prev link of next WI: [link=281595236276459,
> rowId=RowId [partitionId=28, uuid=0000019b-2b82-f45c-6c88-bd98aae6c4f2],
> rowIsTombstone=false, txId=019b2b82-e492-0000-ac01-4e9600000001,
> commitZoneId=20, commitPartitionId=0, tableId=21, partitionId=28]
> TraceId:8936c0c7
> at
> org.apache.ignite.internal.storage.pagememory.mv.AddWriteLinkingWiInvokeClosure.updateWiListLinks(AddWriteLinkingWiInvokeClosure.java:168)
> at
> org.apache.ignite.internal.storage.pagememory.mv.AddWriteLinkingWiInvokeClosure.insertFirstRowVersion(AddWriteLinkingWiInvokeClosure.java:75)
> at
> org.apache.ignite.internal.storage.pagememory.mv.AddWriteInvokeClosure.call(AddWriteInvokeClosure.java:88)
> at
> org.apache.ignite.internal.storage.pagememory.mv.AddWriteInvokeClosure.call(AddWriteInvokeClosure.java:44)
> at
> org.apache.ignite.internal.pagememory.tree.BplusTree$Invoke.invokeClosure(BplusTree.java:4257)
> at
> org.apache.ignite.internal.pagememory.tree.BplusTree.invokeDown(BplusTree.java:2176)
> at
> org.apache.ignite.internal.pagememory.tree.BplusTree.invokeDown(BplusTree.java:2160)
> at
> org.apache.ignite.internal.pagememory.tree.BplusTree.invokeDown(BplusTree.java:2160)
> at
> org.apache.ignite.internal.pagememory.tree.BplusTree.invokeDown(BplusTree.java:2160)
> at
> org.apache.ignite.internal.pagememory.tree.BplusTree.invoke(BplusTree.java:2082)
> at
> org.apache.ignite.internal.storage.pagememory.mv.AbstractPageMemoryMvPartitionStorage.lambda$addWrite$10(AbstractPageMemoryMvPartitionStorage.java:440)
> at
> org.apache.ignite.internal.storage.pagememory.mv.AbstractPageMemoryMvPartitionStorage.busy(AbstractPageMemoryMvPartitionStorage.java:734)
> at
> org.apache.ignite.internal.storage.pagememory.mv.AbstractPageMemoryMvPartitionStorage.addWrite(AbstractPageMemoryMvPartitionStorage.java:432)
> at
> org.apache.ignite.internal.table.distributed.raft.snapshot.SnapshotAwarePartitionDataStorage.addWrite(SnapshotAwarePartitionDataStorage.java:150)
> at
> org.apache.ignite.internal.table.distributed.StorageUpdateHandler.performAddWrite(StorageUpdateHandler.java:571)
> at
> org.apache.ignite.internal.table.distributed.StorageUpdateHandler.performAddWriteWithCleanup(StorageUpdateHandler.java:546)
> at
> org.apache.ignite.internal.table.distributed.StorageUpdateHandler.tryProcessRow(StorageUpdateHandler.java:224)
> at
> org.apache.ignite.internal.table.distributed.StorageUpdateHandler.lambda$processEntriesUntilBatchLimit$1(StorageUpdateHandler.java:303)
> at
> org.apache.ignite.internal.storage.pagememory.mv.PersistentPageMemoryMvPartitionStorage.lambda$runConsistently$1(PersistentPageMemoryMvPartitionStorage.java:202)
> at
> org.apache.ignite.internal.storage.pagememory.mv.AbstractPageMemoryMvPartitionStorage.busy(AbstractPageMemoryMvPartitionStorage.java:734)
> at
> org.apache.ignite.internal.storage.pagememory.mv.PersistentPageMemoryMvPartitionStorage.runConsistently(PersistentPageMemoryMvPartitionStorage.java:192)
> at
> org.apache.ignite.internal.table.distributed.raft.snapshot.SnapshotAwarePartitionDataStorage.runConsistently(SnapshotAwarePartitionDataStorage.java:89)
> at
> org.apache.ignite.internal.table.distributed.StorageUpdateHandler.processEntriesUntilBatchLimit(StorageUpdateHandler.java:287)
> at
> org.apache.ignite.internal.table.distributed.StorageUpdateHandler.handleUpdateAll(StorageUpdateHandler.java:262)
> at
> org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.applyUpdateAllCommand(PartitionReplicaListener.java:2426)
> at
> org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.applyUpdateAllCommand(PartitionReplicaListener.java:2496)
> at
> org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.lambda$processMultiEntryAction$90(PartitionReplicaListener.java:2054)
> at
> java.base/java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1150)
> at
> java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
> at
> java.base/java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:614)
> at
> java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:653)
> ... 4 more
> Caused by:
> org.apache.ignite.internal.pagememory.freelist.CorruptedFreeListException:
> IGN-STORAGE-2 TraceId:8936c0c7
> at
> org.apache.ignite.internal.pagememory.freelist.PagesList.corruptedFreeListException(PagesList.java:1817)
> at
> org.apache.ignite.internal.pagememory.freelist.PagesList.corruptedFreeListException(PagesList.java:1796)
> at
> org.apache.ignite.internal.pagememory.freelist.FreeListImpl.updateDataRow(FreeListImpl.java:678)
> at
> org.apache.ignite.internal.storage.pagememory.mv.AddWriteLinkingWiInvokeClosure.updateWiListLinks(AddWriteLinkingWiInvokeClosure.java:162)
> ... 34 more
> Caused by: java.lang.AssertionError
> at
> org.apache.ignite.internal.pagememory.freelist.FreeListImpl.updateDataRow(FreeListImpl.java:674)
> ... 35 more
> {noformat}
> It seems to be caused by a concurrency bug in write intent replacement.