[ https://issues.apache.org/jira/browse/IGNITE-25418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Denis Chudov updated IGNITE-25418:
----------------------------------
    Description: 

While investigating "Send with retry timeout" exceptions occurring under load from the data streamer, I logged the duration of this call in PartitionListener#handleUpdateAllCommand:
{code:java}
storageUpdateHandler.handleUpdateAll(
        txId,
        cmd.rowsToUpdate(),
        cmd.commitPartitionId().asReplicationGroupId(),
        !cmd.full(),
        () -> storage.lastApplied(commandIndex, commandTerm),
        cmd.full() ? safeTimestamp : null,
        indexIdsAtRwTxBeginTs(catalogService, txId, storage.tableId())
);
{code}
At times it took several hundred milliseconds:
{code:java}
[2025-05-19T11:33:22,849][INFO ][%istod_sdt_3345%JRaft-FSMCaller-Disruptor_stripe_3-0][PartitionListener] qqq groupId=19_part_9, storageWriteMcs=341377, hash=-1115824889
{code}
In this line, the number is in microseconds.
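The measurement was presumably along these lines (a rough sketch: the wrapper around handleUpdateAll and the log message mirror the snippet and log line above, while LOG, groupId and rowsHash are illustrative placeholders, not necessarily the names used in the actual instrumentation):
{code:java}
// Sketch of the timing instrumentation around the storage write.
long startNanos = System.nanoTime();

storageUpdateHandler.handleUpdateAll(
        txId,
        cmd.rowsToUpdate(),
        cmd.commitPartitionId().asReplicationGroupId(),
        !cmd.full(),
        () -> storage.lastApplied(commandIndex, commandTerm),
        cmd.full() ? safeTimestamp : null,
        indexIdsAtRwTxBeginTs(catalogService, txId, storage.tableId())
);

// Elapsed time in microseconds, i.e. the storageWriteMcs value in the log line above.
long storageWriteMcs = (System.nanoTime() - startNanos) / 1_000;

// groupId and rowsHash stand in for whatever identifiers the actual logging used.
LOG.info("qqq groupId=" + groupId + ", storageWriteMcs=" + storageWriteMcs + ", hash=" + rowsHash);
{code}
With that in place, the 341377 in the quoted line corresponds to roughly 341 ms spent in a single handleUpdateAll call for one batch.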
Also, I see that the delta files' compaction time is constantly growing (compare the two pairs of lines below: the page count is roughly the same, yet the compaction duration grew from 2610 ms to 10213 ms, and the checkpoint fsync time from 2200 ms to 6027 ms):
{code:java}
2025-03-26 14:46:23:999 +0000 [INFO][%node1%checkpoint-thread][Checkpointer] Checkpoint finished [checkpointId=67f35966-b0fd-4377-8e5b-7a3af7d73aa2, pages=140807, pagesWriteTime=1132ms, fsyncTime=2200ms, replicatorLogSyncTime=126ms, totalTime=3591ms, avgWriteSpeed=613MB/s]
2025-03-26 14:46:26:609 +0000 [INFO][%node1%compaction-thread][Compactor] Compaction round finished [compactionId=8fc11431-783f-4111-8e5b-017870110cf5, pages=140807, duration=2610ms, avgWriteSpeed=843MB/s]
2025-03-26 15:11:02:160 +0000 [INFO][%node1%checkpoint-thread][Checkpointer] Checkpoint finished [checkpointId=2b4d735f-1b09-471a-a47e-3d5c5c8c7f5d, pages=165589, pagesWriteTime=1476ms, fsyncTime=6027ms, replicatorLogSyncTime=230ms, totalTime=7798ms, avgWriteSpeed=332MB/s]
2025-03-26 15:11:07:276 +0000 [INFO][%node1%compaction-thread][Compactor] Compaction round finished [compactionId=6984cff7-e1e1-4ec7-ba5a-5e5d7037898a, pages=170116, duration=10213ms, avgWriteSpeed=260MB/s]
{code}
I have also seen plenty of lines like this:
{code:java}
[2025-05-19T11:32:41,268][INFO ][checkpoint-runner-io0][CheckpointPagesWriter] Checkpoint pages were not written yet due to unsuccessful page write lock acquisition and will be retried [pageCount=1]
{code}
*Definition of done*

The reason for the long write time and for the compaction time degradation should be investigated and fixed, if applicable.

*Notes*

The reproducer is attached; put it into the integrationTests of the ignite-runner module.
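As a rough illustration only (not the attached ItStreamTonsOfData.java itself), the kind of load such a reproducer generates could look like the sketch below. It assumes the public Ignite 3 client data streamer API (streamData, DataStreamerItem, DataStreamerOptions), whose exact signatures may differ between versions; the table name, schema, address, and row count are made up, and the actual attachment is authoritative:
{code:java}
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.SubmissionPublisher;

import org.apache.ignite.client.IgniteClient;
import org.apache.ignite.table.DataStreamerItem;
import org.apache.ignite.table.DataStreamerOptions;
import org.apache.ignite.table.RecordView;
import org.apache.ignite.table.Tuple;

public class StreamTonsOfDataSketch {
    public static void main(String[] args) {
        // Assumes a running node and an existing table TEST (ID INT PRIMARY KEY, VAL VARCHAR).
        try (IgniteClient client = IgniteClient.builder().addresses("127.0.0.1:10800").build()) {
            RecordView<Tuple> view = client.tables().table("TEST").recordView();

            // 1000 rows per batch, matching the batch size in the issue title.
            DataStreamerOptions options = DataStreamerOptions.builder().pageSize(1000).build();

            try (SubmissionPublisher<DataStreamerItem<Tuple>> publisher = new SubmissionPublisher<>()) {
                CompletableFuture<Void> streaming = view.streamData(publisher, options);

                // Keep the storage under sustained write load.
                for (int i = 0; i < 10_000_000; i++) {
                    publisher.submit(DataStreamerItem.of(
                            Tuple.create().set("ID", i).set("VAL", "value-" + i)));
                }

                publisher.close();
                streaming.join();
            }
        }
    }
}
{code}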
> Long write of 1000-rows batches into storage
> --------------------------------------------
>
>                 Key: IGNITE-25418
>                 URL: https://issues.apache.org/jira/browse/IGNITE-25418
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Denis Chudov
>            Priority: Major
>              Labels: ignite-3
>         Attachments: ItStreamTonsOfData.java
>


--
This message was sent by Atlassian Jira
(v8.20.10#820010)