Vladimir Steshin created IGNITE-19904: -----------------------------------------
Summary: Assertion in defragmentation
Key: IGNITE-19904
URL: https://issues.apache.org/jira/browse/IGNITE-19904
Project: Ignite
Issue Type: Bug
Reporter: Vladimir Steshin

Defragmentation fails with:
{code:java}
java.lang.AssertionError: Invalid state. Type is 0! pageId = 0001000d00024cbf
    at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.copyPageForCheckpoint(PageMemoryImpl.java:1359) ~[ignite-core-2.16.0-SNAPSHOT.jar:2.16.0-SNAPSHOT]
    at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.checkpointWritePage(PageMemoryImpl.java:1277) ~[ignite-core-2.16.0-SNAPSHOT.jar:2.16.0-SNAPSHOT]
    at org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointPagesWriter.writePages(CheckpointPagesWriter.java:208) ~[ignite-core-2.16.0-SNAPSHOT.jar:2.16.0-SNAPSHOT]
    at org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointPagesWriter.run(CheckpointPagesWriter.java:150) ~[ignite-core-2.16.0-SNAPSHOT.jar:2.16.0-SNAPSHOT]
{code}
It is difficult to write a test for this, and I can't reproduce the failure on my own computers. It appears intermittently on a server (4 core x 4 cpu) with 100G of test cache data and a million+ pages to checkpoint during defragmentation. It occurs more often with pageSize 1024 (which produces more pages).

Judging by my diagnostic build, I suppose that a fresh, empty page is caught in defragmentation. Here is a page dump taken with a test-extended PAGE_OVERHEAD (=64); it shows the same error, raised a bit before copyPageForCheckpoint():
{code:java}
org.apache.ignite.IgniteException: Wrong page type in checkpointWritePage1. Page: Data region = 'defragPartitionsDataRegion'. FullPageId [pageId=281878703760205, effectivePageId=403727049549, grpId=-1368047378]. PageDump = page_id: 281878703760205, rel_id: 48603, cache_id: -1368047378, pin: 0, lock: 65536, tmp_buf: 72057594037927935, test_val: 1.
data_hex: 0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000
    at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.checkpointWritePage(PageMemoryImpl.java:1240) ~[ignite-core-2.16.0-SNAPSHOT.jar:2.16.0-SNAPSHOT]
    at org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointPagesWriter.writePages(CheckpointPagesWriter.java:208) ~[ignite-core-2.16.0-SNAPSHOT.jar:2.16.0-SNAPSHOT]
    at org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointPagesWriter.run(CheckpointPagesWriter.java:150) ~[ignite-core-2.16.0-SNAPSHOT.jar:2.16.0-SNAPSHOT]
{code}
'test_val' is my diagnostic extension of the page prefix. A distinct number is written at each place where tmp_buf is assigned. The value '1' comes from PageHeader::initNew(long absPtr, long relative):
{code:java}
public static void PageHeader::initNew(long absPtr, long relative) {
    relative(absPtr, relative);

    tempBufferPointer(absPtr, PageMemoryImpl.INVALID_REL_PTR);

    GridUnsafe.putLong(absPtr, PAGE_MARKER);
    GridUnsafe.putInt(absPtr + PAGE_PIN_CNT_OFFSET, 0);

    // For diagnostic purposes.
    PageUtils.setTestTmpValue(absPtr, 1);
}

public static void PageUtils::setTestTmpValue(long absPtr, int val) {
    // Stored in the last 4 bytes of the (extended) page overhead.
    putInt(absPtr, PAGE_OVERHEAD - 4, val);
}

public static final int PageMemoryImpl#PAGE_OVERHEAD = 64;
{code}

--
This message was sent by Atlassian Jira (v8.20.10#820010)
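
For readers comparing the two page IDs above, here is a minimal sketch that splits a page ID into its parts. It assumes the usual Ignite 2.x layout (from high to low bits: 8 bits rotation, 8 bits flags, 16 bits partition ID, 32 bits page index). The class PageIdDecoder and its methods are hypothetical helpers written for this note, not part of Ignite.

```java
public final class PageIdDecoder {
    // Assumed layout, high to low bits: 8 rotation | 8 flags | 16 partition | 32 page index.

    /** Lowest 32 bits: index of the page inside the partition. */
    static int pageIndex(long pageId) {
        return (int) pageId;
    }

    /** Next 16 bits: partition ID. */
    static int partId(long pageId) {
        return (int) ((pageId >>> 32) & 0xFFFF);
    }

    /** Next 8 bits: page flags (data, index, ...). */
    static int flag(long pageId) {
        return (int) ((pageId >>> 48) & 0xFF);
    }

    /** Partition ID plus page index only: the low 48 bits. */
    static long effectivePageId(long pageId) {
        return pageId & 0x0000FFFFFFFFFFFFL;
    }

    public static void main(String[] args) {
        // The two page IDs quoted in the issue: one in hex, one in decimal.
        long[] ids = {0x0001000d00024cbfL, 281878703760205L};

        for (long id : ids) {
            System.out.printf(
                "pageId=%016x flag=%d partId=%d pageIdx=%d effectivePageId=%d%n",
                id, flag(id), partId(id), pageIndex(id), effectivePageId(id));
        }
    }
}
```

Under this assumed layout, the second ID decodes to an effective page ID of 403727049549, which matches the effectivePageId printed in the FullPageId of the dump. Separately, the dumped tmp_buf value 72057594037927935 equals 0x00FFFFFFFFFFFFFF, which would be consistent with initNew() having written INVALID_REL_PTR, provided that constant has this value in the build in question.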