[ 
https://issues.apache.org/jira/browse/CASSANDRA-20490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17952035#comment-17952035
 ] 

Stefan Miklosovic commented on CASSANDRA-20490:
-----------------------------------------------

A bunch of failures, but none of them look related to this PR ... it is what it is.

[CASSANDRA-20490|https://github.com/instaclustr/cassandra/tree/CASSANDRA-20490]
{noformat}
java17_pre-commit_tests                         
  ✓ j17_build                                       10m 21s
  ✓ j17_cqlsh_dtests_py311                           7m 37s
  ✓ j17_cqlsh_dtests_py311_vnode                     7m 44s
  ✓ j17_cqlsh_dtests_py38                             7m 8s
  ✓ j17_cqlsh_dtests_py38_vnode                      7m 27s
  ✓ j17_cqlshlib_cython_tests                       12m 50s
  ✓ j17_cqlshlib_tests                               9m 20s
  ✓ j17_dtests_latest                               43m 44s
  ✓ j17_dtests_vnode                                43m 36s
  ✓ j17_jvm_dtests_latest_vnode_repeat           1h 15m 25s
  ✓ j17_jvm_dtests_repeat                        1h 15m 28s
  ✓ j17_unit_tests_repeat                            7m 29s
  ✓ j17_utests_latest_repeat                         8m 26s
  ✓ j17_utests_oa_repeat                             7m 34s
  ✕ j17_dtests                                       51m 5s
      refresh_test.TestRefresh test_refresh_deadlock_startup
  ✕ j17_jvm_dtests                                  30m 18s
      org.apache.cassandra.fuzz.topology.HarryOnAccordTopologyMixupTest test
      org.apache.cassandra.fuzz.sai.MultiNodeSAITest indexOnlySaiTest TIMEOUTED
  ✕ j17_jvm_dtests_latest_vnode                     30m 58s
      junit.framework.TestSuite org.apache.cassandra.distributed.test.accord.MigrationFromAccordReadRaceTest
      junit.framework.TestSuite org.apache.cassandra.distributed.test.accord.MigrationFromAccordReadRaceTest
      junit.framework.TestSuite org.apache.cassandra.distributed.test.accord.MigrationFromAccordWriteRaceTest
      junit.framework.TestSuite org.apache.cassandra.distributed.test.accord.MigrationFromAccordWriteRaceTest
      junit.framework.TestSuite org.apache.cassandra.distributed.test.accord.MigrationToAccordWriteRaceTest
      junit.framework.TestSuite org.apache.cassandra.distributed.test.accord.MigrationToAccordWriteRaceTest
      junit.framework.TestSuite org.apache.cassandra.distributed.test.accord.AccordMigrationTest
      junit.framework.TestSuite org.apache.cassandra.distributed.test.accord.AccordMigrationTest
  ✕ j17_unit_tests                                  17m 49s
      org.apache.cassandra.io.sstable.SSTableReaderTest testSpannedIndexPositions TIMEOUTED
  ✕ j17_utests_latest                               18m 20s
      org.apache.cassandra.db.lifecycle.LogTransactionTest testStatsTSMismatchDuringStart
      org.apache.cassandra.db.lifecycle.LogTransactionTest testWrongTimestampInTxnFile
      org.apache.cassandra.db.lifecycle.LogTransactionTest testStatsTSMismatchDuringList
  ✕ j17_utests_oa                                   17m 54s
      org.apache.cassandra.io.sstable.SSTableReaderTest testSpannedIndexPositions TIMEOUTED
java11_pre-commit_tests                         
  ✓ j11_build                                       10m 24s
  ✓ j11_cqlsh_dtests_py311                            9m 1s
  ✓ j11_cqlsh_dtests_py311_vnode                     8m 38s
  ✓ j11_cqlsh_dtests_py38                             7m 3s
  ✓ j11_cqlsh_dtests_py38_vnode                      8m 15s
  ✓ j11_cqlshlib_cython_tests                        9m 50s
  ✓ j11_cqlshlib_tests                              11m 35s
  ✓ j11_dtests_latest                               45m 51s
  ✓ j11_dtests_vnode                                42m 30s
  ✓ j11_jvm_dtests_latest_vnode_repeat           1h 15m 47s
  ✓ j11_jvm_dtests_repeat                        1h 13m 41s
  ✓ j11_unit_tests_repeat                            8m 42s
  ✓ j11_utests_latest_repeat                         8m 26s
  ✓ j11_utests_oa_repeat                             7m 44s
  ✓ j11_utests_system_keyspace_directory_repeat      7m 56s
  ✓ j17_cqlsh_dtests_py311                           6m 50s
  ✓ j17_cqlsh_dtests_py311_vnode                     7m 32s
  ✓ j17_cqlsh_dtests_py38                            6m 55s
  ✓ j17_cqlsh_dtests_py38_vnode                      7m 38s
  ✓ j17_cqlshlib_cython_tests                        9m 25s
  ✓ j17_cqlshlib_tests                               8m 42s
  ✓ j17_dtests_latest                               42m 30s
  ✓ j17_dtests_vnode                                45m 39s
  ✓ j17_jvm_dtests_latest_vnode_repeat           1h 13m 36s
  ✓ j17_jvm_dtests_repeat                        1h 13m 16s
  ✓ j17_unit_tests_repeat                             8m 4s
  ✓ j17_utests_latest_repeat                         8m 15s
  ✓ j17_utests_oa_repeat                            11m 45s
    j11_dtests                                      50m 51s
  ✕ j11_jvm_dtests                                  31m 59s
      org.apache.cassandra.fuzz.sai.MultiNodeSAITest indexOnlySaiTest TIMEOUTED
  ✕ j11_jvm_dtests_latest_vnode                     32m 27s
      junit.framework.TestSuite org.apache.cassandra.distributed.test.accord.MigrationFromAccordReadRaceTest
      junit.framework.TestSuite org.apache.cassandra.distributed.test.accord.MigrationFromAccordReadRaceTest
      junit.framework.TestSuite org.apache.cassandra.distributed.test.accord.MigrationFromAccordWriteRaceTest
      junit.framework.TestSuite org.apache.cassandra.distributed.test.accord.MigrationFromAccordWriteRaceTest
      junit.framework.TestSuite org.apache.cassandra.distributed.test.accord.MigrationToAccordWriteRaceTest
      junit.framework.TestSuite org.apache.cassandra.distributed.test.accord.MigrationToAccordWriteRaceTest
      junit.framework.TestSuite org.apache.cassandra.distributed.test.accord.AccordMigrationTest
      junit.framework.TestSuite org.apache.cassandra.distributed.test.accord.AccordMigrationTest
      org.apache.cassandra.distributed.test.cql3.MixedReadsAccordInteropMultiNodeTokenConflictTest test
  ✕ j11_simulator_dtests                            20m 16s
      org.apache.cassandra.simulator.test.AccordHarrySimulationTest test
      org.apache.cassandra.simulator.test.HarrySimulatorTest test
  ✕ j11_unit_tests                                   20m 8s
      org.apache.cassandra.io.sstable.SSTableReaderTest testSpannedIndexPositions TIMEOUTED
      org.apache.cassandra.transport.AuthMessageSizeLimitTest sendTooBigAuthMultiFrameMessage
  ✕ j11_utests_latest                               19m 26s
      org.apache.cassandra.db.lifecycle.LogTransactionTest testStatsTSMismatchDuringStart
      org.apache.cassandra.db.lifecycle.LogTransactionTest testWrongTimestampInTxnFile
      org.apache.cassandra.db.lifecycle.LogTransactionTest testStatsTSMismatchDuringList
  ✕ j11_utests_oa                                   19m 11s
      org.apache.cassandra.io.sstable.SSTableReaderTest testSpannedIndexPositions TIMEOUTED
  ✕ j11_utests_system_keyspace_directory             21m 2s
      org.apache.cassandra.io.sstable.SSTableReaderTest testSpannedIndexPositions TIMEOUTED
  ✕ j17_dtests                                      38m 16s
      refresh_test.TestRefresh test_refresh_deadlock_startup
  ✕ j17_jvm_dtests                                  30m 35s
      org.apache.cassandra.fuzz.sai.MultiNodeSAITest indexOnlySaiTest TIMEOUTED
  ✕ j17_jvm_dtests_latest_vnode                      30m 7s
      junit.framework.TestSuite org.apache.cassandra.distributed.test.accord.MigrationFromAccordReadRaceTest
      junit.framework.TestSuite org.apache.cassandra.distributed.test.accord.MigrationFromAccordReadRaceTest
      junit.framework.TestSuite org.apache.cassandra.distributed.test.accord.MigrationFromAccordWriteRaceTest
      junit.framework.TestSuite org.apache.cassandra.distributed.test.accord.MigrationFromAccordWriteRaceTest
      junit.framework.TestSuite org.apache.cassandra.distributed.test.accord.MigrationToAccordWriteRaceTest
      junit.framework.TestSuite org.apache.cassandra.distributed.test.accord.MigrationToAccordWriteRaceTest
      junit.framework.TestSuite org.apache.cassandra.distributed.test.accord.AccordMigrationTest
      junit.framework.TestSuite org.apache.cassandra.distributed.test.accord.AccordMigrationTest
      org.apache.cassandra.fuzz.sai.MultiNodeSAITest indexOnlySaiTest TIMEOUTED
  ✕ j17_unit_tests                                   18m 5s
      org.apache.cassandra.net.ConnectionTest testTimeout
      org.apache.cassandra.io.sstable.SSTableReaderTest testSpannedIndexPositions TIMEOUTED
  ✕ j17_utests_latest                               18m 10s
      org.apache.cassandra.db.lifecycle.LogTransactionTest testStatsTSMismatchDuringStart
      org.apache.cassandra.db.lifecycle.LogTransactionTest testWrongTimestampInTxnFile
      org.apache.cassandra.db.lifecycle.LogTransactionTest testStatsTSMismatchDuringList
  ✕ j17_utests_oa                                    19m 2s
      org.apache.cassandra.io.sstable.SSTableReaderTest testSpannedIndexPositions TIMEOUTED
{noformat}
[java17_pre-commit_tests|https://app.circleci.com/pipelines/github/instaclustr/cassandra/5805/workflows/036728ed-e5b0-4e89-8cb0-c9e98d57ec4d]
[java11_pre-commit_tests|https://app.circleci.com/pipelines/github/instaclustr/cassandra/5805/workflows/d6df2016-76b9-489f-b445-ff736fd159c1]

> Encountered "duplicate hardlink error" when repairing
> -----------------------------------------------------
>
>                 Key: CASSANDRA-20490
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20490
>             Project: Apache Cassandra
>          Issue Type: Bug
>          Components: Consistency/Repair
>            Reporter: Stefan Miklosovic
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>             Fix For: 5.x
>
>          Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> A user reported:
> Hi all,
> we experience the following issue when executing full sequential repairs in 
> Cassandra 4.0.10.
> ERROR [RepairSnapshotExecutor:1] 2023-11-07 13:22:50,267 CassandraDaemon.java:581 - Exception in thread Thread[RepairSnapshotExecutor:1,5,main]
> java.lang.RuntimeException: Tried to create duplicate hard link to /opt/ddb/data/pool/data1/test_keyspace/test1-c4b33340f0a211edb0cb2fb04a4be304/snapshots/bec3dba0-7d70-11ee-99d3-7bda513c2b90/nb-1-big-Filter.db
> at org.apache.cassandra.io.util.FileUtils.createHardLink(FileUtils.java:185)
> at org.apache.cassandra.io.sstable.format.SSTableReader.createLinks(SSTableReader.java:1624)
> at org.apache.cassandra.io.sstable.format.SSTableReader.createLinks(SSTableReader.java:1606)
> at org.apache.cassandra.db.ColumnFamilyStore.snapshotWithoutFlush(ColumnFamilyStore.java:1852)
> at org.apache.cassandra.db.ColumnFamilyStore.snapshot(ColumnFamilyStore.java:2031)
> at org.apache.cassandra.db.ColumnFamilyStore.snapshot(ColumnFamilyStore.java:2017)
> at org.apache.cassandra.db.repair.CassandraTableRepairManager.lambda$snapshot$0(CassandraTableRepairManager.java:74)
> at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
> at java.util.concurrent.FutureTask.run(Unknown Source)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
> at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> at java.lang.Thread.run(Unknown Source)
> ERROR [AntiEntropyStage:1] 2023-11-07 13:22:50,267 CassandraDaemon.java:581 - Exception in thread Thread[AntiEntropyStage:1,5,main]
> java.lang.RuntimeException: java.lang.RuntimeException: Unable to take a snapshot bec3dba0-7d70-11ee-99d3-7bda513c2b90 on test_keyspace/test1
> This behavior is reproduced consistently, when the following are true:
> * It is a normal sequential repair (--full and --sequential),
> * It is not a global repair, meaning at least one datacenter is defined 
> (--in-dc or --in-local-dc),
> * The repair affects more than two Cassandra nodes.
> For more than two Cassandra nodes the parent repair session consists of 
> multiple separate repair sessions towards different target endpoints. Full 
> sequential repairs require that all participants flush and snapshot the data 
> before starting the repair. Unfortunately, there is a collision between the 
> separate repair sessions. The first one creates the ephemeral snapshot 
> successfully and the second one that tries to create a snapshot (create hard 
> link) in the same node fails with the above error.
> This issue is not seen in global repairs, where datacenters and hosts are not 
> defined, because in that case there is an explicit check if a snapshot 
> already exists before proceeding.
> I found a few issues in Jira about duplicate hard links, but all of them are 
> from older versions and seem irrelevant to this one. Could you please help 
> with this issue?
> Thank you,
> Panagiotis
> https://lists.apache.org/thread/kwz89po5gkx68bhof7l7o0yykz48bnbw
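
For illustration only, the collision described in the report above can be reproduced outside Cassandra with plain java.nio. This is a minimal sketch, not the patch from this ticket: the class name SnapshotLinkSketch and both method names are hypothetical, and the existence check only mirrors the "check if a snapshot already exists" guard that the report attributes to global repairs. The first method always tries to create the link, as two colliding repair sessions would; the second skips the link when it is already present.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch; none of these names come from the Cassandra code base.
final class SnapshotLinkSketch {

    // Unguarded variant: a second call for the same link fails,
    // which is the shape of the error in the report.
    public static void createHardLink(Path link, Path existing) {
        try {
            Files.createLink(link, existing);
        } catch (FileAlreadyExistsException e) {
            throw new RuntimeException("Tried to create duplicate hard link to " + link, e);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Guarded variant: returns true if the link was created,
    // false if it was already there and nothing was done.
    public static boolean createHardLinkIfAbsent(Path link, Path existing) {
        if (Files.exists(link)) {
            return false;
        }
        createHardLink(link, existing);
        return true;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("snapshot-sketch");
        Path sstable = Files.createFile(dir.resolve("nb-1-big-Filter.db"));
        Path snapshotDir = Files.createDirectories(dir.resolve("snapshots"));
        Path link = snapshotDir.resolve("nb-1-big-Filter.db");

        // First "session" creates the link, second one is skipped by the guard.
        System.out.println("first session linked:  " + createHardLinkIfAbsent(link, sstable));
        System.out.println("second session linked: " + createHardLinkIfAbsent(link, sstable));

        // Without the guard, the second attempt throws.
        try {
            createHardLink(link, sstable);
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Note that the check followed by the link is not atomic, so two threads could still race between the two steps; a real fix would need to coordinate the sessions or tolerate the already-existing link, and this sketch makes no claim about which approach the PR takes.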


