[jira] [Commented] (CASSANDRA-18125) Improve memtable allocator accounting when updating AtomicBTreePartition

Jon Meredith (Jira) Thu, 02 Mar 2023 13:30:08 -0800


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-18125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17695902#comment-17695902
 ]


Jon Meredith commented on CASSANDRA-18125:
------------------------------------------

Analysis of the failures: nothing looks related to the changes. I'm a little 
suspicious of MemtableSizeTest.testSize[skiplist_sharded], but it is already 
flaky.

Jenkins failures

+*4.0*+

Test Result (6 failures / +6)
 - 
dtest-novnode.repair_tests.incremental_repair_test.TestIncRepair.test_multiple_full_repairs_lcs
900s timeout
 - 
org.apache.cassandra.distributed.upgrade.DropCompactStorageTest.testDropCompactStorage
org.apache.cassandra.exceptions.UnavailableException: Cannot achieve 
consistency level ALL
CASSANDRA-17674 logged against it, but seems like different failure
passed locally, seems unrelated.
 - org.apache.cassandra.streaming.LongStreamingTest.terminated successfully
junit timeout - perhaps too long a streaming test
passes locally - 2mins
 - org.apache.cassandra.utils.LongBloomFilterTest.testConstrained
junit timeout
passes locally - 1m23sec
 - 
org.apache.cassandra.repair.RepairJobTest.testNoTreesRetainedAfterDifference-cdc
j11 module system error - looks like it is trying to measure something inside a 
JFR class.
Unable to make field private final jdk.management.jfr.StreamManager 
jdk.management.jfr.FlightRecorderMXBeanImpl.streamHandler accessible: module 
jdk.management.jfr does not "opens jdk.management.jfr" to unnamed module 
@28975c28
 - 
org.apache.cassandra.db.partition.PartitionImplementationTest.testRowsWithStatic
junit timeout
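
The j11 module system error noted above for RepairJobTest is the kind of failure that reflection hits when it reaches into a package that a JDK module does not open to the caller. As a rough sketch only (the class `ModuleAccessCheck`, its `Sample` helper and the `canReflect` method are hypothetical names for illustration, not Cassandra or test-harness code), the check below probes whether a private field can be made accessible. It uses `Field.trySetAccessible()`, which returns false where `setAccessible(true)` would throw. The JFR class and field names are taken from the error message; whether they exist, and whether they are reflectable, depends on the JDK in use.

```java
import java.lang.reflect.Field;

public class ModuleAccessCheck {
    // Hypothetical sample class. It lives in the same (unnamed) module as the
    // caller, so its private fields can always be made accessible from here.
    static class Sample {
        private final int secret = 42;
    }

    // Returns true if the named declared field exists and can be made
    // accessible; returns false instead of throwing when it cannot.
    static boolean canReflect(Class<?> owner, String fieldName) {
        try {
            Field f = owner.getDeclaredField(fieldName);
            // Java 9+: non-throwing counterpart of setAccessible(true).
            return f.trySetAccessible();
        } catch (NoSuchFieldException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("Sample.secret reflectable: "
                + canReflect(Sample.class, "secret"));
        try {
            // Class and field names copied from the error message above.
            Class<?> jfr = Class.forName(
                    "jdk.management.jfr.FlightRecorderMXBeanImpl",
                    false, ClassLoader.getSystemClassLoader());
            System.out.println("JFR streamHandler reflectable: "
                    + canReflect(jfr, "streamHandler"));
        } catch (ClassNotFoundException e) {
            System.out.println("JFR management classes not present in this runtime");
        }
    }
}
```

A deep-size measuring tool that walks object graphs by reflection could use a probe like this to skip fields it is not allowed to read rather than fail the whole test.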

+*4.1*+

Test Result (3 failures / -3)
 - 
dtest-novnode.transient_replication_test.TestTransientReplicationRepairStreamEntireSSTable.test_transient_incremental_repair
missing Incoming stream entireSSTable=...from log
 - 
org.apache.cassandra.tools.TopPartitionsTest.testServiceTopPartitionsSingleTable
https://issues.apache.org/jira/browse/CASSANDRA-17798
junit.framework.AssertionFailedError: If this failed you probably have to raise 
the beginLocalSampling duration expected:<1> but was:<0>
at 
org.apache.cassandra.tools.TopPartitionsTest.testServiceTopPartitionsSingleTable(TopPartitionsTest.java:83)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

+*trunk*+

unable to clean tmp on the arm build

CircleCI failures

+*4.0*+

j8 upgrade - org.apache.cassandra.distributed.upgrade.DropCompactStorageTest - 
test shutdown timeout

+*4.1*+

j8 org.apache.cassandra.index.sasi.SASICQLTest - 
testPagingWithClustering-system_keyspace_directory - passes locally, no JIRA, 
nothing in Butler

j11
org.apache.cassandra.cql3.MemtableSizeTest.testSize[skiplist_sharded] - flaky 
test with built-in retry. Locally it fails intermittently on j11 due to Java 
module system exports when trying to measure deep sizes of objects inside the 
java.desktop module - perhaps related to the IDEA Java agent or something like that.

+*trunk*+

j8 
org.apache.cassandra.db.compaction.CompactionStrategyManagerBoundaryReloadTest
 - https://issues.apache.org/jira/browse/CASSANDRA-18144 - review in progress

 - one shard did not run upgrade tests 
[https://app.circleci.com/pipelines/github/jonmeredith/cassandra/746/workflows/5e928d77-0717-4ae0-ad1d-7883871c7f8e/jobs/5122]

j11 
org.apache.cassandra.db.compaction.CompactionStrategyManagerBoundaryReloadTest 
also failed
same issue as above.

> Improve memtable allocator accounting when updating AtomicBTreePartition
> ------------------------------------------------------------------------
>
>                 Key: CASSANDRA-18125
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18125
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/Memtable
>            Reporter: Nicolas Henneaux
>            Assignee: Benedict Elliott Smith
>            Priority: Normal
>             Fix For: 4.0.x, 4.1.x, 4.x
>
>
> On two nodes of the 5-node cluster I'm running, I got the following 
> exception. It occurred at a 3,5 minute interval.
> {code}
> MemtableReclaimMemory:2625 org.apache.cassandra.service.CassandraDaemon 
> uncaughtException - Exception in thread 
> Thread[MemtableReclaimMemory:2625,5,main]java.lang.AssertionError: null
>       at 
> org.apache.cassandra.utils.memory.MemtablePool$SubPool.released(MemtablePool.java:193)
>       at 
> org.apache.cassandra.utils.memory.MemtableAllocator$SubAllocator.releaseAll(MemtableAllocator.java:151)
>       at 
> org.apache.cassandra.utils.memory.MemtableAllocator$SubAllocator.setDiscarded(MemtableAllocator.java:142)
>       at 
> org.apache.cassandra.utils.memory.MemtableAllocator.setDiscarded(MemtableAllocator.java:93)
>       at 
> org.apache.cassandra.utils.memory.SlabAllocator.setDiscarded(SlabAllocator.java:120)
>       at org.apache.cassandra.db.Memtable.setDiscarded(Memtable.java:201)
>       at 
> org.apache.cassandra.db.ColumnFamilyStore$Flush$1.runMayThrow(ColumnFamilyStore.java:1216)
>       at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>       at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>       at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>       at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>       at java.base/java.lang.Thread.run(Thread.java:829)
> {code} 
> {code}
> $ nodetool info
> ID                     : 
> Gossip active          : true
> Native Transport active: true
> Load                   : 204.67 GiB
> Generation No          : 1670343179
> Uptime (seconds)       : 1110514
> Heap Memory (MB)       : 7218.07 / 24576.00
> Off Heap Memory (MB)   : 784.06
> Data Center            : par
> Rack                   : e1
> Exceptions             : 1
> Key Cache              : entries 802712, size 100 MiB, capacity 100 MiB, 
> 774541004 hits, 914207516 requests, 0.847 recent hit rate, 14400 save period 
> in seconds
> Row Cache              : entries 0, size 0 bytes, capacity 0 bytes, 0 hits, 0 
> requests, NaN recent hit rate, 0 save period in seconds
> Counter Cache          : entries 0, size 0 bytes, capacity 50 MiB, 0 hits, 0 
> requests, NaN recent hit rate, 7200 save period in seconds
> Percent Repaired       : 2.3272298419424144E-5%
> Token                  : (invoke with -T/--tokens to see all 8 tokens)
> $ java -version
> openjdk version "11.0.16" 2022-07-19 LTS
> OpenJDK Runtime Environment (Red_Hat-11.0.16.0.8-1.el7_9) (build 
> 11.0.16+8-LTS)
> OpenJDK 64-Bit Server VM (Red_Hat-11.0.16.0.8-1.el7_9) (build 11.0.16+8-LTS, 
> mixed mode, sharing)
> $ nodetool version
> ReleaseVersion: 4.0.6
> {code}
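
The stack trace above ends in an assertion inside MemtablePool$SubPool.released, reached from releaseAll when the memtable is discarded. As an illustration only, and not a diagnosis taken from this ticket, the sketch below shows one way allocator accounting can drift so that a final release sees a negative amount: an early release hands back more bytes than were recorded as owned. All names here (`PoolAccountingSketch`, `Pool`, `Allocator`) are hypothetical and much simpler than the Cassandra classes; the check throws an exception rather than using `assert` so that it fires without `-ea`.

```java
import java.util.concurrent.atomic.AtomicLong;

public class PoolAccountingSketch {
    // Hypothetical shared pool that tracks bytes currently handed out.
    static final class Pool {
        private final AtomicLong allocated = new AtomicLong();

        void allocated(long size) {
            allocated.addAndGet(size);
        }

        void released(long size) {
            // Stand-in for the failing assertion: a release must not be negative.
            if (size < 0)
                throw new IllegalStateException("released negative size: " + size);
            allocated.addAndGet(-size);
        }

        long used() {
            return allocated.get();
        }
    }

    // Hypothetical per-memtable allocator that tracks the bytes it owns.
    static final class Allocator {
        private final Pool pool;
        private final AtomicLong owns = new AtomicLong();

        Allocator(Pool pool) {
            this.pool = pool;
        }

        void allocate(long size) {
            pool.allocated(size);
            owns.addAndGet(size);
        }

        // Early release, e.g. when an update replaces existing data.
        void release(long size) {
            owns.addAndGet(-size);
            pool.released(size);
        }

        // Called on discard: hands everything still owned back to the pool.
        void releaseAll() {
            pool.released(owns.getAndSet(0));
        }
    }

    public static void main(String[] args) {
        Pool pool = new Pool();
        Allocator ok = new Allocator(pool);
        ok.allocate(100);
        ok.release(40);
        ok.releaseAll();
        System.out.println("pool used after balanced accounting: " + pool.used());

        Allocator drifted = new Allocator(pool);
        drifted.allocate(100);
        drifted.release(150); // over-release: owned bytes go negative
        try {
            drifted.releaseAll();
        } catch (IllegalStateException e) {
            System.out.println("discard failed: " + e.getMessage());
        }
    }
}
```

In this toy model the error only surfaces at discard time, long after the mismatched update, which is why such accounting bugs tend to show up on a flush or reclaim thread rather than at the write that caused them.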



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
