[ https://issues.apache.org/jira/browse/CASSANDRA-19703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17932752#comment-17932752 ]
Brad Schoening edited comment on CASSANDRA-19703 at 3/5/25 9:26 PM:
--------------------------------------------------------------------

Just a suggestion: it would be helpful to update the documentation on prepared statements to describe the eviction policy used by Caffeine.

[https://cassandra.apache.org/doc/stable/cassandra/cql/cql_singlefile.html#preparedStatement]

We've seen a slew of problems with mis-prepared statements, for example Spring Data for Cassandra [#1213|https://github.com/spring-projects/spring-data-cassandra/issues/1213], and it has been unclear when the bad statements would expire after the code was changed.

> Newly inserted prepared statements got evicted too early from cache that leads to race condition
> ------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-19703
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19703
>             Project: Apache Cassandra
>          Issue Type: Bug
>            Reporter: Yuqi Yan
>            Assignee: Cameron Zemek
>            Priority: Normal
>             Fix For: 4.1.x
>
>         Attachments: ci_summary.html
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> We're upgrading from Cassandra 4.0 to Cassandra 4.1.3, and the system.prepared_statements table started growing to GB size after the upgrade.
> This slows down node startup significantly when it's doing preloadPreparedStatements.
> I can't share the exact log, but it's a race condition like this:
> # [Thread 1] Receives a prepared request for S1. Attempts to get S1 in cache
> # [Thread 1] Cache miss, put this S1 into cache
> # [Thread 1] Attempts to write S1 into local table
> # [Thread 2] Receives a prepared request for S2. Attempts to get S2 in cache
> # [Thread 2] Cache miss, put this S2 into cache
> # [Thread 2] Cache is full, evicting S1 from cache
> # [Thread 2] Attempts to delete S1 from local table
> # [Thread 2] Tombstone inserted for S1, delete finished
> # [Thread 1] Record inserted for S1, write finished
> Thread 2 inserted a tombstone for S1 before Thread 1 was able to insert the record in the table. Hence the data will not be removed, because the later insert has a newer write time than the tombstone.
> Whether this happens depends on how the cache decides which entry to evict next when it is full. We noticed that in 4.1.3 Caffeine was upgraded to 2.9.2 (CASSANDRA-15153).
>
> I did some research into the Caffeine commits. It seems this commit caused the entry to be evicted too early: "Eagerly evict an entry if it too large to fit in the cache" (Feb 2021), available after 2.9.0:
> [https://github.com/ben-manes/caffeine/commit/464bc1914368c47a0203517fda2151fbedaf568b]
> It was later fixed in: "Improve eviction when overflow or the weight is oversized" (Aug 2022), available after 3.1.2:
> [https://github.com/ben-manes/caffeine/commit/25b7d17b1a246a63e4991d4902a2ecf24e86d234]
> {quote}Previously an attempt to centralize evictions into one code path led to a suboptimal approach ([{{464bc19}}|https://github.com/ben-manes/caffeine/commit/464bc1914368c47a0203517fda2151fbedaf568b]). This tried to move those entries into the LRU position for early eviction, but was confusing and could too aggressively evict something that is desirable to keep.
> {quote}
>
> I upgraded Caffeine to 3.1.8 (same as 5.0 trunk) and this issue is gone. But I think this version is not compatible with Java 8.
> I'm not 100% sure whether this is the root cause or what the correct fix is. I would appreciate it if anyone could have a look, thanks.
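
For readers following the race described in the issue, here is a minimal Python sketch that replays the listed steps against a toy last-write-wins table. It is not Cassandra or Caffeine code: `Cell`, `LwwTable`, and `replay_race` are hypothetical names, the timestamps are made up, the oldest-first eviction is only a stand-in for whatever the real cache chooses, and timestamp ties are not modeled.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Cell:
    """One stored write; value=None stands for a tombstone (hypothetical model)."""
    timestamp: int
    value: Optional[str]


class LwwTable:
    """Toy last-write-wins table: the write with the newer timestamp is kept."""

    def __init__(self) -> None:
        self._cells: Dict[str, Cell] = {}

    def apply(self, key: str, cell: Cell) -> None:
        current = self._cells.get(key)
        # Keep the incoming write only if it is strictly newer (ties not modeled).
        if current is None or cell.timestamp > current.timestamp:
            self._cells[key] = cell

    def live_keys(self) -> List[str]:
        # Keys whose surviving write is a real value, not a tombstone.
        return sorted(k for k, c in self._cells.items() if c.value is not None)


def replay_race() -> Tuple[List[str], List[str]]:
    capacity = 1                      # assumed cache size, small enough to force eviction
    cache: Dict[str, str] = {}
    table = LwwTable()

    # [Thread 1] cache miss for S1, put it into the cache; the table write is still pending.
    cache["S1"] = "statement-1"

    # [Thread 2] cache miss for S2, put it into the cache.
    cache["S2"] = "statement-2"

    # [Thread 2] cache is full: evict S1 (oldest-first here, as a stand-in policy).
    if len(cache) > capacity:
        evicted = next(iter(cache))
        del cache[evicted]
        # [Thread 2] the delete reaches the table first, with the older timestamp.
        table.apply(evicted, Cell(timestamp=1, value=None))

    # [Thread 1] the delayed insert for S1 lands last, with a newer timestamp.
    table.apply("S1", Cell(timestamp=2, value="statement-1"))

    return sorted(cache), table.live_keys()


if __name__ == "__main__":
    cache_keys, live_rows = replay_race()
    print("in cache:", cache_keys)
    print("live rows in table:", live_rows)
```

In this model S1 ends up live in the table while no longer present in the cache, so no later eviction will delete it. That matches the growth of system.prepared_statements reported in the issue, under the assumptions stated above.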