Re: [DISCUSS] CEP-45: Mutation Tracking

2025-01-28 Thread Blake Eggleston
Hi dev@, Looks like it's been about 10 days since the last message here. Are there any other comments before I put it up for a vote? Thanks, Blake

Re: [DISCUSS] CEP-45: Mutation Tracking

2025-01-18 Thread Blake Eggleston
That's an interesting idea. Basically allow for a window of uncertainty between the memtable and log and merge mutations within that window directly into the response. It sounds like something that could work. I'll have to think about how not embedding id info into the storage layer might inter

Re: [DISCUSS] CEP-45: Mutation Tracking

2025-01-18 Thread Benedict
That’s great to hear, I had thought the goal for embedding this information in sstables was that the log could be truncated. If not, is the below snippet the main motivation? For the nodes returning data _and_ mutation ids, the data and mutation ids need to describe each other exactly. If the data r

Re: [DISCUSS] CEP-45: Mutation Tracking

2025-01-18 Thread Blake Eggleston
No, mutations are kept intact. If a node is missing a multi-table mutation, it will receive the entire mutation on reconciliation. Regarding HLCs, I vaguely remember hearing about a paxos outage maybe 9-10 years ago that was related to a leap hour or leap second or something causing clocks to n

Re: [DISCUSS] CEP-45: Mutation Tracking

2025-01-18 Thread Benedict
Does this approach potentially fail to guarantee multi table atomicity? If we’re reconciling mutation ids separately per table, an atomic batch write might get reconciled for one table but not another? I know that atomic batch updates on a single partition key to multiple tables is an important pro

Re: [DISCUSS] CEP-45: Mutation Tracking

2025-01-17 Thread Blake Eggleston
Hi Jon, thanks for the excellent questions, answers below > Write Path - for recovery, how does a node safely recover the highest hybrid > logical clock it has issued? Checking the last entry in the > addressable log is insufficient unless we ensure every individual update is > durable, rather t

Re: [DISCUSS] CEP-45: Mutation Tracking

2025-01-17 Thread Jon Meredith
I had another read through for the CEP and had some follow up questions/thoughts. Write Path - for recovery, how does a node safely recover the highest hybrid logical clock it has issued? Checking the last entry in the addressable log is insufficient unless we ensure every individual update is dur

Re: [DISCUSS] CEP-45: Mutation Tracking

2025-01-16 Thread Blake Eggleston
I’m not sure Josh. Jon brought up paging and the documentation around it because our docs say we provide mutation level atomicity, but we also provide drivers that page transparently. So from the user’s perspective, a single “query” breaks this guarantee unpredictably. Occasional exceptions with

Re: [DISCUSS] CEP-45: Mutation Tracking

2025-01-16 Thread Blake Eggleston
Thanks Jake! Honestly I’m not familiar with the Aurora paper, but will check it out. The CEP doesn’t prefer to reconcile on read, but read reconciliation is a requirement so that’s outlined there. There is also a background process that continuously reconciles writes between peers and maintain

Re: [DISCUSS] CEP-45: Mutation Tracking

2025-01-16 Thread Josh McKenzie
> The other issue is that there isn’t a time bound on the paging payload, so if > the application is taking long enough between pages that the log has been > truncated, we’d have to throw an exception. My hot-take is that this relationship between how long you're taking to page, how much data yo

Re: [DISCUSS] CEP-45: Mutation Tracking

2025-01-16 Thread Jake Luciani
This is very cool! I have done a POC that was similar but more akin to the Aurora paper, whereby the commitlog would repair itself from peers proactively using the seekable commitlog. Can you explain the reason you prefer to reconcile on read? Having a consistent commitlog would solve so many

Re: [DISCUSS] CEP-45: Mutation Tracking

2025-01-16 Thread Blake Eggleston
I’ve been thinking about the paging atomicity issue. I think it could be fixed with mutation tracking and without having to support full on MVCC. When we reach a page boundary, we can send the highest mutation id we’ve seen for the partition we reached the paging boundary on. When we request ano

Re: [DISCUSS] CEP-45: Mutation Tracking

2025-01-09 Thread Blake Eggleston
So the ids themselves are in the memtable and are accessible as soon as they’re written, and need to be for the read path to work. We’re not able to reconcile the ids until we can guarantee that they won’t be merged with unreconciled data, that’s why they’re flushed before reconciliation. > On

Re: [DISCUSS] CEP-45: Mutation Tracking

2025-01-09 Thread Josh McKenzie
> We also can't remove mutation ids until they've been reconciled, so in the > simplest implementation, we'd need to flush a memtable before reconciling, > and there would never be a situation where you have purgeable mutation ids in > the memtable. Got it. So effectively that data would be unre

Re: [DISCUSS] CEP-45: Mutation Tracking

2025-01-09 Thread Blake Eggleston
Hi Josh, You can think of reconciliation as analogous to incremental repair. Like incremental repair, you can't mix reconciled/unreconciled data without causing problems. We also can't remove mutation ids until they've been reconciled, so in the simplest implementation, we'd need to flush a memt

Re: [DISCUSS] CEP-45: Mutation Tracking

2025-01-09 Thread Blake Eggleston
Hi Guo and Chris >> Does it support changing the table to support mutation tracking through >> ALTER TABLE if it does not support mutation tracking before? Yes, migration for existing keyspaces/tables will be supported. >> Do you think that keyspace_inherit (or other keywords that clearly expl

Re: [DISCUSS] CEP-45: Mutation Tracking

2025-01-09 Thread Josh McKenzie
Question re: Log Truncation (emphasis mine): > When the cluster is operating normally, logs entries can be discarded once > they are older than the last reconciliation time of their

Re: [DISCUSS] CEP-45: Mutation Tracking

2025-01-09 Thread Chris Lohfink
Is this something we can disable? I can see scenarios where this would be strictly and severely worse than existing scenarios where we don't need repairs, i.e. short time window data, millions of writes a second that get thrown out after a few hours. If that data is small partitions we are nearly dou

Re: [DISCUSS] CEP-45: Mutation Tracking

2025-01-08 Thread guo Maxwell
After a brief read, I have 2 questions. If I ask something inappropriate, please feel free to correct me: 1. Does it support changing the table to support mutation tracking through ALTER TABLE if it does not support mutation tracking before? 2. > Available options for tables

Re: [DISCUSS] CEP-45: Mutation Tracking

2025-01-08 Thread J. D. Jordan
Your pagination case is not a violation of any guarantees Cassandra makes. It has never made guarantees across multiple queries. Trying to have MVCC/consistent data across multiple queries is a very different issue/problem from this CEP. If you want to have a discussion about MVCC I suggest creatin

Re: [DISCUSS] CEP-45: Mutation Tracking

2025-01-08 Thread Jon Haddad
JD, the fact that pagination is implemented as multiple queries is a design choice. A user performs a query with fetch size 1 or 100 and they will get different behavior. I'm not asking for anyone to implement MVCC. I'm asking for the docs around this to be correct. We should not use the term g

Re: [DISCUSS] CEP-45: Mutation Tracking

2025-01-08 Thread Dmitry Konstantinov
>> 2) What is a granularity of storing mutation ids in memtable, is it per cell? It would be per-partition I suppose we have a kind of trade-off here: granularity of such metadata vs probability of read repair in some cases.. An example: if there is a big enough partition (like a time slot) to whi

Re: [DISCUSS] CEP-45: Mutation Tracking

2025-01-08 Thread Jon Haddad
> It's true that we can't offer multi-page write atomicity without some sort of MVCC. There are a lot of common query patterns that don't involve paging though, so it's not like the benefit of fixing write atomicity would only apply to a small subset of carefully crafted queries or something. Sure

Re: [DISCUSS] CEP-45: Mutation Tracking

2025-01-08 Thread Blake Eggleston
Thanks Dmitry and Jon, answers below > 1) Is a single separate commit log expected to be created for all tables with > the new replication type? The plan is to still have a single commit log, but only index mutations with a mutation id. > 2) What is a granularity of storing mutation ids in m

Re: [DISCUSS] CEP-45: Mutation Tracking

2025-01-08 Thread Jon Haddad
Very cool! I'll need to spend some time reading this over. One thing I did notice is this: > Cassandra promises partition level write atomicity. This means that, although writes are eventually consistent, a given write will either be visible or not visible. You're not supposed to see a partially

Re: [DISCUSS] CEP-45: Mutation Tracking

2025-01-08 Thread Dmitry Konstantinov
Hello Blake, thank you a lot for sharing the CEP, it looks really promising and should address many of the current pain points! I have a few questions to clarify: 1) Is a single separate commit log expected to be created for all tables with the new replication type? 2) What is a granularity of stor

[DISCUSS] CEP-45: Mutation Tracking

2025-01-08 Thread Blake Eggleston
Hello dev@, We'd like to propose CEP-45: Mutation Tracking for adoption by the community. CEP-45 proposes adding a replication mechanism to track and reconcile individual mutations, as well as processes to actively reconcile missing mutations. For keyspaces with mutation tracking enabled, the