I really like these conversations, so feel free to continue this one or create a new one. Thanks to everyone participating :)

On Sun, 16 Feb 2020 at 14:04, Reid Pinchback <rpinchb...@tripadvisor.com> wrote:

> No actually in this case I didn’t really have an opinion because C* is an
> architecturally different beast than an RDBMS. That’s kinda what ticked
> the curiosity when you made the suggestion about co-locating commit and
> data. It raises an interesting question for me. As for the 10 seconds
> delay, I’m used to looking at graphite, so bad is relative. 😉
>
> The question that pops to mind is this. If a commit log isn’t really an
> important recovery mechanism… should one even be part of C* at all? It’s
> a lot of code complexity and I/O volume and O/S tuning complexity to worry
> about having good I/O resiliency and performance with both commit and data
> volumes.
>
> If the proper way to deal with all data volume problems in C* would be to
> burn the node (or at least, its state) and rebuild via the state of its
> neighbours, then repairs (whether administratively triggered, or as a
> side-effect of ongoing operations) should always catch up with any
> mutations anyways so long as the data is appropriately replicated. The
> benefit of having a commit log would seem limited to data which isn’t
> replicated.
>
> However, I shouldn’t derail Sergio’s thread. It just was something that
> caught my interest and got me mulling, but it’s a tangent.
>
> *From: *Erick Ramirez <erick.rami...@datastax.com>
> *Reply-To: *"user@cassandra.apache.org" <user@cassandra.apache.org>
> *Date: *Friday, February 14, 2020 at 9:04 PM
> *To: *"user@cassandra.apache.org" <user@cassandra.apache.org>
> *Subject: *Re: AWS I3.XLARGE retiring instances advices
>
> Erick, a question purely as a point of curiosity. The entire model of a
> commit log, historically (speaking in RDBMS terms), depended on a notion of
> stable store. The idea being that if your data volume lost recent writes,
> the failure mode there would be independent of writes to the volume holding
> the commit log, so that replay of the commit log could generally be
> depended on to recover the missing data. I’d be curious what the C* expert
> viewpoint on that would be, with the commit log and data on the same volume.
>
> Those are fair points so thanks for bringing them up. I'll comment from a
> personal viewpoint and others can provide their opinions/feedback. 👍
>
> If you think about it, you've lost the data volume -- not just the recent
> writes. Replaying the mutations in the commit log is probably insignificant
> compared to having to recover the data through various ways (re-bootstrap,
> refresh from off-volume/off-server snapshots, etc). The data and
> redo/archive logs being on the same volume (in my opinion) is more relevant
> in RDBMS since they're mostly deployed on SANs compared to the
> nothing-shared architecture of C*. I know that's debatable and others will
> have their own view. :)
>
> How about you, Reid? Do you have concerns about both data and commitlog
> being on the same disk? And slightly off-topic but by extension, do you
> also have concerns about the default commitlog fsync() being 10 seconds?
> Cheers!