There, I was referring to making operations across multiple (logical) tables atomic and isolated, as opposed to splitting static and non-static data at flush (which is not particularly tricky).
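To make the flush split concrete, here is a toy sketch in Python rather than the real Java code. The cell model (a dict keyed by partition key, clustering value and column, with a clustering value of None marking a static cell) and the name `split_on_flush` are made up for illustration and are not Cassandra APIs. It only shows the partitioning step and says nothing about the atomicity question.

```python
# Toy model, not Cassandra code: a "memtable" is a dict keyed by
# (partition_key, clustering, column). A clustering value of None marks
# a static cell. All names here are hypothetical.

def split_on_flush(memtable):
    """Return (static_rows, regular_rows), each sorted as a flushed file would be."""
    static_rows, regular_rows = [], []
    for (pk, clustering, column), value in memtable.items():
        if clustering is None:
            # Static cell: goes to the "static" output, keyed by partition only.
            static_rows.append((pk, column, value))
        else:
            # Regular cell: goes to the "non-static" output with its clustering.
            regular_rows.append((pk, clustering, column, value))
    return sorted(static_rows), sorted(regular_rows)


memtable = {
    ("listing-1", None, "current_state"): "price=100",
    ("listing-1", 2, "delta"): "price:-5",
    ("listing-1", 1, "delta"): "price:+10",
    ("listing-2", None, "current_state"): "price=250",
}
static_rows, regular_rows = split_on_flush(memtable)
```

Each output could then be compacted on its own schedule, which is the part that matters for the compaction discussion quoted below.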

On Fri, May 1, 2015 at 5:03 PM, graham sanderson <gra...@vast.com> wrote:

> Naively (I may be missing something) it seems much easier to flush a
> single memtable to more than one sstable on disk (static and non-static)
> and then allow for separate compaction of those.
>
> > On May 1, 2015, at 9:06 AM, Benedict Elliott Smith
> > <belliottsm...@datastax.com> wrote:
> >
> > It also doesn't solve the atomicity problem, which is its own challenge.
> > We would probably need to merge the memtables for the entire
> > keyspace/node, and split them out into their own sstables on flush. Or
> > introduce mutual exclusion at the partition key level for the node.
> >
> > On Fri, May 1, 2015 at 3:01 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
> >
> >> I'm down for adding JOIN support within a partition, eventually. I can
> >> see a lot of stuff I'd rather prioritize higher in the short term though.
> >>
> >> On Fri, May 1, 2015 at 8:44 AM, Jonathan Haddad <j...@jonhaddad.com> wrote:
> >>
> >>> I think what Benedict has described feels very much like a very
> >>> specialized version of the following:
> >>>
> >>> 1. Updates to different tables in a batch become atomic if the node is
> >>>    a replica for the partition
> >>> 2. Supporting inner joins if the partition key is the same in both
> >>>    tables.
> >>>
> >>> I'd rather see join support personally :)
> >>>
> >>> Jon
> >>>
> >>> On Fri, May 1, 2015 at 6:38 AM graham sanderson <gra...@vast.com> wrote:
> >>>
> >>>> I 100% agree with Benedict, but just to be clear about my use case:
> >>>>
> >>>> 1) We have state of, let's say, real estate listings
> >>>> 2) We get field-level deltas for them
> >>>> 3) Previously we would store the base state and all the deltas in a
> >>>>    partition and roll them up from the beginning of time (this was a
> >>>>    prototype and silly since there was no expiration strategy)
> >>>> 4) Preferred plan is to keep current state in a static map (i.e. one
> >>>>    delta field only updates one cell) - we are MVCC but in the common
> >>>>    case the latest version will be what we want
> >>>> 5) However we require history, so we’d use the partition to keep TTL
> >>>>    deltas going backwards from the now state - this seems like a
> >>>>    common pattern people would want. Note also that sometimes we might
> >>>>    need to apply reverse deltas if C* is ahead of our SOLR indexes
> >>>>
> >>>> The static columns and the regular columns ARE completely different in
> >>>> behavior/lifecycle, so I’d definitely vote for them being treated as
> >>>> such.
> >>>>
> >>>>> On May 1, 2015, at 7:27 AM, Benedict Elliott Smith
> >>>>> <belliottsm...@datastax.com> wrote:
> >>>>>
> >>>>>> How would it be different from creating an actual real extra table
> >>>>>> instead?
> >>>>>
> >>>>>> There's nothing that warrants making the codebase more complex to
> >>>>>> accomplish something it already does.
> >>>>>
> >>>>> As far as I was aware, the only point of static columns was to
> >>>>> support the thrift ability to mutate and read them in the same
> >>>>> expression, with atomicity and isolation. As to whether or not it is
> >>>>> more complex, I'm not at all convinced that it would be. We have had
> >>>>> a lot of unexpected special casing added to ensure they behave
> >>>>> correctly (e.g. paging is broken), and have complicated the
> >>>>> comparison/slice logic to accommodate them, so that it is harder to
> >>>>> reason about (and to optimise). They also have very different
> >>>>> compaction characteristics, so the complexity on the user is
> >>>>> increased without their necessarily realising it. All told, it
> >>>>> introduces a lot more subtlety of behaviour than there would be with
> >>>>> a separate set of sstables, or perhaps a separate file attached to
> >>>>> each sstable.
> >>>>>
> >>>>> Of course, we've already implemented it as a specialisation of the
> >>>>> slice/comparator, I think because it seemed like the least frictional
> >>>>> path to do so, but that doesn't mean it is the least complex. It does
> >>>>> mean it's the least work (assuming we're now on top of the bugs),
> >>>>> which is its own virtue.
> >>>>>
> >>>>> There are some advantages to having them managed separately, and
> >>>>> advantages to having them combined. Combined, for small partitions,
> >>>>> they can be read in the same seek. However for large partitions this
> >>>>> is no longer true, and we may behave much worse by polluting the page
> >>>>> cache with lots of unwanted data that is adjacent to the static
> >>>>> columns. If they were managed separately, the page cache would be
> >>>>> populated mostly with other static columns, which may be more likely
> >>>>> of use. We could quite easily have a "static column" cache, also, and
> >>>>> completely avoid merging them. Or at least we could easily read them
> >>>>> with collectTimeOrderedData instead of collectAllData semantics.
> >>>>>
> >>>>> All told, it certainly isn't a terrible idea, and shouldn't be
> >>>>> dismissed so readily. Personally I think in the long run whether or
> >>>>> not we manage static columns together with non-static columns is
> >>>>> dependent on if we intend to add tiered "static" columns (i.e., if
> >>>>> each level of clustering component can have columns associated with
> >>>>> it). If we do, we should definitely keep it all inline. If not, it
> >>>>> probably permits a lot better behaviour to separate them, since it's
> >>>>> easier to reason about and improve their distinct characteristics.
> >>>>>
> >>>>> On Fri, May 1, 2015 at 1:24 AM, graham sanderson <gra...@vast.com>
> >>>>> wrote:
> >>>>>
> >>>>>> Well you lose the atomicity and isolation, but in this case that is
> >>>>>> probably fine.
> >>>>>>
> >>>>>> That said, in every interaction I’ve had with static columns, they
> >>>>>> seem to be an odd duck (e.g. adding or complicating range slices),
> >>>>>> perhaps worthy of their own code path and sstables. Just food for
> >>>>>> thought.
> >>>>>>
> >>>>>>> On Apr 30, 2015, at 7:13 PM, Jonathan Haddad <j...@jonhaddad.com>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>> If you want it in a separate sstable, just use a separate table.
> >>>>>>> There's nothing that warrants making the codebase more complex to
> >>>>>>> accomplish something it already does.
> >>>>>>>
> >>>>>>> On Thu, Apr 30, 2015 at 5:07 PM graham sanderson <gra...@vast.com>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> Anyone here have an opinion; how realistic would it be to have a
> >>>>>>>> separate memtable/sstable for static columns?
> >>>>>>>>
> >>>>>>>> Begin forwarded message:
> >>>>>>>>
> >>>>>>>> *From: *Jonathan Haddad <j...@jonhaddad.com>
> >>>>>>>> *Subject: **Re: DateTieredCompactionStrategy and static columns*
> >>>>>>>> *Date: *April 30, 2015 at 3:55:46 PM CDT
> >>>>>>>> *To: *u...@cassandra.apache.org
> >>>>>>>> *Reply-To: *u...@cassandra.apache.org
> >>>>>>>>
> >>>>>>>> I suspect this will kill the benefit of DTCS, but haven't tested
> >>>>>>>> it to be 100% sure here.
> >>>>>>>>
> >>>>>>>> The benefit of DTCS is that sstables are selected for compaction
> >>>>>>>> based on the age of the data, not their size. When you mix TTL'ed
> >>>>>>>> data and non TTL'ed data, you end up screwing with the "drop the
> >>>>>>>> entire SSTable" optimization. I don't believe this is any
> >>>>>>>> different just because you're mixing in static columns. What I
> >>>>>>>> think will happen is you'll end up with an sstable that's almost
> >>>>>>>> entirely TTL'ed with a few static columns that will never get
> >>>>>>>> compacted or dropped. Pretty much the worst scenario I can think
> >>>>>>>> of.
> >>>>>>>>
> >>>>>>>> On Thu, Apr 30, 2015 at 11:21 AM graham sanderson
> >>>>>>>> <gra...@vast.com> wrote:
> >>>>>>>>
> >>>>>>>>> I have a potential use case I haven’t had a chance to prototype
> >>>>>>>>> yet, which would normally be a good candidate for DTCS (i.e. data
> >>>>>>>>> delivered in order and a fixed TTL), however with every write
> >>>>>>>>> we’d also be updating some static cells (namely a few key/values
> >>>>>>>>> in a static map<text, text> CQL column). There could also be
> >>>>>>>>> explicit deletes of keys in the static map, though that’s not
> >>>>>>>>> 100% necessary.
> >>>>>>>>>
> >>>>>>>>> Since those columns don’t have TTL, without reading through the
> >>>>>>>>> code and/or trying it, I have no idea what effect this has on
> >>>>>>>>> DTCS (perhaps it needs to use separate sstables for static
> >>>>>>>>> columns). Has anyone tried this? If not I eventually will and
> >>>>>>>>> will report back.
> >>
> >> --
> >> Jonathan Ellis
> >> Project Chair, Apache Cassandra
> >> co-founder, http://www.datastax.com
> >> @spyced
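
For anyone skimming the thread, the DTCS concern from the original user-list message (an sstable that is almost entirely TTL'ed but holds a few static cells never gets dropped) can be illustrated with a toy model. This is only a sketch under simplifying assumptions: an sstable is modelled as a list of (write_time, ttl) pairs, `fully_expired` is a made-up name, and the real check in Cassandra involves more than this (gc grace and overlap with other sstables are ignored here).

```python
# Toy model, not Cassandra code: each cell is (write_time, ttl_seconds),
# with ttl None meaning the cell never expires. Names are hypothetical.

def fully_expired(sstable, now):
    """True only if every cell has a TTL and that TTL has already elapsed."""
    return all(
        ttl is not None and write_time + ttl <= now
        for write_time, ttl in sstable
    )


deltas_only = [(0, 100), (10, 100), (20, 100)]   # TTL'ed delta cells
with_static = deltas_only + [(20, None)]          # plus one static cell, no TTL

can_drop_deltas = fully_expired(deltas_only, now=200)
can_drop_mixed = fully_expired(with_static, now=200)
```

In this model the delta-only file becomes droppable once its last TTL passes, while a single non-expiring static cell keeps the mixed file alive indefinitely, which is the motivation for keeping static data in separate sstables.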