Also, one point of early confusion for me was that "atomicity" has a slightly different definition depending on whether you're talking software or database, and I'm a "software guy". From Wikipedia:
Software = Atomicity is a guarantee of isolation from concurrent processes. Additionally, atomic operations commonly have a succeed-or-fail definition: they either successfully change the state of the system, or have no visible effect.

Database = In an atomic transaction, a series of database operations either all occur, or nothing occurs.

I believe that Cassandra is using the database definition.

will

On Fri, Jul 8, 2011 at 10:35 AM, William Oberman <ober...@civicscience.com> wrote:

> I think you need to look into Zookeeper, or another distributed coordinator,
> as you have little/no guarantees from Cassandra between 1-3 (in terms of the
> guarantees you want and need).
>
> And my terminology in my post is different than yours. My "client" == your
> "server". Specifically, I was thinking in terms of:
> user -> cassandra client code (that runs on a "server") -> cassandra server
> code (e.g. cassandra itself) that runs either on the same or different
> server
>
> On Fri, Jul 8, 2011 at 10:22 AM, Jeffrey Kesselman <jef...@gmail.com> wrote:
>
>> Not quite, it's more limited and specific....
>>
>> The order of operations is all within the Cassandra node server and looks
>> like this...
>>
>> We have one row, A. That's the only row being operated on.
>>
>> Client -> submits A'
>> Server does the following:
>> (1) Validate function reads current A
>> (2) Validate function validates A' vs. A
>> (3) If validation succeeds, allows update to A'.
>>
>> My fear/concern is that after 1 and before 3, a second update to A'' comes
>> in and changes the "current" value of A, therefore invalidating my
>> validation check, see?
>>
>> If Cassandra does not guard against this then one possible
>> solution would be to make my own key-to-mutex map in memory, lock the mutex
>> for A's key as a precursor to (1) and release it in a post-update function.
>> But I am always very nervous about inserting locking into a process that
>> wasn't designed with it already in mind...
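
For anyone skimming the thread, the key-to-mutex idea quoted above can be sketched in a few lines. This is only an illustration under loud assumptions: `GuardedStore`, `validated_update`, and `next_version_only` are hypothetical names, the "store" is a plain in-memory dict, and none of this is Cassandra's API. It also only serializes updates inside one process, so it says nothing about replicas on other nodes (which is where a coordinator such as Zookeeper would come in).

```python
import threading


class GuardedStore:
    """Toy in-memory row store with one mutex per key (hypothetical, not Cassandra)."""

    def __init__(self):
        self._rows = {}
        self._locks = {}
        # Protects the key-to-mutex map itself while locks are created.
        self._locks_guard = threading.Lock()

    def _lock_for(self, key):
        with self._locks_guard:
            lock = self._locks.get(key)
            if lock is None:
                lock = self._locks[key] = threading.Lock()
            return lock

    def validated_update(self, key, proposed, validate):
        """Run steps (1)-(3) under the key's mutex so no other update interleaves."""
        with self._lock_for(key):
            current = self._rows.get(key)        # (1) read current A
            if not validate(current, proposed):  # (2) validate A' vs. A
                return False
            self._rows[key] = proposed           # (3) allow update to A'
            return True

    def read(self, key):
        return self._rows.get(key)


def next_version_only(current, proposed):
    """Hypothetical rule: accept A' only if its version is exactly current + 1."""
    current_version = 0 if current is None else current["version"]
    return proposed["version"] == current_version + 1
```

With this in place, several writers racing to move row "A" from version 0 to version 1 would see exactly one `validated_update` call return `True`; the others fail validation because they re-read A after the winner's write.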
>>
>> On Fri, Jul 8, 2011 at 8:30 AM, William Oberman <ober...@civicscience.com> wrote:
>>
>>> Questions like this seem to come up a lot:
>>>
>>> http://stackoverflow.com/questions/6033888/cassandra-atomicity-isolation-of-column-updates-on-a-single-row-on-on-single-no
>>> http://stackoverflow.com/questions/2055037/cassandra-atomic-reads-writes-within-a-single-columnfamily
>>> http://www.mail-archive.com/user@cassandra.apache.org/msg14701.html
>>>
>>> Let's say you read state A (from one key in one CF), you change the data
>>> to A' in your client, and you write A'. Are you worried that someone else
>>> might have changed A to B during this process (making the "new" state a race
>>> between A' and B)? It doesn't sound to me like you are... It sounds to me
>>> like you're worried about a set of columns for the key being in a consistent
>>> state before, during, and after a process. And A -> A' and A -> B will each
>>> be atomic for the key (based on my understanding). But, if A' and B are
>>> changes to a different set of columns, I believe that would interleave,
>>> which itself could be "inconsistent" from your application's point of view.
>>>
>>> will
>>>
>>> On Thu, Jul 7, 2011 at 11:41 PM, Jeffrey Kesselman <jef...@gmail.com> wrote:
>>>
>>>> Really, as I lay in the bath thinking about it, I concluded what I am
>>>> looking for is a very limited form of Consistency.
>>>>
>>>> It's consistency over a single row on a single node just for the period
>>>> of update.
>>>>
>>>> On Thu, Jul 7, 2011 at 10:34 PM, Jeffrey Kesselman <jef...@gmail.com> wrote:
>>>>
>>>>> It's not really isolation, btw, because we
>>>>> aren't talking about anyone seeing an update mid-update. Rather, we
>>>>> are talking about when updates are allowed to occur.
>>>>>
>>>>> Atomicity means that all the updates happen together or they don't
>>>>> happen at all.
>>>>> Isolation means that no results of the update are visible until the
>>>>> entire update operation is complete.
>>>>>
>>>>> This really lies somewhere in the middle of the two concepts. It's
>>>>> part of the results of the combined effects of ACID.
>>>>>
>>>>> On Thu, Jul 7, 2011 at 10:27 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
>>>>>
>>>>>> Sounds to me like you're confusing atomicity with isolation.
>>>>>>
>>>>>> On Thu, Jul 7, 2011 at 2:54 PM, Jeffrey Kesselman <jef...@gmail.com> wrote:
>>>>>> > Yup, I'm even more confused. Let's talk about the model, not the
>>>>>> > implementation.
>>>>>> > AIUI updates to a row are atomic across all columns in that row at once,
>>>>>> > true?
>>>>>> > If true then the next question is, does the validation happen inside or
>>>>>> > outside of that guarantee, and is the row guaranteed not to change between
>>>>>> > validation and update?
>>>>>> > If that is *not* the case then it makes a whole class of solutions to
>>>>>> > synchronization problems fail and puts my larger project
>>>>>> > in serious question.
>>>>>> >
>>>>>> > On Thu, Jul 7, 2011 at 3:43 PM, Yang <teddyyyy...@gmail.com> wrote:
>>>>>> >>
>>>>>> >> no, the memtable is a ConcurrentSkipListMap
>>>>>> >>
>>>>>> >> insertion can happen in parallel
>>>>>> >>
>>>>>> >> On Jul 7, 2011 9:24 AM, "Jeffrey Kesselman" <jef...@gmail.com> wrote:
>>>>>> >> > This has me more confused.
>>>>>> >> >
>>>>>> >> > Does this mean that ALL rows on a given node are only updated
>>>>>> >> > sequentially,
>>>>>> >> > never in parallel?
>>>>>> >> >
>>>>>> >> > On Thu, Jul 7, 2011 at 3:21 PM, Yang <teddyyyy...@gmail.com> wrote:
>>>>>> >> >
>>>>>> >> >> just to add onto what jonathan said
>>>>>> >> >>
>>>>>> >> >> the columns are immutable.
>>>>>> >> >> if you overwrite/reconcile, a new object is
>>>>>> >> >> created and shoved into the memtable
>>>>>> >> >>
>>>>>> >> >> there is a shared lock for all writes though, which guards against an
>>>>>> >> >> exclusive lock on memtable switching/flushing
>>>>>> >> >>
>>>>>> >> >> On Jul 7, 2011 7:51 AM, "A J" <s5a...@gmail.com> wrote:
>>>>>> >> >> > Does a write lock:
>>>>>> >> >> > 1. Just the columns in question for the specific row in question?
>>>>>> >> >> > 2. The full row in question?
>>>>>> >> >> > 3. The full CF?
>>>>>> >> >> >
>>>>>> >> >> > I doubt read does any locks.
>>>>>> >> >> >
>>>>>> >> >> > Thanks.
>>>>>> >> >
>>>>>> >> > --
>>>>>> >> > It's always darkest just before you are eaten by a grue.
>>>>>>
>>>>>> --
>>>>>> Jonathan Ellis
>>>>>> Project Chair, Apache Cassandra
>>>>>> co-founder of DataStax, the source for professional Cassandra support
>>>>>> http://www.datastax.com
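
Yang's description near the bottom of the thread (columns are immutable, and an overwrite/reconcile puts a new object into the memtable) can be pictured with a small sketch. Everything here is an assumption made for illustration: `Column`, `reconcile`, and `apply_write` are hypothetical names, a plain dict stands in for the ConcurrentSkipListMap, and the tie-breaking rule (the incoming write wins on equal timestamps) is a simplification rather than Cassandra's actual behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Column:
    """An immutable column value; frozen=True forbids in-place changes."""
    name: str
    value: str
    timestamp: int


def reconcile(existing: Optional[Column], incoming: Column) -> Column:
    """Pick the winner by timestamp without mutating either column."""
    if existing is None or incoming.timestamp >= existing.timestamp:
        return incoming
    return existing


def apply_write(memtable: dict, row_key: str, incoming: Column) -> None:
    """Swap the map entry to the winning column object; old objects are left untouched."""
    slot = (row_key, incoming.name)
    memtable[slot] = reconcile(memtable.get(slot), incoming)
```

The point of the sketch is that a reader holding a reference to the old `Column` never sees it change underneath them; a write only replaces which object the map entry points to. It does not show anything about validation happening atomically with the write, which is the separate question Jeffrey is asking.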