yeah, then maybe we can make that a silent omission. less desirable, but still better than unpredictable behavior. (this is not that bad: currently you can't know whether a write result really reached a quorum, i.e. became "effective", anyway)
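to make the "silent omission" idea concrete, here is a rough sketch of the semantics I have in mind, replaying the two scenarios from my example quoted further down. this is only a toy model under my own assumptions: the class name OneLifeCounter and its methods are made up for illustration and are not Cassandra code, and it ignores timestamps, replication and tombstone purging (so no reincarnation).

```java
import java.util.OptionalLong;

// Hypothetical illustration only; none of these names exist in Cassandra.
final class OneLifeCounter {
    private long count = 0;       // sum of the deltas accepted so far
    private boolean dead = false; // true once any tombstone has been applied

    // Apply a counter add. Silently ignored once the counter is dead.
    void add(long delta) {
        if (!dead) {
            count += delta;
        }
    }

    // Apply a tombstone. The counter stays dead whatever arrives later.
    void delete() {
        dead = true;
    }

    // An empty result stands for "count = null (counter not present)".
    OptionalLong read() {
        return dead ? OptionalLong.empty() : OptionalLong.of(count);
    }

    public static void main(String[] args) {
        // Scenario 1: c(1), d(2), c(3) arrive in order.
        OneLifeCounter inOrder = new OneLifeCounter();
        inOrder.add(1);
        inOrder.delete();
        inOrder.add(1); // add on dead counter, ignored

        // Scenario 2: c(1), c(3), d(2), the delete arrives last.
        OneLifeCounter outOfOrder = new OneLifeCounter();
        outOfOrder.add(1);
        outOfOrder.add(1);
        System.out.println("scenario 2 before delete: " + outOfOrder.read());
        outOfOrder.delete();

        // Both replicas end up in the same "not present" state.
        System.out.println("scenario 1 final: " + inOrder.read());
        System.out.println("scenario 2 final: " + outOfOrder.read());
    }
}
```

the point is just that once a delete has been seen, the final read no longer depends on the order in which the adds and the delete arrived.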
regarding "we never look at SSTables", I think right now counter adds do require a read on SSTables, although asynchronously:

StorageProxy:

    private static void applyCounterMutation(final IMutation mutation,
            final Multimap<InetAddress, InetAddress> hintedEndpoints,
            final IWriteResponseHandler responseHandler,
            final String localDataCenter,
            final ConsistencyLevel consistency_level,
            boolean executeOnMutationStage) {
        ......................
        sendToHintedEndpoints(cm.makeReplicationMutation(), hintedEndpoints,
                responseHandler, localDataCenter, false, consistency_level);
        ....
    }

CounterMutation.java:

    public RowMutation makeReplicationMutation() throws IOException {
        ....
        Table table = Table.open(readCommand.table);
        Row row = readCommand.getRow(table);
        ................
    }

I think the "getRow()" line above does what the .pdf design doc in the JIRA described: when replicating to the other replicas (non-leaders), the leader replicates only the **sum** that it owns, not the individual delta that it just received. actually I don't quite understand why this approach was chosen, since it turns each write into a read followed by a write (when getReplicateOnWrite() is set), which can be slow. I'm still trying to understand that

Thanks
Yang

On Sun, May 29, 2011 at 3:45 AM, aaron morton <aa...@thelastpickle.com> wrote:

> Without commenting on the other parts of the design, this part is not
> possible "attempts to add to a dead counter throws an exception "
>
> All write operations are no look operations (write to the log, update
> memtables) we never look at the SSTables. It goes against the architecture
> of the write path to require a read from disk.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 29 May 2011, at 20:04, Yang wrote:
>
> sorry in the notation, instead of "ttl" I mean "timestamp"
>
>
> On Sun, May 29, 2011 at 12:24 AM, Yang <teddyyyy...@gmail.com> wrote:
>
>> sorry to beat on the dead horse.
>>
>> I looked at the link referred from #2103 :
>> https://issues.apache.org/jira/browse/CASSANDRA-2101
>> I agree with the reasoning in #2101 that the ultimate issue is that delete
>> and counter adds are not commutative. since by definition we can't achieve
>> predictable behavior with deletes + counter, can we redefine the behavior
>> of counter deletes, so that we can always guarantee the declared behavior?
>> --- specifically:
>>
>>
>> *we define that once a counter column is deleted, you can never add to it
>> again.* attempts to add to a dead counter throws an exception ---- all
>> future adds are just ignored. i.e. a counter column has only one life,
>> until all tombstones are purged from system, after which it is possible for
>> the counter to have a new incarnation. basically instead of solving the
>> problem raised in #2103, we declare openly that it's unsolvable (which is
>> true), and make the code reflect this fact.
>>
>>
>>
>> I think this behavior would satisfy most use cases of counters. so instead
>> of relying on the advice to developers: "do not do updates for a period
>> after deletes, otherwise it probably won't work", we enforce this into the
>> code.
>>
>>
>> the same logic can be carried over into expiring column, since they are
>> essentially automatically inserted deletes. that way #2103 could be "solved"
>>
>>
>> I'm attaching an example below, you can refer to them if needed.
>>
>> Thanks a lot
>> Yang
>>
>>
>> example:
>> for simplicity we assume there is only one column family , one column, so
>> we omit column name and cf name in our notation, assume all counterColumns
>> have a delta value of 1, we only mark their ttl now. so c(123) means a
>> counter column of ttl=123, adding a delta of 1. d(456) means a tombstone with
>> ttl=456.
>>
>> then we can have the following operations
>>
>> operation      result after operation
>> ----------------------------------------------------------------------
>> c(1)           count=1
>> d(2)           count = null ( counter not present )
>>
>> c(3)           count = null ( add on dead counter ignored)
>> ---------------------------------------------------
>>
>>
>> if the 2 adds arrive out of order , we would still guarantee eventual
>> consistency:
>>
>> operation      result after operation
>>
>> --------------------------------------------------------------------------------
>> c(1)           count=1
>> c(3)           count=2 (we have 2 adds, each with delta=1)
>> d(2)           count=null (deleted)
>> --------------------------------------------------------------
>> at the end of both scenarios, the result is guaranteed to be null;
>> note that in the second scenario, line 2 shows a snapshot where we have a
>> state with count=2, which scenario 1 never sees. this is fine, since
>> even regular columns can have this situation (just consider if the counter
>> columns were inserts/overwrites instead )
>>
>>
>>
>> On Fri, May 27, 2011 at 5:57 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
>> > No. See comments to https://issues.apache.org/jira/browse/CASSANDRA-2103
>> >
>> > On Fri, May 27, 2011 at 7:29 PM, Yang <teddyyyy...@gmail.com> wrote:
>> >> is this combination feature available , or on track ?
>> >>
>> >> thanks
>> >> Yang
>> >>
>> >
>> >
>> >
>> > --
>> > Jonathan Ellis
>> > Project Chair, Apache Cassandra
>> > co-founder of DataStax, the source for professional Cassandra support
>> > http://www.datastax.com
>> >