Oh, I just saw your first mail. "I don't see a negative number in your paste?"

(03a227f0-a5c3-11e1-0000-b7f5e49dceff, 1, -1) and (03a227f0-a5c3-11e1-0000-b7f5e49dceff, 1, 1)
(03a227f0-a5c3-11e1-0000-b7f5e49dceff, 4, -5000) and (03a227f0-a5c3-11e1-0000-b7f5e49dceff, 4, 20000)
(03a227f0-a5c3-11e1-0000-b7f5e49dceff, 19, -3) and (03a227f0-a5c3-11e1-0000-b7f5e49dceff, 19, 19)

The counts in the left-hand parentheses are negative, and we never
decrement counters.

Thanks for your explanations.

Alain

2012/9/20 Alain RODRIGUEZ <arodr...@gmail.com>

> "I think that's inconsistent with the hypothesis that unclean shutdown is
> the sole cause of these problems"
>
> I agree: we have never shut down any node or had any crash, and yet
> we have these bugs.
>
> About your side note:
>
> We know about it, but we couldn't find any other way to provide
> real-time analytics. If you know of one, we would be really glad to hear
> about it.
> We need both to serve statistics in real time and to be accurate about
> prices, and we need consistency between what is shown in our graphs and
> tables and the invoices we provide to our customers.
> What we do is try to avoid timeouts as much as possible (increasing the
> timeout and keeping the CPU load as low as possible). To keep latency
> low for the user, we first write the events to a message queue
> (Kestrel) and then process them with Storm, which writes the events
> and increments counters in Cassandra.
>
> Once again, if you have a clue about a better way of doing this, we are
> always happy to learn and to try to improve our architecture and our
> process.
>
> Alain
>
>
> 2012/9/20 Peter Schuller <peter.schul...@infidyne.com>
>
>> The significance I think is: If it is indeed the case that the higher
>> value is always *in fact* correct, I think that's inconsistent with
>> the hypothesis that unclean shutdown is the sole cause of these
>> problems - as long as the client is truly submitting non-idempotent
>> counter increments without a read-before-write.
>>
>> As a side note: If you're using these counters for stuff like
>> determining amounts of money to be paid by somebody, consider the
>> non-idempotence of counter increments. Any write that increments a
>> counter and fails, e.g. with a timeout, *MAY OR MAY NOT* have been
>> applied and cannot be safely retried. Cassandra counters are generally
>> not useful if *strict* correctness is desired, for this reason.
>>
>> --
>> / Peter Schuller (@scode, http://worldmodscode.wordpress.com)
>>
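
To make the side note about non-idempotent increments concrete, here is a minimal Python sketch. It is a toy in-memory model and not Cassandra's API: `FlakyCounterStore`, `WriteTimeout` and `increment_with_retry` are hypothetical names invented for this illustration. It assumes the failure mode described above, where the increment is applied but the acknowledgement never reaches the client, and shows how a naive retry then counts the same event twice.

```python
class WriteTimeout(Exception):
    """Raised when the client gets no acknowledgement in time."""


class FlakyCounterStore:
    """Toy in-memory stand-in for a counter store (NOT Cassandra's API).

    `lost_acks` holds the 1-based call numbers for which the increment
    IS applied but the acknowledgement is lost, so the client only
    sees a timeout.
    """

    def __init__(self, lost_acks):
        self.value = 0
        self.calls = 0
        self.lost_acks = set(lost_acks)

    def increment(self, delta):
        self.calls += 1
        self.value += delta            # the write is applied on the "server"
        if self.calls in self.lost_acks:
            raise WriteTimeout()       # but the client never learns that


def increment_with_retry(store, delta, max_attempts=3):
    """Naive retry-on-timeout: unsafe for non-idempotent increments."""
    for _ in range(max_attempts):
        try:
            store.increment(delta)
            return
        except WriteTimeout:
            continue                   # cannot tell whether it was applied
    raise WriteTimeout()


store = FlakyCounterStore(lost_acks={1})   # first ack is lost
increment_with_retry(store, 5)
print(store.value)  # 10, although the client meant to add 5 once
```

Not retrying would avoid the double count in this scenario, but would undercount whenever the timed-out write was in fact never applied; from the client's side the two cases look identical, which is why such a write "cannot be safely retried".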