Re: Why do Digest Queries return hash instead of timestamp?

David Boxenhorn Wed, 13 Jul 2011 08:54:47 -0700

Got it.

Thanks!


On Wed, Jul 13, 2011 at 6:05 PM, Jonathan Ellis <jbel...@gmail.com> wrote:

> (1) the hash calculation is a small amount of CPU -- MD5 is
> specifically designed to be efficient in this kind of situation
> (2) we compute one hash per query, so for multiple columns the
> advantage over timestamp-per-column gets large quickly.
>
> On Wed, Jul 13, 2011 at 7:31 AM, David Boxenhorn <da...@citypath.com>
> wrote:
> > Is that the actual reason?
> >
> > This seems like a big inefficiency to me. For those of us who don't worry
> > about this extreme edge case (that probably will NEVER happen in real life,
> > for most applications), is there a way to turn this off?
> >
> > Or am I wrong about this making the operation MUCH more expensive?
> >
> >
> > On Wed, Jul 13, 2011 at 3:20 PM, Boris Yen <yulin...@gmail.com> wrote:
> >>
> >> For a specific column, if there are two versions with the same timestamp,
> >> the value of the column is used to break the tie:
> >> if v1.value().compareTo(v2.value()) < 0, then v2 wins.
> >>
> >> On Wed, Jul 13, 2011 at 7:13 PM, David Boxenhorn <da...@citypath.com>
> >> wrote:
> >>>
> >>> How would you know which data is correct, if they both have the same
> >>> timestamp?
> >>>
> >>> On Wed, Jul 13, 2011 at 12:40 PM, Boris Yen <yulin...@gmail.com>
> >>> wrote:
> >>>>
> >>>> I can only say that the "data" does matter; that is why the developers
> >>>> use a hash instead of a timestamp. If the hash value from another node
> >>>> does not match, a read repair is performed so that the correct data can
> >>>> be returned.
> >>>>
> >>>> On Wed, Jul 13, 2011 at 5:08 PM, David Boxenhorn <da...@citypath.com>
> >>>> wrote:
> >>>>>
> >>>>> If you have two pieces of data that are different but have the same
> >>>>> timestamp, how can you resolve consistency?
> >>>>>
> >>>>> This is a pathological situation to begin with; why should you waste
> >>>>> effort to (not) solve it?
> >>>>>
> >>>>> On Wed, Jul 13, 2011 at 12:05 PM, Boris Yen <yulin...@gmail.com>
> >>>>> wrote:
> >>>>>>
> >>>>>> I guess it is because the timestamp does not guarantee data
> >>>>>> consistency, but the hash does.
> >>>>>> Boris
> >>>>>>
> >>>>>> On Wed, Jul 13, 2011 at 4:27 PM, David Boxenhorn
> >>>>>> <da...@citypath.com> wrote:
> >>>>>>>
> >>>>>>> I just saw this
> >>>>>>>
> >>>>>>> http://wiki.apache.org/cassandra/DigestQueries
> >>>>>>>
> >>>>>>> and I was wondering why it returns a hash of the data. Wouldn't it be
> >>>>>>> better and easier to return the timestamp? You don't really care what
> >>>>>>> the data is, you only care whether it is more or less recent than
> >>>>>>> another piece of data.
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>
> >
> >
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>
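
The two ideas in this thread (one digest per query instead of one timestamp
per column, and breaking timestamp ties on the column value) can be sketched
roughly as below. This is an illustration in Python, not Cassandra's actual
code: Column, row_digest and reconcile are hypothetical names, and the choice
of which fields go into the hash is an assumption. Only the tie-break rule
(the greater value wins when timestamps are equal) and the use of MD5 come
from the messages above.

```python
import hashlib
from typing import Iterable, NamedTuple


class Column(NamedTuple):
    """Hypothetical stand-in for a stored column (not Cassandra's class)."""
    name: bytes
    value: bytes
    timestamp: int


def reconcile(v1: Column, v2: Column) -> Column:
    """Pick the winning version of one column.

    The higher timestamp wins. On a tie the values are compared and the
    greater one wins, mirroring the rule quoted in the thread:
    if v1.value().compareTo(v2.value()) < 0, then v2 wins.
    """
    if v1.timestamp != v2.timestamp:
        return v1 if v1.timestamp > v2.timestamp else v2
    return v2 if v1.value < v2.value else v1


def row_digest(columns: Iterable[Column]) -> bytes:
    """Compute a single MD5 digest over every column in a query result.

    Assumption: name, value and timestamp are all hashed, each field
    length-prefixed so that adjacent fields cannot run together.
    """
    md5 = hashlib.md5()
    for col in sorted(columns, key=lambda c: c.name):
        for field in (col.name, col.value):
            md5.update(len(field).to_bytes(4, "big"))
            md5.update(field)
        md5.update(col.timestamp.to_bytes(8, "big", signed=True))
    return md5.digest()


# Two replicas that disagree on one value but carry identical timestamps.
replica_a = [Column(b"city", b"Paris", 100), Column(b"zip", b"75001", 100)]
replica_b = [Column(b"city", b"Lyon", 100), Column(b"zip", b"75001", 100)]

# Comparing timestamps alone would not notice the difference...
timestamps_a = [c.timestamp for c in replica_a]
timestamps_b = [c.timestamp for c in replica_b]

# ...but the digests differ, which is what would trigger a read repair.
needs_repair = row_digest(replica_a) != row_digest(replica_b)

# During repair, the tie on "city" is broken by comparing the values.
winner = reconcile(replica_a[0], replica_b[0])
```

Here the digest is 16 bytes no matter how many columns the query touches,
whereas a timestamp-based reply would need one timestamp per column, which is
the size advantage described in point (2) above.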
