Re: New Chain for : Does Cassandra use vector clocks

2011-02-24 Thread Dave Revell
> Time stamps are not used for conflict resolution - unless it is part of the application logic!!! This is false. In fact, the main reason Cassandra keeps timestamps is to do conflict resolution. If there is a conflict between two replicas, when doing a read or a repair, then the highest timestamp
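
A minimal sketch of timestamp-based reconciliation. The snippet above is cut off, so this assumes the rule is that the version with the highest timestamp wins. `Column` and `reconcile` are hypothetical names for illustration, not Cassandra's API, and tie-breaking between equal timestamps is not modelled.

```python
from typing import Iterable, NamedTuple


class Column(NamedTuple):
    """One replica's version of a column (hypothetical type)."""
    value: str
    timestamp: int  # client-supplied write timestamp


def reconcile(versions: Iterable[Column]) -> Column:
    """Return the version carrying the highest timestamp.

    Assumption: equal timestamps are not handled specially here;
    max() simply keeps the first of the tied versions.
    """
    return max(versions, key=lambda c: c.timestamp)


replica_a = Column(value="old", timestamp=100)
replica_b = Column(value="new", timestamp=200)
winner = reconcile([replica_a, replica_b])
print(winner.value)  # new
```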

Re: How does Cassandra handle failure during synchronous writes

2011-02-23 Thread Dave Revell
Ritesh, You have seen the problem. Clients may read the newly written value even though the client performing the write saw it as a failure. When the client reads, it will use the correct number of replicas for the chosen CL, then return the newest value seen at any replica. This "newest value" co

Re: How does Cassandra handle failure during synchronous writes

2011-02-22 Thread Dave Revell
Ritesh, There is no commit protocol. Writes may be persisted on some replicas even though the quorum fails. Here's a sequence of events that shows the "problem:" 1. Some replica R fails, but recently, so its failure has not yet been detected 2. A client writes with consistency > 1 3. The write go
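
The situation described above, a write that is reported as failed but still persisted on some replica, can be sketched with a toy model. Everything here (`Replica`, `write_at`, `read_newest`, the two-replica layout) is a hypothetical illustration under simplifying assumptions, not Cassandra code: an unreachable replica is modelled as a write that returns no acknowledgement, and nothing is rolled back.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Toy replica: stores key -> (timestamp, value)."""
    name: str
    up: bool = True
    store: dict = field(default_factory=dict)

    def write(self, key, value, timestamp):
        if not self.up:
            return False  # no acknowledgement from a failed replica
        self.store[key] = (timestamp, value)
        return True


def write_at(replicas, required_acks, key, value, timestamp):
    """Send the write to every replica; succeed only with enough acks.

    There is no commit protocol: replicas that accepted the write keep it
    even when the overall write is reported as failed.
    """
    acks = sum(1 for r in replicas if r.write(key, value, timestamp))
    return acks >= required_acks


def read_newest(replicas, key):
    """Return the value with the highest timestamp among live replicas."""
    seen = [r.store[key] for r in replicas if r.up and key in r.store]
    return max(seen, key=lambda tv: tv[0])[1] if seen else None


a = Replica("A")
r = Replica("R", up=False)  # failed, but the failure is not yet detected
ok = write_at([a, r], required_acks=2, key="k", value="v1", timestamp=1)
print(ok)                         # False: the client sees a failure
print(read_newest([a, r], "k"))   # v1: a later read still sees the value
```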

Re: Patterns for writing enterprise applications on cassandra

2011-02-16 Thread Dave Revell
Re Anthony's statement: > So it can be done and frameworks like CAGES are showing a way forward. At > the heart of it, there will need to be a Two-Phase commit type protocol > coordinator that sits in front of Cassandra. Of which - one can be sure - there > will be many implementations / best prac

Re: Patterns for writing enterprise applications on cassandra

2011-02-16 Thread Dave Revell
Ritesh, There don't seem to be any common best practices to do this. I think the reason is that by adding transaction semantics on top of Cassandra you're throwing away the most important properties of Cassandra. The effects of a transaction/locking layer: - A centralized performance bottleneck t

Re: Indexes and hard disk

2011-02-12 Thread Dave Revell
Indexes have another important advantage over multiple denormalized column families. If you make the copies yourself, eventually the copies will diverge from the base "true" column family due to routine occasional failures. You'll probably want to find and fix these inconsistencies. If you're usin

Re: Can serialized objects in columns serve as ersatz superCFs?

2011-02-08 Thread Dave Revell
Yes, this works well for me. I have no SCFs but many columns contain JSON. Depending on your time/space/compatibility tradeoffs you can obviously pick your own serialization method. Best, Dave On Feb 8, 2011 4:33 AM, "buddhasystem" wrote: > > Seeing that discussion here about indexes not supporte

Re: TSocket timing out

2011-01-30 Thread Dave Revell
Under heavy load, this could be the result of the server not accept()ing fast enough, causing the number of pending connections to exceed the listen backlog size in the kernel. I believe Cassandra uses the default of 50 backlogged connections. This is one of the reasons why a persistent connectio
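
To illustrate where the listen backlog enters the picture, here is a small sketch using plain Python sockets rather than Thrift or Cassandra. The backlog of 50 is the figure the post believes Cassandra uses, taken here as an assumption. `start_echo_server`, `recv_exact`, and `send_many` are hypothetical helpers. The client opens one connection and reuses it for several requests, so it passes through the server's accept queue only once.

```python
import socket
import threading

BACKLOG = 50  # assumed value, per the post above


def start_echo_server(backlog=BACKLOG):
    """Start a one-connection echo server; return its (host, port)."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    srv.listen(backlog)         # kernel queues up to ~backlog pending connects
    addr = srv.getsockname()

    def serve():
        conn, _ = srv.accept()  # each accept() drains one queued connection
        with conn:
            while True:
                data = conn.recv(1024)
                if not data:
                    break
                conn.sendall(data)
        srv.close()

    threading.Thread(target=serve, daemon=True).start()
    return addr


def recv_exact(sock, n):
    """Read exactly n bytes from sock."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("connection closed early")
        buf += chunk
    return buf


def send_many(addr, messages):
    """Send all messages over ONE persistent connection."""
    replies = []
    with socket.create_connection(addr, timeout=5) as sock:
        for msg in messages:
            sock.sendall(msg)
            replies.append(recv_exact(sock, len(msg)))
    return replies
```

A client that instead opened a new connection per request would add one entry to the pending queue for every request, which is the pressure the post describes.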