Hi,

My unit tests started failing once I upgraded from a single-node Cassandra cluster to a full "N"-node cluster (I'm starting with 4). I had a few bugs, mostly from forgetting to read/write at QUORUM in places where I needed stronger consistency guarantees. But I kept getting random, intermittent failures (the worst kind). After some painful debugging I'm 99% sure I see why, but I don't know what to do about it.

The basic flaw in my understanding of Cassandra seems to boil down to this: I thought system mutations of keyspaces/column families were applied at a stronger consistency level than ONE, but that appears not to be true. Is there any way for me to update a cluster's schema at something more like QUORUM?

The basic idea is that in my unit test's setup() I clone my real keyspace as keyspace_UUID (with exactly the same CFs) to get a fresh space to play in. In a single-node environment there are no issues. But in a cluster, it seems to take a while for the system_add_keyspace call to propagate. No worries, I thought: I just modified my setup() to call describe_keyspace(keyspace_UUID) in a while loop until the cluster is ready. My random failures dropped considerably, but every once in a while I still saw the same kind of failure.

Then I found out that schema updates seem to propagate on a per-node basis. At least, that's what I have to assume: I'm using phpcassa, which uses a connection pool, and my logging shows that setup() succeeds because one connection in the pool sees the new keyspace, but when my tests run I grab a connection from the pool whose node is missing it!

Do I have a solution other than changing my setup() yet again to loop over all Cassandra servers doing a describe_keyspace()?

-- 
Will Oberman
Civic Science, Inc.
3030 Penn Avenue, First Floor
Pittsburgh, PA 15201
(M) 412-480-7835
(E) ober...@civicscience.com
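
For concreteness, here is a minimal sketch of the "loop over all servers" fallback mentioned above. It is written in Python rather than PHP to keep it short, and it is not phpcassa code: `wait_for_keyspace_on_all_nodes` and the `describe` callable are hypothetical names, and `describe(node, keyspace)` is assumed to wrap a describe_keyspace call against one specific node, raising an exception while that node does not know the keyspace yet.

```python
import time


def wait_for_keyspace_on_all_nodes(nodes, keyspace, describe,
                                   timeout=30.0, interval=0.5,
                                   clock=time.monotonic, sleep=time.sleep):
    """Block until every node reports `keyspace`, or raise TimeoutError.

    `describe(node, keyspace)` is a hypothetical, caller-supplied callable.
    It should return normally once the node knows the keyspace and raise
    an exception otherwise.
    """
    pending = set(nodes)              # nodes that have not confirmed yet
    deadline = clock() + timeout

    while pending:
        for node in list(pending):
            try:
                describe(node, keyspace)
            except Exception:
                continue              # schema change not visible on this node yet
            pending.discard(node)     # this node sees the keyspace

        if not pending:
            break
        if clock() >= deadline:
            raise TimeoutError(
                f"{keyspace} not visible on: {sorted(pending)}")
        sleep(interval)               # back off before polling again
```

The point of the sketch is that setup() only returns once each node has confirmed the keyspace individually, instead of trusting whichever connection the pool happens to hand out. The timeout keeps a stuck node from hanging the test run forever.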