I will continue this issue here: http://groups.google.com/group/scale7/browse_thread/thread/dd74f1d6265ae2e7
Thanks

On Tue, Feb 8, 2011 at 7:44 AM, Dan Washusen <d...@reactive.org> wrote:
> Hi,
> I've added some comments and questions inline.
>
> Cheers,
> Dan
> On 8 February 2011 10:00, Jonathan Ellis <jbel...@gmail.com> wrote:
>>
>> On Mon, Feb 7, 2011 at 1:51 AM, TSANG Yiu Wing <ywts...@gmail.com> wrote:
>> > cassandra version: 0.7
>> >
>> > client library: scale7-pelops / 1.0-RC1-0.7.0-SNAPSHOT
>> >
>> > cluster: 3 machines (A, B, C)
>> >
>> > details:
>> > it works perfectly when all 3 machines are up and running
>> >
>> > but if the seed machine is down, the problems happen:
>> >
>> > 1) new client connection cannot be established
>>
>> sounds like pelops relies on the seed node to introduce it to the
>> cluster. you should configure it either with a hardcoded list of
>> nodes or use something like RRDNS instead. I don't use pelops so I
>> can't help other than that. (I believe there is a mailing list for
>> Pelops though.)
>
> When dynamic node discovery is turned on (off by default) it doesn't
> (shouldn't) rely on the initial seed node once past initialization. So
> either make sure you have dynamic node discovery turned on or seed Pelops
> with all nodes in your cluster...
> It would be helpful if you provided more information about the errors you're
> seeing preferably with debug level logging turned on.
>
>>
>> > 2) if a client keeps connecting to and operating at (issue get and
>> > update) the cluster, when the seed is down, the working client will
>> > throw exception upon the next operation
>>
>> I know Hector supports transparent failover to another Cassandra node.
>> Perhaps Pelops does not.
>
> Pelops will validate connections at a configurable period (60 seconds by
> default) and remove them from the pool. Pelops will also retry the
> operation three times (configurable) against a different node in the pool
> each time.
> If you want Pelops to take more aggressive actions when it detects downed
> nodes then check out
> org.scale7.cassandra.pelops.pool.CommonsBackedPool.INodeSuspensionStrategy.
>
>>
>> > 3) using cassandra-cli to connect the remaining nodes in the cluster,
>> > "Internal error processing get_range_slices" will happen when querying
>> > column family
>>
>> list <cf>;
>>
>> Cassandra always logs the cause of internal errors in system.log, so
>> you should look there.
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of DataStax, the source for professional Cassandra support
>> http://www.datastax.com
>
>
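
For anyone following the thread, here is a rough sketch of the behaviour Dan describes above: seed the client with every node (A, B, C) and retry a failed operation against a different node each time, up to three attempts. This is only an illustration of the idea and is not Pelops code. `FailoverSketch`, `executeWithFailover` and `NoNodeAvailableException` are hypothetical names made up for this example, and the "operation" is a plain function standing in for a real Thrift call.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

/** Hypothetical illustration only; none of these names are Pelops APIs. */
final class FailoverSketch {

    /** Thrown when every attempted node failed. */
    static final class NoNodeAvailableException extends RuntimeException {
        NoNodeAvailableException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /**
     * Runs the operation against the nodes in list order, moving on to the
     * next node whenever an attempt throws. At most maxAttempts distinct
     * nodes are tried.
     */
    static <T> T executeWithFailover(List<String> nodes,
                                     int maxAttempts,
                                     Function<String, T> operation) {
        if (nodes.isEmpty() || maxAttempts < 1) {
            throw new IllegalArgumentException("need at least one node and one attempt");
        }
        int attempts = Math.min(maxAttempts, nodes.size());
        RuntimeException lastFailure = null;
        for (int i = 0; i < attempts; i++) {
            String node = nodes.get(i);
            try {
                return operation.apply(node);
            } catch (RuntimeException e) {
                // Remember the failure and fall through to the next node.
                lastFailure = e;
            }
        }
        throw new NoNodeAvailableException(
                "all " + attempts + " attempts failed", lastFailure);
    }

    public static void main(String[] args) {
        // The client knows all three machines, not just the seed (A).
        List<String> nodes = Arrays.asList("A", "B", "C");

        // Simulate the seed machine being down: any call to A fails.
        String result = executeWithFailover(nodes, 3, node -> {
            if (node.equals("A")) {
                throw new IllegalStateException("node A is down");
            }
            return "read served by " + node;
        });
        System.out.println(result); // prints "read served by B"
    }
}
```

The point of the sketch is that the client can only fail over if it knows about more than the seed node. With a node list of just A, the same call would throw as soon as A went down, which matches problems 1) and 2) in the original report.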