Final reason for the problem: one node's config had its RPC server type (rpc_server_type) changed from sync to hsha...
So that mismatch can break RPC across the cluster, apparently. It would be nice if there were a good way to set that in a single spot for the cluster, or to handle the mismatch differently. Otherwise, if you wanted to change from sync to hsha in a cluster you'd have to restart the entire cluster (not a big deal), but CQL would apparently not work at all until all of your nodes had been restarted.

On Fri, Mar 29, 2013 at 10:35 AM, David McNelis <dmcne...@gmail.com> wrote:

> Appears that restarting a node makes CQL available on that node again, but
> only that node.
>
> Looks like I'll be doing a rolling restart.
>
>
> On Fri, Mar 29, 2013 at 10:26 AM, David McNelis <dmcne...@gmail.com> wrote:
>
>> I'm running 1.2.3 and have both CQL3 tables and old school style CFs in
>> my cluster.
>>
>> I'd had a large insert job running the last several days which just
>> ended... it had been inserting using cql3 insert statements into a cql3
>> table.
>>
>> Now, I show no compactions going on in my cluster, but for some reason any
>> cql3 query I try to execute (insert, select), through cqlsh or through an
>> external library, times out with an rpc_timeout.
>>
>> If I use cassandra-cli, I can do "list tablename limit 10" and
>> immediately get my 10 rows back.
>>
>> However, if I do "select * from tablename limit 10" I get the rpc timeout
>> error. Same table, same server. It doesn't seem to matter if I'm hitting
>> a cql3 defined table or an older style one.
>>
>> Load on the nodes is relatively low at the moment.
>>
>> Any suggestions short of restarting nodes? This is a pretty major issue
>> for us right now.
>>
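
For anyone who wants to catch this kind of drift before it bites: below is a minimal sketch, not anything Cassandra ships, that compares the rpc_server_type line across copies of each node's cassandra.yaml and reports the nodes that disagree with the majority. The function names, the node names, and the idea of collecting the config text into a dict are all assumptions made for illustration; only the rpc_server_type key and its sync/hsha values come from the thread above. It uses a plain regex rather than a YAML parser, so it only looks at a simple uncommented "key: value" line.

```python
import re
from collections import Counter

# Matches an uncommented "rpc_server_type: <value>" line in cassandra.yaml text.
_RPC_TYPE = re.compile(r"^[ \t]*rpc_server_type[ \t]*:[ \t]*(\S+)", re.MULTILINE)


def rpc_server_type(yaml_text):
    """Return the configured rpc_server_type, or None if the key is not set."""
    match = _RPC_TYPE.search(yaml_text)
    if match is None:
        return None
    # Tolerate optional quoting around the value.
    return match.group(1).strip("'\"")


def find_mismatches(configs):
    """Given {node name: cassandra.yaml text}, return the nodes whose
    rpc_server_type differs from the most common value in the cluster."""
    types = {node: rpc_server_type(text) for node, text in configs.items()}
    if not types:
        return {}
    majority, _count = Counter(types.values()).most_common(1)[0]
    return {node: value for node, value in types.items() if value != majority}


if __name__ == "__main__":
    # Hypothetical three-node cluster; node3 was switched to hsha.
    configs = {
        "node1": "cluster_name: demo\nrpc_server_type: sync\n",
        "node2": "rpc_server_type: sync\n",
        "node3": "# rpc_server_type: sync\nrpc_server_type: hsha\n",
    }
    print(find_mismatches(configs))  # prints {'node3': 'hsha'}
```

How the config text gets collected (scp, a config management tool, and so on) is left out on purpose; the point is only that the comparison itself is cheap to automate.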