I rolled out an upgrade to a JVM app that was using rjc (riak-java-client) 1.0.5. We upgraded to 1.0.6 to take advantage of the newly added ability to do a put without preceding it with a fetch, in order to reduce operational load on the cluster. However, since rolling out this change we frequently see large rises in latency across the cluster (up to the gen_fsm limit of 60s), and we see the following in the riak logs:

[error] Unrecognized message {74392380,{error,timeout}}

This is accompanied by repeated socket timeouts as seen by the riak-java-client.

Also worth mentioning: one of our nodes got into a state in which the rjc was unable to establish a TCP connection to riak on the protobuf port on localhost. We were only able to fix this by restarting the riak process on that node, which induced a fair amount of handoff.

Any thoughts?

Thanks,
D