I recently rolled out an upgrade to a JVM app that had been using rjc
1.0.5. We upgraded to 1.0.6 to take advantage of the newly added ability
to do a put without preceding it with a fetch, in order to reduce
operational load on the cluster. However, since rolling out this change
we frequently see large rises in latency across the cluster (up to the
gen_fsm limit of 60s) and see the following in the riak logs:

[error] Unrecognized message {74392380,{error,timeout}}

This is accompanied by repeated socket timeouts as seen by the riak-java-client.

Also worth mentioning: one of our nodes got into a state where the rjc
was unable to establish a TCP connection to riak on the protobuf port
on localhost. We were only able to fix this by restarting the riak
process on that node and inducing a fair amount of handoff.
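
For anyone who wants to check the same symptom independently of rjc, a minimal sketch of a plain TCP probe against the protobuf port is below. It assumes Riak's default protobuf port of 8087 (ours may be configured differently), and the class and method names are hypothetical. It only tests whether a TCP connection can be opened; it does not speak the protobuf protocol and is not how rjc itself connects.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PbPortProbe {

    /**
     * Returns true if a TCP connection to host:port can be opened
     * within timeoutMs milliseconds, false on refusal or timeout.
     */
    public static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Covers connection refused as well as SocketTimeoutException.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: 8087 is the default protobuf port; pass another as args[0].
        int port = args.length > 0 ? Integer.parseInt(args[0]) : 8087;
        boolean ok = canConnect("localhost", port, 2000);
        System.out.println("protobuf port " + port
                + (ok ? " reachable" : " NOT reachable"));
    }
}
```

Running this on the affected node while rjc is failing would show whether the port itself is refusing connections or whether the problem is specific to the client.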

Any thoughts?

Thanks,
D

_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
