Re: Long delays when trying to recover from errors in Java client at startup

Magnus Kessler Mon, 22 May 2017 00:15:45 -0700

On 22 May 2017 at 07:02, Toby Corkindale <t...@dryft.net> wrote:

> Hi,
> I've been trying to make a JVM-based app have better error recovery when
> the Riak cluster is still in a starting-up state.
> I have a fairly naive wait-loop that tries to connect and list buckets,
> and if there's an exception, retry again after a short delay.
>
> However once the Riak cluster comes good, the java client hangs on the
> first operation it makes, for a really long time. Minutes.
>  -- in particular, at
> com.basho.riak.client.core.RiakCluster.retryOperation(RiakCluster.java:479)
>
> I've tried shutting down and recreating the RiakClient between attempts,
> but this doesn't seem to help.
> I guess the node manager has its own back-offs and delays.. Is there a way
> to reduce these timeouts?
>
> Thanks,
> Toby
>
>
Hi Toby,


Using bucket listing to determine liveness is a really bad idea. Bucket
listing, just like key listing, requires a coverage query across ALL objects
stored in the cluster, and will take a really long time if the cluster
contains many objects.

A better alternative would be to have a canary object with a known key
that can be read quickly.
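
As a rough illustration of such a wait loop, here is a minimal sketch in
plain Java. It is not part of the Riak Java client: the class name
`ReadinessWait`, the method `waitUntilReady` and all of its parameters are
hypothetical. The probe is passed in as a `Callable`, and each attempt gets
its own timeout so that a single slow call cannot stall the loop for minutes.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper for illustration; not part of the Riak Java client. */
final class ReadinessWait {

    /**
     * Runs the probe up to maxAttempts times, pausing delayMs between
     * attempts. An attempt that takes longer than attemptTimeoutMs is
     * abandoned. Returns true as soon as the probe returns true, and
     * false if every attempt fails, times out, or the caller is interrupted.
     */
    static boolean waitUntilReady(Callable<Boolean> probe, int maxAttempts,
                                  long attemptTimeoutMs, long delayMs) {
        // Daemon threads, so an abandoned probe cannot keep the JVM alive.
        ExecutorService executor = Executors.newCachedThreadPool(r -> {
            Thread t = new Thread(r, "readiness-probe");
            t.setDaemon(true);
            return t;
        });
        try {
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                Future<Boolean> result = executor.submit(probe);
                try {
                    Boolean ready = result.get(attemptTimeoutMs, TimeUnit.MILLISECONDS);
                    if (Boolean.TRUE.equals(ready)) {
                        return true;
                    }
                } catch (TimeoutException e) {
                    result.cancel(true); // interrupt the slow probe and move on
                } catch (ExecutionException e) {
                    // The probe threw an exception: treat as "not ready yet".
                }
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMs);
                }
            }
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return false;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

In a real application the probe would be a read of the canary key, for
example a `FetchValue` executed through the client (assuming the 2.x Java
client), returning true when the object comes back. Something like
`ReadinessWait.waitUntilReady(probe, 30, 2000, 1000)` would then give up
after roughly a minute and a half at most.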

In startup scripts that need to wait until Riak KV is operational, we
recommend using `riak-admin wait-for-service riak_kv`.

Kind Regards,

Magnus

-- 
Magnus Kessler
Client Services Engineer
Basho Technologies Limited

Registered Office - 8 Lincoln’s Inn Fields London WC2A 3BP Reg 07970431

_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
