Hi Travis,

I have done similar things using the Java client, but I will assume you have access to change the equivalent settings in the C client. Assuming you have W = 2 and N = 3, your client returns as soon as 2 writes have been acknowledged while a third write is still pending asynchronously, which will eventually build up a lot of back pressure and overload your Riak nodes.

As for concurrency, don't use more than 8 writer threads for this, and set W = N (in your case 3) for these tasks so that each write blocks until all replicas acknowledge; that way you won't build up much back pressure.

Have some sort of bounded blocking queue that holds at most (as an example) 500 tasks, with 4 to 8 writer threads consuming from that queue. In short, here is what I recommend:

 * W = N (in this case 3), so that your write operation blocks and
   doesn't leave asynchronous tasks pending on the Riak cluster.
 * Don't exceed, say, 8 writers (I have been there; more threads won't
   really help, and sometimes 4 will write faster than 8).
 * Have a bounded blocking queue (for example, max size = 500) where
   you schedule your tasks and from which the writer threads consume.

Try that and see if it helps; I'm quite certain that with such a mechanism the load will stay steady and your Riak nodes should never crash. A rough sketch of the pattern is below.
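Something along these lines (just a sketch in plain C with pthreads; QUEUE_MAX, N_WRITERS and the store_tile() stub are placeholders, so swap the stub for your actual riack put, issued with w = n so it blocks until all replicas acknowledge):

#include <pthread.h>
#include <stdio.h>

#define QUEUE_MAX 500   /* bounded queue: producer blocks when full */
#define N_WRITERS   4   /* 4 to 8 writers; more rarely helps        */

typedef struct { char key[64]; /* plus the tile bytes in real life */ } task_t;

static task_t queue[QUEUE_MAX];
static int q_head, q_tail, q_count;
static int done;                  /* producer has finished enqueuing */
static pthread_mutex_t q_lock    = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  not_full  = PTHREAD_COND_INITIALIZER;
static pthread_cond_t  not_empty = PTHREAD_COND_INITIALIZER;

/* Placeholder for the real riack put with w = dw = n. */
static void store_tile(const task_t *t) { (void)t; }

static void enqueue(const task_t *t)     /* called by the producer   */
{
    pthread_mutex_lock(&q_lock);
    while (q_count == QUEUE_MAX)         /* back pressure lives here */
        pthread_cond_wait(&not_full, &q_lock);
    queue[q_tail] = *t;
    q_tail = (q_tail + 1) % QUEUE_MAX;
    q_count++;
    pthread_cond_signal(&not_empty);
    pthread_mutex_unlock(&q_lock);
}

static void *writer(void *arg)
{
    (void)arg;
    for (;;) {
        task_t t;
        pthread_mutex_lock(&q_lock);
        while (q_count == 0 && !done)
            pthread_cond_wait(&not_empty, &q_lock);
        if (q_count == 0 && done) {      /* drained and nothing left */
            pthread_mutex_unlock(&q_lock);
            return NULL;
        }
        t = queue[q_head];
        q_head = (q_head + 1) % QUEUE_MAX;
        q_count--;
        pthread_cond_signal(&not_full);
        pthread_mutex_unlock(&q_lock);

        store_tile(&t);                  /* blocking write, w = n    */
    }
}

int main(void)
{
    pthread_t writers[N_WRITERS];
    for (int i = 0; i < N_WRITERS; i++)
        pthread_create(&writers[i], NULL, writer, NULL);

    /* Producer: walk the tiles on disk and enqueue them. */
    for (int i = 0; i < 1000; i++) {     /* stand-in for the walk    */
        task_t t;
        snprintf(t.key, sizeof t.key, "tile-%d", i);
        enqueue(&t);
    }

    pthread_mutex_lock(&q_lock);
    done = 1;
    pthread_cond_broadcast(&not_empty);
    pthread_mutex_unlock(&q_lock);

    for (int i = 0; i < N_WRITERS; i++)
        pthread_join(writers[i], NULL);
    return 0;
}

The only important property is that enqueue() blocks when the queue is full, so the back pressure stays on your side instead of piling up inside the Riak cluster.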

Best regards,

Guido.

On 06/09/16 16:42, Travis Kirstine wrote:

Thanks Alexander, we're using HAProxy

*From:*Alexander Sicular [mailto:sicul...@basho.com]
*Sent:* September-06-16 11:09 AM
*To:* Travis Kirstine <tkirst...@firstbasesolutions.com>
*Cc:* Magnus Kessler <mkess...@basho.com>; riak-users@lists.basho.com
*Subject:* Re: speeding up bulk loading

Hi Travis,

I also want to confirm that you are spreading your load amongst all nodes in the cluster. You should be connecting your C client to Riak via a proxy like nginx/HAProxy/F5 [0]. The proxy will do a round robin/least connections distribution to all nodes in the cluster. This will greatly increase performance if you are not already doing it.

-alexander

[0] http://docs.basho.com/riak/kv/2.1.4/configuring/load-balancing-proxy/
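
For reference, a minimal HAProxy stanza along those lines might look like the following. The node addresses and the protocol buffers port (8087) are assumptions about your setup, so adjust them to match your cluster; the guide linked above has fuller examples.

listen riak_pb
    bind *:8087
    mode tcp
    balance leastconn
    server riak1 10.0.1.1:8087 check
    server riak2 10.0.1.2:8087 check
    server riak3 10.0.1.3:8087 check
    server riak4 10.0.1.4:8087 check
    server riak5 10.0.1.5:8087 check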




Alexander Sicular

Solutions Architect

Basho Technologies
9175130679

@siculars

On Wed, Aug 31, 2016 at 10:41 AM, Travis Kirstine <tkirst...@firstbasesolutions.com <mailto:tkirst...@firstbasesolutions.com>> wrote:

    Magnus

    Thanks for your reply.  We're using the riack C client library for
    Riak (https://github.com/trifork/riack), which is used within an
    application called MapCache to store 256x256 px images, each under
    a corresponding key, in Riak.  Currently we have 75 million images
    to transfer from disk into Riak, and the transfer is being done
    concurrently.  Periodically this transfer process will crash.

    Riak is set up with n=3 on 5 nodes with a leveldb backend.  Each
    server has 45GB of memory and 16 cores with standard hard drives.
    We made no significant modifications to riak.conf except raising
    leveldb.maximum_memory.percent to 70 and tweaking sysctl.conf as
    follows:

    vm.swappiness = 0
    net.ipv4.tcp_max_syn_backlog = 40000
    net.core.somaxconn = 40000
    net.core.wmem_default = 8388608
    net.core.rmem_default = 8388608
    net.ipv4.tcp_sack = 1
    net.ipv4.tcp_window_scaling = 1
    net.ipv4.tcp_fin_timeout = 15
    net.ipv4.tcp_keepalive_intvl = 30
    net.ipv4.tcp_tw_reuse = 1
    net.ipv4.tcp_moderate_rcvbuf = 1

    # Increase the open file limit
    # fs.file-max = 65536 # current setting
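
    The open file limit is untouched so far.  If it turns out to
    matter, the usual place to raise it would be entries like the
    following in /etc/security/limits.conf (example values only, for
    whichever user Riak runs as):

    # example values only; align with the open-files guidance
    # for your Riak version
    riak  soft  nofile  65536
    riak  hard  nofile  200000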

    I have seen this error in the logs

    2016-08-30 22:26:07.180 [error] <0.20777.512> CRASH REPORT Process
    <0.20777.512> with 0 neighbours crashed with reason: no function
    clause matching
    webmachine_request:peer_from_peername({error,enotconn},
    {webmachine_request,{wm_reqstate,#Port<0.2817336>,[],undefined,undefined,undefined,{wm_reqdata,'GET',...},...}})
    line 150

    Regards

    *From:*Magnus Kessler [mailto:mkess...@basho.com
    <mailto:mkess...@basho.com>]
    *Sent:* August-31-16 4:08 AM
    *To:* Travis Kirstine <tkirst...@firstbasesolutions.com
    <mailto:tkirst...@firstbasesolutions.com>>
    *Cc:* riak-users@lists.basho.com <mailto:riak-users@lists.basho.com>
    *Subject:* Re: speeding up bulk loading

    On 26 August 2016 at 22:20, Travis Kirstine
    <tkirst...@firstbasesolutions.com
    <mailto:tkirst...@firstbasesolutions.com>> wrote:

        Is there any way to speed up bulk loading?  I'm wondering if I
        should be tweaking the erlang, aae or other config options?

    Hi Travis,

    Excuse the late reply; your message had been stuck in the
    moderation queue. Please consider subscribing to this list.

    Without knowing more about how you perform bulk uploads, it's
    difficult to recommend any changes. Are you using the HTTP REST
    API or one of the client libraries, which use protocol buffers by
    default? What concerns do you have about the upload performance?
    Please let us know a bit more about your setup.

    Kind Regards,

    Magnus

--
    Magnus Kessler

    Client Services Engineer

    Basho Technologies Limited

    Registered Office - 8 Lincoln’s Inn Fields London WC2A 3BP Reg
    07970431


    _______________________________________________
    riak-users mailing list
    riak-users@lists.basho.com <mailto:riak-users@lists.basho.com>
    http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


