Hi Pavel,

I'm not an expert on pool sizing, but you can tune it to your needs depending on your average key size and number of nodes. Regarding the client, a single shared client instance will suffice. There is a retrier parameter that sets how many times the operation is retried before an exception is returned to you (3 by default), and there is a timeout on acquiring a connection. This is an example config:

The pool size here is for a 4-node cluster, guessing at roughly 8 Erlang threads per node so that the Riak nodes can do other things too; remember they have to sync their data between the nodes:

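host = your balancer host
port = your balancer port

import com.basho.riak.client.IRiakClient;
import com.basho.riak.client.RiakFactory;
import com.basho.riak.client.raw.pbc.PBClientConfig;

// 32 pooled connections (4 nodes x 8), 5 second connection timeout
final PBClientConfig clientConfig = new PBClientConfig.Builder()
    .withHost(host)
    .withPort(port)
    .withPoolSize(32)
    .withConnectionTimeoutMillis(5000)
    .build();
final IRiakClient riakClient = RiakFactory.newClient(clientConfig);
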
We have that running with no issues. The pool size depends on your needs and data size: you could run with a pool size of 50 to 100 if your keys are really small, but you will have to try your own values.
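
To make the three knobs mentioned above concrete (pool size, timeout on acquiring a connection, retry count), here is a minimal sketch in plain Java. It is not the Riak client's implementation: BoundedPool, poolSize and execute are hypothetical names, and a Semaphore stands in for the real connection pool.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical stand-in for a pooled client; this is NOT the Riak client API.
final class BoundedPool {
    private final Semaphore permits;          // one permit per pooled "connection"
    private final long acquireTimeoutMillis;  // how long a caller waits for a free permit

    BoundedPool(int poolSize, long acquireTimeoutMillis) {
        this.permits = new Semaphore(poolSize);
        this.acquireTimeoutMillis = acquireTimeoutMillis;
    }

    // Rule of thumb from the message above: node count times connections per node.
    static int poolSize(int nodes, int connectionsPerNode) {
        return nodes * connectionsPerNode;
    }

    // Runs the operation while holding one permit, retrying up to maxAttempts times.
    <T> T execute(Callable<T> operation, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (!permits.tryAcquire(acquireTimeoutMillis, TimeUnit.MILLISECONDS)) {
                throw new TimeoutException("no free connection within "
                        + acquireTimeoutMillis + " ms");
            }
            try {
                return operation.call();
            } catch (Exception e) {
                last = e;            // remember the failure and try again
            } finally {
                permits.release();   // always hand the permit back
            }
        }
        throw last;                  // every attempt failed
    }
}
```

With the numbers from this message, BoundedPool.poolSize(4, 8) gives 32, and new BoundedPool(32, 5000).execute(op, 3) would mirror a pool of 32 with a 5 second acquire timeout and 3 attempts. Whether the real client counts its first try among the 3 is something to check against its documentation.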

Regards,

Guido.

On 11/10/12 08:40, Pavel Kogan wrote:
Thanks Guido, Pawel,

I will try using HAProxy and holding N concurrent connections on the client side.
I want to clarify a few points about concurrent connections for myself:
1) What is a reasonable limit on concurrent connections?
2) Does "concurrent connections" mean separately created pbc clients or a single shared pbc client?
3) Will a connection time out if no requests are made for some period?

Pavel
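
On question 2, the thread's answer is a single shared client. A minimal sketch of that pattern in plain Java follows, assuming the client is thread safe as stated elsewhere in this thread. FakeClient and SharedClientDemo are hypothetical stand-ins, not Riak classes; the fake client only counts how often it is built and used.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo of one client instance shared by many worker threads.
final class SharedClientDemo {
    static final AtomicInteger clientsCreated = new AtomicInteger();

    // Stand-in for a thread-safe client; it only counts requests.
    static final class FakeClient {
        private final AtomicInteger requests = new AtomicInteger();

        FakeClient() {
            clientsCreated.incrementAndGet();
        }

        void store(String key, String value) {
            requests.incrementAndGet();
        }

        int requestCount() {
            return requests.get();
        }
    }

    // Sends `requests` stores from `threads` workers that all reuse one client.
    static int runWithSharedClient(int threads, int requests) throws InterruptedException {
        FakeClient client = new FakeClient(); // built once, not once per request
        ExecutorService workers = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < requests; i++) {
            final int n = i;
            workers.execute(() -> client.store("key-" + n, "value-" + n));
        }
        workers.shutdown();
        if (!workers.awaitTermination(10, TimeUnit.SECONDS)) {
            throw new IllegalStateException("workers did not finish in time");
        }
        return client.requestCount();
    }
}
```

Calling SharedClientDemo.runWithSharedClient(50, 1000) pushes 1000 requests through 50 threads while constructing the client exactly once, which is the opposite of creating and shutting down a client per request.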

On Wed, Oct 10, 2012 at 8:57 PM, Guido Medina <guido.med...@temetra.com <mailto:guido.med...@temetra.com>> wrote:

    From that perspective, for now it is better to treat the client as
    you would treat a JDBC DataSource pool. The tidy-up comes when
    connecting the client: whether you have one node or many, the
    client will behave better if it has no knowledge of what's going
    on at the cluster side. Of course, that's as of 1.0.6, so that
    might change.

    He could try connecting to one node with a pool of 8 to 16
    concurrent connections and start from there. Then, when talking to
    a cluster, he needs the balancer in the middle. The main reason is
    that Riak expects you to connect to all nodes (it will simply
    behave better); otherwise one node will be overloaded and give
    you IOExceptions from time to time.

    Hope that helps,

    Guido.


    On 10/10/12 19:24, kamiseq wrote:

        OK, you have a point here. On the other hand, I think Pavel is
        looking for some guidance on how to improve performance on the
        client side, so he can be sure he is not wasting time on
        something. This may be premature optimization, but it may also
        be a good way to understand the library and enter the new
        world of Riak.

        Regards,
        Paweł Kamiński

        kami...@gmail.com <mailto:kami...@gmail.com>
        pkaminski....@gmail.com <mailto:pkaminski....@gmail.com>
        ______________________


        On 10 October 2012 17:30, Guido Medina
        <guido.med...@temetra.com <mailto:guido.med...@temetra.com>>
        wrote:

            In fact, with more nodes you might be surprised that it
            might be faster... see my point? Riak is a lot of things.
            First you have to be aware of the hashing, the hashmap,
            how a key gets copied to different nodes, how one or more
            nodes are responsible for a key, etc., so it is not that
            simple.


            On 10/10/12 16:28, Guido Medina wrote:

            That's why I keep pushing one answer: Riak is not meant to
            run as a single node. You are removing the external factors
            and the CAP settings you will be using, and it won't be
            linear. You could get the same results with RW=2 on 3, 4
            and 5 nodes. Several factors will influence your benchmark.
            I would start with 3 nodes and go up to 5, altering those
            numbers; then you could end up with a formula which, I
            assure you, won't be linear.

            Regards,

            Guido.

            On 10/10/12 16:19, Pavel Kogan wrote:

            I understand that load balancing is the final solution, but
            I want to benchmark a single node.
            If I knew that I could load a single node with N requests /
            sec, I could assume that after load balancing over 5 nodes
            my throughput limit would increase linearly.

            Pavel

            On Wed, Oct 10, 2012 at 2:51 PM, Guido Medina
            <guido.med...@temetra.com <mailto:guido.med...@temetra.com>>
            wrote:

                The answer is there: create a client config with N
                pooled connections to your load balancer, whatever you
                are using. I know HAProxy supports the PBC config (TCP
                based), which is faster than the HTTP client, hence my
                recommendation.

                Say, a non-clustered client config with N connections
                to balancer_host at 8087, and your balancer_host
                connected to EACH node. That's the way to go; the rest
                is about the CAP level you want to support, which will
                impact your performance vs. integrity. Up to you.

                CAP doc:
                http://docs.basho.com/riak/latest/tutorials/fast-track/Tunable-CAP-Controls-in-Riak/

                Guido.


                On 10/10/12 13:33, Pavel Kogan wrote:

                Hi,

                The node is OK and not down.
                I have a way to do load balancing externally to the
                Java client.
                I am evaluating Riak for use in my company and want to
                measure the maximal throughput against a single node.

                Thanks,
                    Pavel

                On Wed, Oct 10, 2012 at 2:13 PM, Guido Medina
                <guido.med...@temetra.com
                <mailto:guido.med...@temetra.com>>
                wrote:

                    That question has been answered a few times; here
                    is my old answer:

                    Hi,

                    It is the Java client which, to be honest, doesn't
                    handle one node going down well. So, for example,
                    in my company we use HAProxy for that. Here is a
                    starting configuration:
                    https://gist.github.com/1507077

                    Once we switched to HAProxy we just use a simple
                    client without cluster config, so the Java client
                    doesn't know anything about the load balancing
                    going on. It works well: I can upgrade and restart
                    servers without our Java application complaining.

                    Regards,

                    Guido.


                    On 10/10/12 12:58, Pavel Kogan wrote:

                    Thanks,

                    I will try this solution.

                    Pavel

                    On Wed, Oct 10, 2012 at 1:51 PM, kamiseq
                    <kami...@gmail.com <mailto:kami...@gmail.com>> wrote:

                        Well, I asked the same question a few days ago
                        (maybe 2 weeks ago) and the answer was that
                        yes, sharing the client is thread safe, and all
                        you should do is create a new bucket instance
                        on every request.

                        Regards,
                        Paweł Kamiński

                        kami...@gmail.com <mailto:kami...@gmail.com>
                        pkaminski....@gmail.com
                        <mailto:pkaminski....@gmail.com>
                        ______________________


                        On 10 October 2012 09:25, Pavel Kogan
                        <pavel.ko...@cortica.com
                        <mailto:pavel.ko...@cortica.com>> wrote:

                            1) Is it OK to share a single pbc client
                            object between 50 threads? Should it be
                            protected by a lock?
                            2) I haven't done load balancing between
                            nodes yet, because I want to better
                            understand the throughput limit. I am
                            planning to do it for much higher
                            throughput.

                            Pavel


                            On Wed, Oct 10, 2012 at 9:21 AM, kamiseq
                            <kami...@gmail.com
                            <mailto:kami...@gmail.com>> wrote:

                                Maybe a good start is to share the
                                pbclient object and only create a
                                bucket per request; you will save a few
                                steps on client configuration.
                                Have you tried balancing requests to
                                the cluster and distributing them over
                                all nodes?

                                Regards,
                                Paweł Kamiński

                                kami...@gmail.com
                                <mailto:kami...@gmail.com>
                                pkaminski....@gmail.com
                                <mailto:pkaminski....@gmail.com>
                                ______________________


                                On 10 October 2012 06:18, Pavel Kogan
                                <pavel.ko...@cortica.com
                                <mailto:pavel.ko...@cortica.com>>
                                wrote:

                                    Hi all,

                                    I have a Riak cluster consisting of
                                    5 nodes that contains about 30
                                    million keys (35% of capacity
                                    according to Riak Control).
                                    Currently we have a single Java
                                    client reading and writing records
                                    to the same node. I need some tips
                                    on how to use the client
                                    efficiently to reach maximal
                                    throughput. I would like to be able
                                    to read/write up to 100 records/sec
                                    on a 1Gbit network. Currently I get
                                    a lot of Java socket exceptions
                                    after a while (even at the much
                                    slower rate of 10 records/sec),
                                    after which I need to restart the
                                    client and the node.

                                    Thanks,
                                        Pavel

                                    P.S: My client uses 50 threads, and
                                    a pbc client is created and shut
                                    down per request.

                                    
_______________________________________________
                                    riak-users mailing list
                                    riak-users@lists.basho.com
                                    <mailto:riak-users@lists.basho.com>
                                    
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com





