Re: yet another benchmark bottleneck

Michael Burman Mon, 12 Mar 2018 07:43:56 -0700

Although the update rate is low, it's possible that you've hit a contention bug. A simple test would be to run multiple Cassandra instances on the same physical node (for example, split your 20 cores across 5 instances of Cassandra). If you get much higher throughput, then you have your answer.

I don't think a single-instance Cassandra 3.11.2 scales to 20 cores (at least with the stress-test pattern). There are a few known issues, at least in the write path, that prevent scaling with a high CPU core count.
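
One way to lay out the test suggested above is to give each instance its own contiguous block of cores. This is a minimal sketch, assuming the cores are numbered 0 to 19 and that each instance is then pinned with a tool such as `taskset -c`; the pinning tool and the helper names `split_cores` and `cpu_list` are illustrative assumptions, not something stated in the thread.

```python
def split_cores(total_cores, instances):
    """Partition core IDs 0..total_cores-1 into contiguous, near-equal groups."""
    base, extra = divmod(total_cores, instances)
    groups = []
    start = 0
    for i in range(instances):
        # The first `extra` groups get one additional core when the split is uneven.
        size = base + (1 if i < extra else 0)
        groups.append(list(range(start, start + size)))
        start += size
    return groups


def cpu_list(group):
    """Format a group of core IDs as a comma-separated CPU list."""
    return ",".join(str(core) for core in group)


# The thread's example: 20 cores shared by 5 Cassandra instances.
for index, group in enumerate(split_cores(20, 5)):
    print(f"instance {index}: cores {cpu_list(group)}")
```

If the summed throughput of the pinned instances is much higher than that of a single instance, that would point toward contention inside one process rather than a hardware limit.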


  - Micke


On 03/12/2018 03:14 PM, onmstester onmstester wrote:

I mentioned that I already tested increasing client threads, many stress-client instances on one node, and two stress clients on two separate nodes; in all of them the sum of throughputs is less than 130K. I've been tuning all aspects of the OS and Cassandra (whatever I've seen in config files!) for two days, still no luck!


Sent using Zoho Mail <https://www.zoho.com/mail/>

---- On Mon, 12 Mar 2018 16:38:22 +0330 *Jacques-Henri Berthemet <jacques-henri.berthe...@genesys.com>* wrote ----


    What happens if you increase number of client threads?

    Can you add another instance of cassandra-stress on another host?


    *--*

    *Jacques-Henri Berthemet*


    *From:* onmstester onmstester [mailto:onmstes...@zoho.com
    <mailto:onmstes...@zoho.com>]
    *Sent:* Monday, March 12, 2018 12:50 PM
    *To:* user <user@cassandra.apache.org
    <mailto:user@cassandra.apache.org>>
    *Subject:* RE: yet another benchmark bottleneck


    no luck even with 320 threads for write





    ---- On Mon, 12 Mar 2018 14:44:15 +0330 *Jacques-Henri Berthemet
    <jacques-henri.berthe...@genesys.com
    <mailto:jacques-henri.berthe...@genesys.com>>* wrote ----


        It makes more sense now, 130K is not that bad.


        According to cassandra.yaml you should be able to increase
        your number of write threads in Cassandra:

        # On the other hand, since writes are almost never IO bound, the ideal
        # number of "concurrent_writes" is dependent on the number of cores in
        # your system; (8 * number_of_cores) is a good rule of thumb.
        concurrent_reads: 32
        concurrent_writes: 32
        concurrent_counter_writes: 32


        Jumping directly to 160 would be a bit high with spinning
        disks, maybe start with 64 just to see if it gets better.
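
The arithmetic behind the suggestion above can be sketched as follows. The rule of thumb (8 * number_of_cores) and the values 32, 64 and 160 come from the message; the doubling schedule between them and the function names are assumptions made for illustration only.

```python
def rule_of_thumb_concurrent_writes(cores, per_core=8):
    """cassandra.yaml rule of thumb: 8 * number_of_cores."""
    return per_core * cores


def ramp(start, ceiling):
    """Hypothetical test schedule: double from `start`, finishing at `ceiling`."""
    values = []
    value = start
    while value < ceiling:
        values.append(value)
        value *= 2
    values.append(ceiling)
    return values


ceiling = rule_of_thumb_concurrent_writes(20)  # 20 cores, as in the original post
print(ceiling)
print(ramp(32, ceiling))  # default of 32 up to the rule-of-thumb ceiling
```

Stepping through intermediate values, rather than jumping straight to the ceiling, makes it easier to see where throughput stops improving.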


        *--*

        *Jacques-Henri Berthemet*


        *From:*onmstester onmstester [mailto:onmstes...@zoho.com
        <mailto:onmstes...@zoho.com>]

        *Sent:*Monday, March 12, 2018 12:08 PM

        *To:*user <user@cassandra.apache.org
        <mailto:user@cassandra.apache.org>>

        *Subject:*RE: yet another benchmark bottleneck


        RF=1

        No errors or warnings.

        Actually it's 300 Mbit/s and 130K op/s. I missed a 'K' in the
        first mail, but anyway, the point is: more than half of the
        node's resources (CPU, memory, disk, network) are unused and I
        can't increase write throughput.
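
As a sanity check on the corrected figures, the implied network cost per operation and the share of the 10 Gb link in use can be worked out directly. This is a rough sketch that assumes decimal units (1 Mbit = 10^6 bits) and that all measured traffic belongs to the stress writes; the function names are illustrative.

```python
def bytes_per_op(bits_per_second, ops_per_second):
    """Average bytes on the wire per operation."""
    return bits_per_second / 8 / ops_per_second


def link_utilization(used_bits_per_second, link_bits_per_second):
    """Fraction of the link capacity in use."""
    return used_bits_per_second / link_bits_per_second


used = 300e6   # 300 Mbit/s, as reported
ops = 130_000  # 130K op/s, as reported
link = 10e9    # 10 Gb link from the node spec

print(f"{bytes_per_op(used, ops):.1f} bytes per operation")
print(f"{link_utilization(used, link):.1%} of the link in use")
```

A few hundred bytes per write and a link that is only a few percent busy suggest that the network is not what limits throughput here.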





        ---- On Mon, 12 Mar 2018 14:25:12 +0330 *Jacques-Henri
        Berthemet <jacques-henri.berthe...@genesys.com
        <mailto:jacques-henri.berthe...@genesys.com>>* wrote ----


            Any errors/warnings in Cassandra logs? What’s your RF?

            Using 300MB/s of network bandwidth for only 130 op/s looks
            very high.


            *--*

            *Jacques-Henri Berthemet*


            *From:*onmstester onmstester [mailto:onmstes...@zoho.com
            <mailto:onmstes...@zoho.com>]

            *Sent:*Monday, March 12, 2018 11:38 AM

            *To:*user <user@cassandra.apache.org
            <mailto:user@cassandra.apache.org>>

            *Subject:*RE: yet another benchmark bottleneck


            1.2 TB 15K

            Latency reported by the stress tool is 7.6 ms. Disk latency
            is 2.6 ms.





            ---- On Mon, 12 Mar 2018 14:02:29 +0330 *Jacques-Henri
            Berthemet <jacques-henri.berthe...@genesys.com
            <mailto:jacques-henri.berthe...@genesys.com>>* wrote ----


                What’s your disk latency? What kind of disk is it?


                *--*

                *Jacques-Henri Berthemet*


                *From:*onmstester onmstester
                [mailto:onmstes...@zoho.com <mailto:onmstes...@zoho.com>]

                *Sent:*Monday, March 12, 2018 10:48 AM

                *To:*user <user@cassandra.apache.org
                <mailto:user@cassandra.apache.org>>

                *Subject:*Re: yet another benchmark bottleneck


                Running two instances of Apache Cassandra on the same
                server, each having its own commit log disk, did not
                help. The sum of CPU/RAM usage for both instances is
                less than half of all available resources. Disk usage
                is less than 20% and network is still less than 300 Mb
                in Rx.





                ---- On Mon, 12 Mar 2018 09:34:26 +0330 *onmstester
                onmstester <onmstes...@zoho.com
                <mailto:onmstes...@zoho.com>>* wrote ----


                    Apache-cassandra-3.11.1

                    Yes, I'm doing a single-host test.





                    ---- On Mon, 12 Mar 2018 09:24:04 +0330 *Jeff
                    Jirsa <jji...@gmail.com
                    <mailto:jji...@gmail.com>>* wrote ----




                        Would help to know your version. 130
                        ops/second sounds like a ridiculously low
                        rate. Are you doing a single host test?


                        On Sun, Mar 11, 2018 at 10:44 PM, onmstester
                        onmstester <onmstes...@zoho.com
                        <mailto:onmstes...@zoho.com>> wrote:



                            I'm going to benchmark Cassandra's write
                            throughput on a node with the following spec:

                              * CPU: 20 cores
                              * Memory: 128 GB (32 GB as Cassandra heap)
                              * Disk: 3 separate disks for OS, data and
                                commitlog
                              * Network: 10 Gb (tested with iperf)
                              * OS: Ubuntu 16


                            Running Cassandra-stress:

                            cassandra-stress write n=1000000 -rate threads=1000 -mode native cql3 -node X.X.X.X


                            From two nodes with the same spec as above,
                            I cannot get more than 130 Op/s throughput.
                            The clients are using less than 50% of
                            CPU; the Cassandra node uses:

                              * 60% of cpu
                              * 30% of memory
                              * 30-40% util in iostat of commitlog
                              * 300 Mb of network bandwidth

                            I suspect the network, because no matter how
                            many clients I run, Cassandra always uses
                            less than 300 Mb. I've done all the tuning
                            mentioned by DataStax.

                            Increasing wmem_max and rmem_max did not
                            help either.





