Can you provide the Python script that you're using?

(I'm moving this thread to the pycassa mailing list,
pycassa-disc...@googlegroups.com, which is a better place for this
discussion.)


On Thu, Jan 31, 2013 at 6:25 PM, Pradeep Kumar Mantha
<pradeep...@gmail.com> wrote:

> Hi,
>
> I am trying to benchmark Cassandra on a 12-data-node cluster using 16
> clients (each client uses 32 threads), with a custom pycassa client and YCSB.
>
> I found that the maximum throughput achieved using the pycassa
> client is about 70k reads/second,
> whereas with YCSB it is ~120k reads/second.
>
> Any thoughts on why I see this huge difference in performance?
>
>
> Here is a description of the setup.
>
> Pycassa client (a simple Python script).
> 1. Each pycassa client starts 4 threads, and each thread issues 76896
> queries.
> 2. A shell script uses the Unix taskset command to launch 4 threads per
> core on a single 8-core node (8 * 4 * 76896 queries).
> 3. Another shell script scales the single-node shell script to
> 16 nodes (total queries now: 16 * 8 * 4 * 76896).
>
> I tried to keep the YCSB configuration as similar as possible to my custom
> pycassa benchmarking setup.
>
> YCSB -
>
> I launched 16 YCSB clients on 16 nodes, where each client uses 32 threads for
> execution and needs to query (32 * 76896) keys, i.e. 100% reads.
>
> The dataset is different in each case, but has
>
> 1. the same total number of records.
> 2. the same number of fields.
> 3. almost the same field length.
>
> Could you please let me know why I see this huge performance difference,
> and whether there is any way I can improve the operations/second using the
> pycassa client?
>
> Thanks,
> Pradeep
>
>
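
As a quick sanity check on the numbers quoted above, here is a rough
sketch of the arithmetic. It uses only the figures from the message; the
function and variable names are made up for illustration, and the
per-thread rates assume the load is spread evenly over all threads, which
the original message does not state.

```python
# Figures taken from the quoted message; all names here are illustrative only.
QUERIES_PER_THREAD = 76896
NODES = 16


def total_queries(nodes, threads_per_node, queries_per_thread):
    """Total number of read queries issued across the whole client fleet."""
    return nodes * threads_per_node * queries_per_thread


def per_thread_rate(cluster_reads_per_sec, nodes, threads_per_node):
    """Average reads/second per thread, assuming an even spread (an assumption)."""
    return float(cluster_reads_per_sec) / (nodes * threads_per_node)


# pycassa: 8 cores per node, 4 threads pinned to each core via taskset.
pycassa_threads_per_node = 8 * 4
# YCSB: one client per node with 32 threads.
ycsb_threads_per_node = 32

pycassa_total = total_queries(NODES, pycassa_threads_per_node, QUERIES_PER_THREAD)
ycsb_total = total_queries(NODES, ycsb_threads_per_node, QUERIES_PER_THREAD)

# Reported cluster-wide throughput: about 70k (pycassa) and ~120k (YCSB) reads/s.
pycassa_rate = per_thread_rate(70000, NODES, pycassa_threads_per_node)
ycsb_rate = per_thread_rate(120000, NODES, ycsb_threads_per_node)

print("total queries: pycassa=%d ycsb=%d" % (pycassa_total, ycsb_total))
print("reads/s per thread: pycassa=%.1f ycsb=%.1f" % (pycassa_rate, ycsb_rate))
```

Both setups come out to the same total (39,370,752 queries over 512
threads), so the gap is in per-thread throughput, roughly 137 versus 234
reads/second, rather than in the amount of work requested.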



-- 
Tyler Hobbs
DataStax <http://datastax.com/>
