On Tue, 2013-02-05 at 21:38 +1300, aaron morton wrote:
> The first thing I noticed is your script uses python threading library, which 
> is hampered by the Global Interpreter Lock 
> http://docs.python.org/2/library/threading.html
> 
> You don't really have multiple threads running in parallel; try using the 
> multiprocessing library. 

Python _should_ release the GIL around IO-bound work, so this is a
situation where the GIL shouldn't be an issue. (It's actually a very good
use for Python's threads, as there's none of the serialization overhead for
message passing between processes that you'd pay in most multi-process
setups.)
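
For reference, this is the shape of the pattern - a minimal sketch, assuming
"cf" is a pycassa ColumnFamily backed by a connection pool sized for the
thread count, and "chunks" is the key list split per thread (both are my
stand-in names, not from your script):

  import threading

  def worker(keys):
    for key in keys:
      cf.get(key)  # blocks on the network; CPython releases the GIL while waiting

  threads = [threading.Thread(target=worker, args=(chunk,))
             for chunk in chunks]
  for t in threads:
    t.start()
  for t in threads:
    t.join()

Each thread spends nearly all of its time waiting on the socket, so the
threads genuinely overlap.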


A constant factor-of-2 slowdown really isn't that significant between two
different implementations, and I would not worry about it unless you're
talking about thousands of machines.

If you are talking about enough machines that this is real $$$, then I
do think the python code can be optimised a lot.

I'm talking about language/VM-specific optimisations, so I'm assuming
CPython (the standard /usr/bin/python, as in the shebang).

I don't know how much of a difference this will make, but I'd be
interested in hearing your results:


I would start by trying to rewrite this:

  def start_cassandra_client(Threadname):
    f = open(Threadname, "w")
    for key in lines:
      key = key.strip()
      st = time.time()
      f.write(str(cf.get(key)) + "\n")
      et = time.time()
      f.write("Time taken for a single query is " +
              str(round(1000 * (et - st), 2)) + " milli secs\n")
    f.close()

As something like this:

  def start_cassandra_client(Threadname):
    # Bind names from the enclosing scope to locals;
    # local lookups are much cheaper than global lookups in CPython
    time_fn = time.time
    colfam = cf
    f = open(Threadname, "w")
    for key in lines:
      key = key.strip()
      st = time_fn()
      f.write(str(colfam.get(key)) + "\n")
      et = time_fn()
      f.write("Time taken for a single query is " +
              str(round(1000 * (et - st), 2)) + " milli secs\n")
    f.close()
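
If you want to see what the pre-bound names buy on their own, here's a quick
timeit sketch (my guess at a fair measurement, untested on your setup):

  import timeit
  # Through the module attribute: "time.time" is re-resolved on every call
  print(timeit.timeit("time.time()", setup="import time", number=1000000))
  # Through a pre-bound name: the attribute lookup is paid only once
  print(timeit.timeit("t()", setup="import time; t = time.time", number=1000000))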


If you don't consider it cheating compared to the Java version, I would
also move the "key.strip()" call to module initialization instead of
repeating it for every key in every thread, as function dispatch overhead
in Python is significant.
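
Something like this at module load (a sketch; I'm assuming "lines" comes from
reading a file, as in your attached script):

  # Strip every key exactly once, instead of len(lines) times in every thread
  lines = [line.strip() for line in lines]

Then each thread can drop its per-key strip() call entirely.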


I'd also closely compare the IO going on in both versions (the .write
calls). For example, combining the two writes may be significantly faster:

      et = time_fn()
      f.write(str(colfam.get(key)) + "\nTime taken for a single query is " +
              str(round(1000 * (et - st), 2)) + " milli secs\n")


.. I haven't read your Java code, and I don't know Java IO semantics well
enough to compare the behaviour of both.
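
One more idea, if the output doesn't have to hit the file as each query
completes: buffer the lines in memory and write once after the loop, which
collapses tens of thousands of .write calls into one. A sketch (untested):

      out = []
      for key in lines:
        st = time_fn()
        result = colfam.get(key)
        et = time_fn()
        out.append(str(result) + "\nTime taken for a single query is " +
                   str(round(1000 * (et - st), 2)) + " milli secs\n")
      f.write("".join(out))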

Tim




> 
> Cheers
> 
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
> 
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 5/02/2013, at 7:15 AM, Pradeep Kumar Mantha <pradeep...@gmail.com> wrote:
> 
> > Hi,
> > 
> > Could someone please give me some hints on why the pycassa 
> > client (attached) is much slower than YCSB?
> > Is it attributable to the performance difference between Python and 
> > Java, or does the pycassa API have some performance limitations?
> > 
> > I don't see any statements in the client that would hurt pycassa's 
> > performance. Please have a look at the simple Python script attached and 
> > let me know your suggestions.
> > 
> > thanks
> > pradeep
> > 
> > On Thu, Jan 31, 2013 at 4:53 PM, Pradeep Kumar Mantha 
> > <pradeep...@gmail.com> wrote:
> > 
> > 
> > On Thu, Jan 31, 2013 at 4:49 PM, Pradeep Kumar Mantha 
> > <pradeep...@gmail.com> wrote:
> > Thanks. Please find the script attached.
> > 
> > Just reiterating:
> > It's just a simple Python script which submits 4 threads. 
> > The script is scheduled on 8 cores using the taskset Unix command, thus 
> > running 32 threads/node, 
> > and then scaled to 16 nodes.
> > 
> > thanks
> > pradeep
> > 
> > 
> > On Thu, Jan 31, 2013 at 4:38 PM, Tyler Hobbs <ty...@datastax.com> wrote:
> > Can you provide the python script that you're using?
> > 
> > (I'm moving this thread to the pycassa mailing list 
> > (pycassa-disc...@googlegroups.com), which is a better place for this 
> > discussion.)
> > 
> > 
> > On Thu, Jan 31, 2013 at 6:25 PM, Pradeep Kumar Mantha 
> > <pradeep...@gmail.com> wrote:
> > Hi,
> > 
> > I am trying to benchmark Cassandra on a 12-data-node cluster using 16 
> > clients (each client uses 32 threads), with a custom pycassa client and with YCSB.
> > 
> > I found the maximum throughput achieved using the pycassa 
> > client is roughly 70k reads/second,
> > whereas with YCSB it is ~120k reads/second.
> > 
> > Any thoughts, why I see this huge difference in performance?
> > 
> > 
> > Here is the description of setup.
> > 
> > Pycassa client (a simple python script).
> > 1. Each pycassa client starts 4 threads, where each thread runs 76896 
> > queries.
> > 2. A shell script uses the taskset Unix command to pin 4 threads to each 
> > core on an 8-core node (8 * 4 * 76896 queries per node).
> > 3. Another shell script scales the single-node setup to 16 
> > nodes (total queries: 16 * 8 * 4 * 76896).
> > 
> > I tried to keep the YCSB configuration as similar as possible to my custom 
> > pycassa benchmarking setup.
> > 
> > YCSB -
> > 
> > Launched 16 YCSB clients on 16 nodes, where each client uses 32 threads 
> > and queries 32 * 76896 keys, i.e. 100% reads.
> > 
> > The dataset is different in each case, but has
> > 
> > 1. the same total number of records,
> > 2. the same number of fields,
> > 3. almost the same field length.
> > 
> > Could you please let me know why I see this huge performance difference, 
> > and whether there is any way I can improve the operations/second using the 
> > pycassa client?
> > 
> > thanks
> > pradeep
> >  
> > 
> > 
> > 
> > -- 
> > Tyler Hobbs
> > DataStax
> > 
> > 
> > 
> > <pycassa_client.py>
> 

