Hi,

Not sure this is the cause of your bad performance, but you are measuring data creation and insertion together. Your data creation involves lots of casts, which are probably quite slow. Try timing only the b.send part and see how long that takes.
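To illustrate the point, here is a sketch (hypothetical variable names, same row/column shapes as your script) that times the data creation on its own. Building all the rows first means a later timer around the batch would cover only the writes:

```python
import time
import random

# Time only the data creation, not the insertion.
tps1 = time.time()
rows = {}
for i in range(100000):
    columns = {}
    for j in range(10):
        # Same str() conversions as in the original script.
        columns[str(j)] = str(random.randint(0, 100))
    rows[str(i)] = columns
tps2 = time.time()
print("data creation alone: %.2f seconds" % (tps2 - tps1))

# With the rows prepared up front, a second timer around the batch
# would measure only the Cassandra writes (needs a live cluster):
# tps3 = time.time()
# for key, columns in rows.items():
#     b.insert(key, columns)
# b.send()
# tps4 = time.time()
# print("insert/send alone: %.2f seconds" % (tps4 - tps3))
```

If the first number is a large fraction of your 27 seconds, the bottleneck is in Python, not in Cassandra.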
Roland

On 03.05.2011 at 12:30, "charles THIBAULT" <charl.thiba...@gmail.com> wrote:
> Hello everybody,
>
> first: sorry for my English in advance!!
>
> I'm getting started with Cassandra on a 5-node cluster, inserting data
> with the pycassa API.
>
> I've read everywhere on the internet that Cassandra's performance is better
> than MySQL's because writes are append-only into commit log files.
>
> When I try to insert 100,000 rows with 10 columns per row with batch
> insert, I get this result: 27 seconds.
> But with MySQL (LOAD DATA INFILE) this takes only 2 seconds (using indexes).
>
> Here is my configuration:
>
> cassandra version: 0.7.5
> nodes: 192.168.1.210, 192.168.1.211, 192.168.1.212, 192.168.1.213,
> 192.168.1.214
> seed: 192.168.1.210
>
> My script:
> *************************************************************************************************************
> #!/usr/bin/env python
>
> import pycassa
> import time
> import random
> from cassandra import ttypes
>
> pool = pycassa.connect('test', ['192.168.1.210:9160'])
> cf = pycassa.ColumnFamily(pool, 'test')
> b = cf.batch(queue_size=50,
>              write_consistency_level=ttypes.ConsistencyLevel.ANY)
>
> tps1 = time.time()
> for i in range(100000):
>     columns = dict()
>     for j in range(10):
>         columns[str(j)] = str(random.randint(0, 100))
>     b.insert(str(i), columns)
> b.send()
> tps2 = time.time()
>
> print("execution time: " + str(tps2 - tps1) + " seconds")
> *************************************************************************************************************
>
> What am I doing wrong?