Thanks Riyad. Right now I am just testing Cassandra on a single node, with the server and client running on the same machine. I ran the read test again on two machines: on one the CPU usage is around 30% most of the time, and on the other it is around 90%.
Pelops is one way to access Cassandra; there are also other Java clients such as Hector and Jassandra. Will these Java clients differ significantly in performance?

I also tried changing the storage configuration file: moving CommitLogDirectory and DataFileDirectory to different disks, setting DiskAccessMode to mmap on a 64-bit machine, and changing ConcurrentReads from 8 to 2. None of these changed performance much.

For users of other access clients (PHP, C++, Python, etc.): if you have any experience boosting read performance, you are more than welcome to share it with me.

Thanks,

On Fri, Jun 11, 2010 at 8:19 AM, Riyad Kalla <rka...@gmail.com> wrote:

> Caribbean410,
>
> This comes up on the Redis list a lot as well -- what you are actually
> measuring is the client opening a network connection to the Cassandra
> server and it replying -- so the performance numbers you are getting can
> easily be 70% network wait time and not necessarily hardcore read/write
> server performance.
>
> One way to see if this is the case: run your read test, then watch the CPU
> on the server for the Cassandra process and see if it's pegging the CPU --
> if it's just sitting there bouncing between 0-10%, then you are spending
> most of your time waiting on network I/O (opening/closing sockets, etc.)
>
> If you can parallelize your test to spawn, say, 5 threads that all do the
> same thing, see if the performance for each thread increases linearly --
> which would indicate Cassandra is plenty fast in your setup, you just need
> to utilize more client threads over the network.
>
> That new Java library, Pelops by Dominic
> (http://ria101.wordpress.com/2010/06/11/pelops-the-beautiful-cassandra-database-client-for-java/)
> has a nice intrinsic node-balancing design that could be handy IF you are
> using multiple nodes. If you are just testing against 1 node, then spawn
> multiple threads of your code above and see how each thread's performance
> scales.
>
> -R
>
> On Thu, Jun 10, 2010 at 2:39 PM, Caribbean410 <caribbean...@gmail.com> wrote:
>
>> Hello,
>>
>> I am testing the performance of Cassandra. We write 200k records to the
>> database, and each record is 1k in size. Then we read these 200k records.
>> It takes more than 400s to finish the read, which is much slower than
>> MySQL (around 20s). I read some discussion online, and someone suggested
>> making multiple connections to make it faster. But I am not sure how
>> to do it: do I need to change my storage settings file or just change
>> the Java client code?
>>
>> Here is my read code,
>>
>> Properties info = new Properties();
>> info.put(DriverManager.CONSISTENCY_LEVEL,
>>         ConsistencyLevel.ONE.toString());
>>
>> IConnection connection = DriverManager.getConnection(
>>         "thrift://localhost:9160", info);
>>
>> // 2. Get a KeySpace by name
>> IKeySpace keySpace = connection.getKeySpace("Keyspace1");
>>
>> // 3. Get a ColumnFamily by name
>> IColumnFamily cf = keySpace.getColumnFamily("Standard2");
>>
>> ByteArray nameFirst = ByteArray.ofASCII("first");
>> ICriteria criteria = cf.createCriteria();
>> long readBytes = 0;
>> long start = System.currentTimeMillis();
>> for (int i = 0; i < numOfRecords; i++) {
>>     int n = random.nextInt(numOfRecords);
>>     userName = keySet[n];
>>     criteria.keyList(Lists.newArrayList(userName))
>>             .columnRange(nameFirst, nameFirst, 10);
>>     Map<String, List<IColumn>> map = criteria.select();
>>     List<IColumn> list = map.get(userName);
>>     ByteArray bloc = list.get(0).getValue();
>>     byte[] byteArrayloc = bloc.toByteArray();
>>     loc = new String(byteArrayloc);
>>     // System.out.println(userName + " " + loc);
>>     readBytes = readBytes + loc.length();
>> }
>>
>> long finish = System.currentTimeMillis();
>>
>> I once commented out these lines
>>
>>     ByteArray bloc = list.get(0).getValue();
>>     byte[] byteArrayloc = bloc.toByteArray();
>>     loc = new String(byteArrayloc);
>>     // System.out.println(userName + " " + loc);
>>     readBytes = readBytes + loc.length();
>>
>> And the performance didn't improve much.
>>
>> Any suggestion is welcome. Thanks,
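
For anyone who wants to try the multi-threaded read test suggested in the quoted reply, here is a minimal sketch in plain Java. It is not taken from the thread: the class name ParallelReadBenchmark, the run method, and the readerFactory parameter are all hypothetical, and the reader used in main is a stub that returns a 1k array instead of talking to Cassandra. The idea is that each worker thread builds its own reader (in a real test, its own connection and criteria object, on the assumption that a Thrift-based connection should not be shared between threads) and then performs its share of the random reads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical helper, not part of any Cassandra client library.
final class ParallelReadBenchmark {

    /**
     * Splits totalReads random-key reads across threadCount threads and
     * returns the total number of bytes read. readerFactory is called once
     * per thread, so every worker owns its reader (e.g. its own connection).
     */
    static long run(int threadCount, int totalReads, String[] keys,
                    Supplier<Function<String, byte[]>> readerFactory) {
        ExecutorService pool = Executors.newFixedThreadPool(threadCount);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (int t = 0; t < threadCount; t++) {
                // Give the first (totalReads % threadCount) workers one extra read.
                int reads = totalReads / threadCount
                        + (t < totalReads % threadCount ? 1 : 0);
                Callable<Long> task = () -> {
                    Function<String, byte[]> reader = readerFactory.get();
                    long bytes = 0;
                    for (int i = 0; i < reads; i++) {
                        String key = keys[ThreadLocalRandom.current().nextInt(keys.length)];
                        bytes += reader.apply(key).length;
                    }
                    return bytes;
                };
                results.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> result : results) {
                try {
                    total += result.get(); // blocks until that worker is done
                } catch (InterruptedException | ExecutionException e) {
                    throw new IllegalStateException("read worker failed", e);
                }
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        String[] keys = new String[200];
        for (int i = 0; i < keys.length; i++) {
            keys[i] = "user" + i;
        }
        // Stub reader: replace with a real per-thread Cassandra read.
        Supplier<Function<String, byte[]>> stub = () -> key -> new byte[1024];

        long start = System.currentTimeMillis();
        long bytes = run(5, 200_000, keys, stub);
        long elapsed = System.currentTimeMillis() - start;
        System.out.println("read " + bytes + " bytes in " + elapsed + " ms");
    }
}
```

To use it against a real server, the stub would be replaced by a factory that opens a connection and returns a function wrapping the select call from the read loop quoted above. Running it with 1, 2, 5 and 10 threads and comparing the elapsed times should show whether throughput scales with the number of client threads, which is the check described in the quoted reply.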