Thanks Riyad.

Right now I am just testing Cassandra on a single node, with the server and
client running on the same machine. I ran the read test again on two
machines: on one, CPU usage is around 30% most of the time, and on the
other it is 90%.
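
A minimal sketch of the multi-threaded read test suggested in the quoted
message below. It uses only java.util.concurrent. The class name
ParallelReadBench, the method runReads, and the readOne callback are
hypothetical names made up for illustration; readOne stands in for one
Cassandra read and returns the number of bytes read. In a real test each
worker would presumably need its own client connection, which is an
assumption and not something confirmed in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.IntUnaryOperator;

public class ParallelReadBench {

    /**
     * Splits totalReads across the given number of worker threads and
     * returns the total number of bytes reported by readOne.
     * readOne is a hypothetical stand-in for a single Cassandra read:
     * it takes a record index and returns the bytes read.
     */
    public static long runReads(int threads, int totalReads,
                                IntUnaryOperator readOne) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                final int worker = t;
                Callable<Long> task = () -> {
                    // Assumption: a real worker would open its own
                    // connection here instead of sharing one.
                    long bytes = 0;
                    // Worker t handles indices t, t + threads, ...
                    for (int i = worker; i < totalReads; i += threads) {
                        bytes += readOne.applyAsInt(i);
                    }
                    return bytes;
                };
                results.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> f : results) {
                total += f.get(); // blocks until that worker finishes
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int totalReads = 200_000;
        for (int threads : new int[] {1, 2, 5, 10}) {
            long start = System.currentTimeMillis();
            // Placeholder read: pretends every record is 1024 bytes.
            long bytes = runReads(threads, totalReads, i -> 1024);
            long elapsed = System.currentTimeMillis() - start;
            System.out.println(threads + " threads: " + bytes
                    + " bytes in " + elapsed + " ms");
        }
    }
}
```

Replacing the placeholder lambda with a real read and comparing the elapsed
time for 1, 2, 5 and 10 threads should show whether throughput scales with
the number of client threads.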

Pelops is one way to access Cassandra; there are also other Java clients
such as Hector and Jassandra. Will these Java clients differ significantly
in performance?

I also tried changing the storage configuration file: pointing
CommitLogDirectory and DataFileDirectory at different disks, setting
DiskAccessMode to mmap on a 64-bit machine, and reducing ConcurrentReads
from 8 to 2. None of these changed performance much.

For users of other clients (PHP, C++, Python, etc.): if you have any
experience boosting read performance, you are more than welcome to share it
with me. Thanks,

On Fri, Jun 11, 2010 at 8:19 AM, Riyad Kalla <rka...@gmail.com> wrote:

> Caribbean410,
>
> This comes up on the Redis list a lot as well -- what you are actually
> measuring is the client opening a network connection to the Cassandra
> server and the server replying -- so the performance numbers you are
> getting can easily be 70% network wait time and not necessarily hardcore
> read/write server performance.
>
> One way to see if this is the case: run your read test, then watch the CPU
> on the server for the Cassandra process and see if it's pegging the CPU --
> if it's just sitting there bouncing between 0-10%, then you are spending
> most of your time waiting on network I/O (opening/closing sockets, etc.)
>
> If you can parallelize your test to spawn say 5 threads that all do the
> same thing, see if the performance for each thread increases linearly --
> which would indicate Cassandra is plenty fast in your setup, you just need
> to utilize more client threads over the network.
>
> That new Java library, Pelops by Dominic (
> http://ria101.wordpress.com/2010/06/11/pelops-the-beautiful-cassandra-database-client-for-java/)
> has a nice intrinsic node-balancing design that could be handy IF you are
> using multiple nodes. If you are just testing against 1 node, then spawn
> multiple threads of your code above and see how each thread's performance
> scales.
>
> -R
>
> On Thu, Jun 10, 2010 at 2:39 PM, Caribbean410 <caribbean...@gmail.com> wrote:
>
>> Hello,
>>
>> I am testing the performance of Cassandra. We write 200k records to the
>> database, each 1 KB in size, and then read those 200k records back.
>> The read takes more than 400s to finish, which is much slower than
>> MySQL (around 20s). I read some discussion online where someone suggested
>> making multiple connections to speed it up. But I am not sure how
>> to do that: do I need to change my storage settings file, or just change
>> the Java client code?
>>
>> Here is my read code:
>>
>>     Properties info = new Properties();
>>     info.put(DriverManager.CONSISTENCY_LEVEL,
>>              ConsistencyLevel.ONE.toString());
>>
>>     IConnection connection = DriverManager.getConnection(
>>             "thrift://localhost:9160", info);
>>
>>     // 2. Get a KeySpace by name
>>     IKeySpace keySpace = connection.getKeySpace("Keyspace1");
>>
>>     // 3. Get a ColumnFamily by name
>>     IColumnFamily cf = keySpace.getColumnFamily("Standard2");
>>
>>     ByteArray nameFirst = ByteArray.ofASCII("first");
>>     ICriteria criteria = cf.createCriteria();
>>     long readBytes = 0;
>>     long start = System.currentTimeMillis();
>>     for (int i = 0; i < numOfRecords; i++) {
>>         int n = random.nextInt(numOfRecords);
>>         userName = keySet[n];
>>         criteria.keyList(Lists.newArrayList(userName))
>>                 .columnRange(nameFirst, nameFirst, 10);
>>         Map<String, List<IColumn>> map = criteria.select();
>>         List<IColumn> list = map.get(userName);
>>         ByteArray bloc = list.get(0).getValue();
>>         byte[] byteArrayloc = bloc.toByteArray();
>>         loc = new String(byteArrayloc);
>>         // System.out.println(userName+" "+loc);
>>         readBytes = readBytes + loc.length();
>>     }
>>
>>     long finish = System.currentTimeMillis();
>>
>> I also tried commenting out these lines:
>>
>>         ByteArray bloc = list.get(0).getValue();
>>         byte[] byteArrayloc = bloc.toByteArray();
>>         loc = new String(byteArrayloc);
>>         // System.out.println(userName+" "+loc);
>>         readBytes = readBytes + loc.length();
>>
>> but the performance did not improve much.
>>
>> Any suggestions are welcome. Thanks,
>>
>
>
