Hey all,

As Jonathan pointed out in CASSANDRA-1199, this issue seems to be related to https://issues.apache.org/jira/browse/THRIFT-788. If you experience slowness with multiget_slice, take a look at that bug.
-Arya

----- Original Message -----
From: "Arya Goudarzi" <agouda...@gaiaonline.com>
To: user@cassandra.apache.org, "jbellis" <jbel...@gmail.com>
Sent: Wednesday, June 9, 2010 4:51:18 PM
Subject: Re: Strage Read Perfoamnce 1xN column slice or N column slice

Hi Jonathan,

This issue persists. I have prepared a code sample that you can use to reproduce what I am describing. Please see the attachment. It uses the Thrift PHP libraries directly. I am running a Cassandra 0.7 build from May 28th. I have tried this on a single host with replication factor 1 and on a 3-node cluster with replication factor 3. The results remain similar:

100 Sequential Writes took: 0.60781407356262 seconds;
100 Sequential Reads took: 0.23204588890076 seconds;
100 Batch Read took: 0.76512885093689 seconds;

Please advise.

Thank You,
-Arya

----- Original Message -----
From: "Jonathan Ellis" <jbel...@gmail.com>
To: user@cassandra.apache.org
Sent: Monday, June 7, 2010 7:26:30 PM
Subject: Re: Strage Read Perfoamnce 1xN column slice or N column slice

That would be surprising (and it is not what you said in the first message). I suspect something is wrong with your test methodology.

On Mon, Jun 7, 2010 at 11:23 AM, Arya Goudarzi <agouda...@gaiaonline.com> wrote:
> But I am not comparing reading 1 column vs 100 columns. I am comparing
> reading 100 columns in loop iterations (100 consecutive calls) vs
> reading all 100 in batch in one call. Doing the loop is faster than
> doing the batch call. Are you saying this is not surprising?
>
> ----- Original Message -----
> From: "Jonathan Ellis" <jbel...@gmail.com>
> To: user@cassandra.apache.org
> Sent: Saturday, June 5, 2010 6:26:46 AM
> Subject: Re: Strage Read Perfoamnce 1xN column slice or N column slice
>
> Reading 1 column is faster than reading lots of columns. This
> shouldn't be surprising.
>
> On Fri, Jun 4, 2010 at 3:52 PM, Arya Goudarzi <agouda...@gaiaonline.com> wrote:
>> Hi Fellows,
>>
>> I have the following design for a system that basically holds
>> key->value pairs (aka Columns) for each user (SuperColumn key) in
>> different namespaces (SuperColumnFamily row key).
>>
>> Like this:
>>
>> Namespace->user->column_name = column_value;
>>
>> keyspaces:
>>     - name: NKVP
>>       replica_placement_strategy: org.apache.cassandra.locator.RackUnawareStrategy
>>       replication_factor: 3
>>       column_families:
>>         - name: Namespaces
>>           column_type: Super
>>           compare_with: BytesType
>>           compare_subcolumns_with: BytesType
>>           rows_cached: 20000
>>           keys_cached: 100
>>
>> The cluster uses the random partitioner.
>>
>> I use multiget_slice() to fetch one or many columns inside the
>> child supercolumn at the same time. This is the odd performance
>> result I get:
>>
>> 100 sequential reads completed in : 0.383 this uses multiget_slice()
>> with 1 key, and 1 column name inside the predicate->column_names
>> 100 batch loaded completed in : 0.786 this uses multiget_slice() with
>> 1 key, and multiple column names inside the predicate->column_names
>>
>> Read/write consistency levels are ONE.
>>
>> Questions:
>>
>> Why is doing 100 sequential reads faster than doing 100 in one batch?
>> Is this a good design for my problem?
>> Does my issue relate to
>> https://issues.apache.org/jira/browse/CASSANDRA-598?
>>
>> Now on a single node with replication factor 1 I get this:
>>
>> 100 sequential reads completed in : 0.438
>> 100 batch loaded completed in : 0.800
>>
>> Please advise as to why this is happening.
>>
>> These nodes are VMs with 1 CPU and 1 GB.
>>
>> Best Regards,
>> =Arya
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
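
For anyone who wants to reproduce the comparison discussed in this thread, here is a minimal timing sketch in Python. It is not the attached PHP sample, which is not reproduced here. StubClient and fetch_columns are hypothetical names standing in for a real Thrift client and its multiget_slice() call, with a deliberately simplified signature; the stub only counts round trips, so the timings mean something only once a real client is swapped in.

```python
import time
from typing import Dict, List, Tuple


class StubClient:
    """Hypothetical stand-in for a Thrift Cassandra client (not a real API)."""

    def __init__(self, columns: Dict[str, bytes]) -> None:
        self.columns = columns
        self.calls = 0  # number of simulated round trips

    def fetch_columns(self, key: str, column_names: List[str]) -> Dict[str, bytes]:
        # Assumption: one call here corresponds to one multiget_slice() request
        # for a single key with the given predicate->column_names.
        self.calls += 1
        return {n: self.columns[n] for n in column_names if n in self.columns}


def time_sequential(client: StubClient, key: str,
                    names: List[str]) -> Tuple[float, Dict[str, bytes]]:
    """Fetch each column with its own call (N round trips)."""
    start = time.perf_counter()
    result: Dict[str, bytes] = {}
    for name in names:
        result.update(client.fetch_columns(key, [name]))
    return time.perf_counter() - start, result


def time_batch(client: StubClient, key: str,
               names: List[str]) -> Tuple[float, Dict[str, bytes]]:
    """Fetch all columns in a single call (1 round trip)."""
    start = time.perf_counter()
    result = client.fetch_columns(key, list(names))
    return time.perf_counter() - start, result


if __name__ == "__main__":
    names = [f"col{i}" for i in range(100)]
    client = StubClient({n: n.encode() for n in names})
    seq_time, seq_result = time_sequential(client, "user1", names)
    batch_time, batch_result = time_batch(client, "user1", names)
    # Both strategies should return identical data; only the timing differs.
    assert seq_result == batch_result
    print(f"100 sequential reads: {seq_time:.6f} s")
    print(f"1 batch read of 100:  {batch_time:.6f} s")
```

Checking that both strategies return the same data before comparing the timings helps rule out a methodology problem of the kind Jonathan suspected.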