Re: Slow reads on C* 2.0.15 using Spark Cassandra

2015-06-29 Thread Nathan Bijnens
One more update: it looks like the driver is generating this CQL statement: SELECT "test_id", "channel", "ts", "event", "groups" FROM "KEYSPACE"."test" WHERE token("test_id") > ? AND token("test_id") <= ? ALLOW FILTERING; Best regards, Nathan On Fri, Jun 26, 2015 at 8:16 PM Nathan Bijnens
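
The statement above reads one slice of the token ring per query, bounded by `token(pk) > ? AND token(pk) <= ?`. As a rough illustration of where such bounds could come from, here is a minimal sketch that cuts the full token range into `n` contiguous slices. It assumes the Murmur3Partitioner token range (signed 64-bit values); the class and method names are hypothetical, and this is not how the Spark Cassandra connector actually computes its splits.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
class TokenRangeSketch {

    /** One slice (start, end], matching token(pk) > start AND token(pk) <= end. */
    static final class Range {
        final long start;
        final long end;

        Range(long start, long end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Splits the assumed Murmur3 token range into n contiguous slices. */
    static List<Range> split(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        // BigInteger avoids overflow: the full width does not fit in a long.
        BigInteger min = BigInteger.valueOf(Long.MIN_VALUE);
        BigInteger width = BigInteger.valueOf(Long.MAX_VALUE).subtract(min);

        List<Range> ranges = new ArrayList<>();
        long start = Long.MIN_VALUE;
        for (int i = 1; i <= n; i++) {
            // The last slice always ends at the maximum token.
            long end = (i == n)
                    ? Long.MAX_VALUE
                    : min.add(width.multiply(BigInteger.valueOf(i))
                                   .divide(BigInteger.valueOf(n)))
                         .longValueExact();
            ranges.add(new Range(start, end));
            start = end; // next slice begins where this one ended
        }
        return ranges;
    }
}
```

Each `Range` would supply the two bind values for one execution of the statement, so the slices can be read in parallel. Real splits are normally aligned with the ranges each node owns, which this sketch ignores.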

Re: Slow reads on C* 2.0.15 using Spark Cassandra

2015-06-26 Thread Nathan Bijnens
Thanks for the suggestion, will take a look. Our code looks like this:

    val rdd = sc.cassandraTable[EventV0](keyspace, "test")
    val transformed = rdd.map { e => EventV1(e.testId, e.ts, e.channel, e.groups, e.event) }
    transformed.saveToCassandra(keyspace, "test_v1")

Not sure if this code might transl

Re: Slow reads on C* 2.0.15 using Spark Cassandra

2015-06-26 Thread Nate McCall
> We notice incredibly slow reads, 600mb in an hour, we are using quorum LOCAL_ONE reads.
> The load_one of Cassandra increases from <1 to 60! There is no CPU wait, only user & nice.

Without seeing the code and query, it's hard to tell, but I noticed something similar when we had a client incorrec

Re: Slow Reads in Cassandra with Hadoop

2012-12-11 Thread aaron morton
First I would try to simplify your architecture. Get everything onto the same OS. Then change the topology so you have 1 job tracker and 4 nodes that run both Cassandra and Hadoop tasks, so that reading and mapping the data happen on the same nodes. Reads from cassandra happen as range

Re: Slow Reads

2011-07-27 Thread Jake Luciani
The philosophy in NoSQL is to store the data as you plan to access it. That means possibly duplicating the data many times. Disk is cheap, writes are fast. On Wed, Jul 27, 2011 at 2:22 PM, Priyanka wrote: > Thank you Indra for your suggestion. > But the thing is apart from pulling data based o
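
To make "store the data as you plan to access it" concrete, here is a minimal in-memory sketch of writing every value twice, once keyed by row key and once keyed by super column, so that both lookups discussed in this thread are direct reads. It uses plain Java maps rather than Cassandra or Hector calls; the class and method names are hypothetical, and the sample keys in the usage note are taken from the data posted on 2011-07-26.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for two column families holding the same data.
class DualIndexSketch {

    // rowKey -> superColumn -> (column -> value)
    private final Map<String, Map<String, Map<String, String>>> byRowKey = new HashMap<>();

    // superColumn -> rowKey -> (column -> value): the duplicated, inverted copy
    private final Map<String, Map<String, Map<String, String>>> bySuperColumn = new HashMap<>();

    /** Writes the value into both layouts; the extra write is the cost of fast reads. */
    void put(String rowKey, String superColumn, String column, String value) {
        byRowKey.computeIfAbsent(rowKey, k -> new HashMap<>())
                .computeIfAbsent(superColumn, k -> new HashMap<>())
                .put(column, value);
        bySuperColumn.computeIfAbsent(superColumn, k -> new HashMap<>())
                .computeIfAbsent(rowKey, k -> new HashMap<>())
                .put(column, value);
    }

    /** All super columns of one row, read from the row-keyed copy. */
    Map<String, Map<String, String>> readRow(String rowKey) {
        return byRowKey.getOrDefault(rowKey, Collections.emptyMap());
    }

    /** One super column across all rows, read from the inverted copy. */
    Map<String, Map<String, String>> readSuperColumn(String superColumn) {
        return bySuperColumn.getOrDefault(superColumn, Collections.emptyMap());
    }
}
```

After `put("rowkey1", "supercol1", "col1", "T")` and `put("rowkey2", "supercol1", "col1", "A")`, `readSuperColumn("supercol1")` returns entries for both rows without scanning any row, while `readRow("rowkey1")` still works as before. In Cassandra the two maps would correspond to two column families kept in sync by the application.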

Re: Slow Reads

2011-07-27 Thread Priyanka
Thank you Indra for your suggestion. But the thing is, apart from pulling data based on supercol in the below example, I also need to query to pull the data based on a particular rowkey. If I change the model as you mentioned, this query becomes slow. I need to do both the retrievals efficiently. -- Vie

Re: Slow Reads

2011-07-27 Thread Indranath Ghosh
You might want to avoid super columns and denormalize your schema. Since you are querying by the super columns, you can make them the row keys, make the current row keys your column names, and use composite column names to get to the columns faster. Something like this (used your representation

Re: Slow Reads

2011-07-27 Thread Priyanka Ganuthula
Yes am using hector for java On Wed, Jul 27, 2011 at 3:35 AM, CASSANDRA learner < cassandralear...@gmail.com> wrote: > R u using hector client for java > > > On Tue, Jul 26, 2011 at 11:17 PM, Priyanka wrote: > >> this is how my data looks >> “rowkey1”:{ >>“supercol1”:{ “col1”:T,”col2

Re: Slow Reads

2011-07-27 Thread CASSANDRA learner
R u using hector client for java On Tue, Jul 26, 2011 at 11:17 PM, Priyanka wrote: > this is how my data looks > “rowkey1”:{ >“supercol1”:{ “col1”:T,”col2”:C} >“supercol2”:{“col1”:C,”col2”:T } >“supercol3”:{ “col1”:C,”col2”:T} >} > "rowkey2”:{

Re: Slow Reads

2011-07-26 Thread Priyanka
this is how my data looks
“rowkey1”:{
   “supercol1”:{ “col1”:T,”col2”:C}
   “supercol2”:{“col1”:C,”col2”:T }
   “supercol3”:{ “col1”:C,”col2”:T}
}
"rowkey2”:{
   “supercol1”:{ “col1”:A,”col2”:A}
   “supercol2”:{“col1”:A,”col2”:T }

Re: Slow Reads

2011-07-26 Thread Priyanka Ganuthula
Supercolumn has two columns and each column has only one byte. It is a bit faster but not significant. On Tue, Jul 26, 2011 at 12:49 PM, Jake Luciani wrote: > It doesn't read the entire row, but it does read a section of the row from > disk... > > How big is each supercolumn? If you re-read the

Re: Slow Reads

2011-07-26 Thread Priyanka Ganuthula
Thanks Philippe, I have a question here... I am specifying the required super column. Does it still need to read the entire row? Or is it because I am listing all the slices and then going to each slice and picking data for the required super column? SlicePredicate slicePredicate = new SlicePredicate(

Re: Slow Reads

2011-07-26 Thread Sylvain Lebresne
On Tue, Jul 26, 2011 at 5:39 PM, Priyanka wrote: > > Hello All, > >          I am doing some read tests on Cassandra on a single node.But they > are turning up to be very slow. > Here is the data model in detail. > I am using a super column family.Cassandra has total 970 rows and each row > has 62

Re: Slow Reads

2011-07-26 Thread Jake Luciani
It doesn't read the entire row, but it does read a section of the row from disk... How big is each supercolumn? If you re-read the data does the query time get faster? On Tue, Jul 26, 2011 at 11:59 AM, Philippe wrote: > i believe it's because it needs to read the whole row to get to your sup

Re: Slow Reads

2011-07-26 Thread Philippe
I believe it's because it needs to read the whole row to get to your super column. You might have to reconsider your model. On 26 Jul 2011 at 17:39, "Priyanka" wrote: > > Hello All, > > I am doing some read tests on Cassandra on a single node.But they > are turning up to be very slow. > Here is