RE: Spark job for Reading time series data from Cassandra

2016-03-10 Thread Prateek .

Re: Spark job for Reading time series data from Cassandra

2016-03-10 Thread Matthias Niehoff
Hi, the spark connector docs say (https://github.com/datastax/spark-cassandra-connector/blob/master/doc/FAQ.md): "The number of Spark partitions(tasks) created is directly controlled by the setting spark.cassandra.input.split.size_in_mb. This number reflects the approximate amount of Cassandra data in each Spark partition."
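
For reference, a minimal Scala sketch of what Matthias describes, assuming the DataStax Spark Cassandra Connector of that era; the keyspace/table names (ts_keyspace, sensor_data), the split size value, and the contact host are illustrative only:

import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._

// Lowering spark.cassandra.input.split.size_in_mb from its default
// (64 MB in connector versions of that period) makes the connector
// create more, smaller Spark partitions, and hence more tasks.
val conf = new SparkConf()
  .setAppName("cassandra-split-size-example")
  .set("spark.cassandra.connection.host", "127.0.0.1")
  .set("spark.cassandra.input.split.size_in_mb", "16")

val sc = new SparkContext(conf)

// cassandraTable comes from the connector's implicits (com.datastax.spark.connector._)
val rows = sc.cassandraTable("ts_keyspace", "sensor_data")
println(s"Spark partitions created: ${rows.partitions.length}")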

Re: Spark job for Reading time series data from Cassandra

2016-03-10 Thread Bryan Jeffrey
Prateek, I believe that one task is created per Cassandra partition. How is your data partitioned?

Regards,
Bryan Jeffrey

On Thu, Mar 10, 2016 at 10:36 AM, Prateek . wrote:
> Hi,
>
> I have a Spark Batch job for reading timeseries data from Cassandra which has 50,000 rows.
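
As a rough illustration of the kind of job being discussed, here is a hedged Scala sketch that reads a small time-series table and inspects how many Spark partitions (tasks) the scan produces. The schema (sensor_id partition key, event_time clustering column), the keyspace/table names, and the coalesce factor are assumptions, not details taken from the thread:

import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._

val conf = new SparkConf()
  .setAppName("timeseries-batch-read")
  .set("spark.cassandra.connection.host", "127.0.0.1")
val sc = new SparkContext(conf)

// Read the table and push a partition-key predicate down to Cassandra,
// so only the matching Cassandra partition(s) are scanned.
val readings = sc.cassandraTable("ts_keyspace", "sensor_readings")
  .where("sensor_id = ?", "sensor-42")

println(s"Rows: ${readings.count()}, Spark partitions: ${readings.partitions.length}")

// With only ~50,000 rows, many of the generated Spark partitions may be
// tiny or empty; coalescing reduces scheduling overhead before further work.
val compacted = readings.coalesce(4)

Whether one task maps to exactly one Cassandra partition depends on how the token ranges are split (per the FAQ quoted above), so comparing readings.partitions.length against the table's partitioning is a reasonable first diagnostic.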