Cc: Prateek . ; user@spark.apache.org
Subject: Re: Spark job for Reading time series data from Cassandra
Hi,

the spark connector docs say
(https://github.com/datastax/spark-cassandra-connector/blob/master/doc/FAQ.md):

"The number of Spark partitions(tasks) created is directly controlled by
the setting spark.cassandra.input.split.size_in_mb. This number reflects
the approximate amount of Cassandra Data in each Spark partition."
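
So if you want more (smaller) tasks out of the scan, you can lower that
split size when building the context. A minimal sketch - host, keyspace
and table names here are placeholders, not anything from your job:

import com.datastax.spark.connector._
import org.apache.spark.{SparkConf, SparkContext}

object SplitSizeSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("cassandra-split-size")
      .set("spark.cassandra.connection.host", "127.0.0.1")
      // Smaller split size => more, smaller Spark partitions per scan.
      .set("spark.cassandra.input.split.size_in_mb", "32") // 64 MB default in 1.x
    val sc = new SparkContext(conf)

    // "ks" and "timeseries" are placeholder keyspace/table names.
    val rows = sc.cassandraTable("ks", "timeseries")
    println(s"Spark partitions for the scan: ${rows.partitions.length}")
    sc.stop()
  }
}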
Prateek,
I believe that one task is created per Cassandra partition. How is your
data partitioned?
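If it is keyed per sensor, something like the sketch below (schema and
names are entirely made up) reads the table grouped by Cassandra
partition, which is usually what you want for time series:

import com.datastax.spark.connector._
import org.apache.spark.{SparkConf, SparkContext}

object PartitionSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf()
        .setAppName("partition-sketch")
        .set("spark.cassandra.connection.host", "127.0.0.1"))

    // Hypothetical schema:
    //   CREATE TABLE ks.timeseries (
    //     sensor_id text, ts timestamp, value double,
    //     PRIMARY KEY ((sensor_id), ts));
    // sensor_id is the partition key, so all readings for one sensor sit
    // in a single Cassandra partition, clustered by ts.

    // spanBy groups the consecutive rows of each Cassandra partition,
    // letting you process the series sensor-by-sensor without a shuffle.
    val bySensor = sc.cassandraTable("ks", "timeseries")
      .spanBy(row => row.getString("sensor_id"))

    println(s"Cassandra partitions read: ${bySensor.count()}")
    sc.stop()
  }
}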
Regards,
Bryan Jeffrey
On Thu, Mar 10, 2016 at 10:36 AM, Prateek . wrote:
> Hi,
>
> I have a Spark batch job for reading time series data from a Cassandra
> table that has 50,000 rows.