Sounds reasonable.
Please consider posting questions about the Spark C* connector on their mailing
list if you have any.
On Sun, Feb 14, 2016 at 7:51 PM, Kevin Burton wrote:
> Afternoon.
>
> About 6 months ago I tried (and failed) to get Spark and Cassandra working
> together in production due to dependency hell.
Afternoon.
About 6 months ago I tried (and failed) to get Spark and Cassandra working
together in production due to dependency hell.
I'm going to give it another try!
Here's my general strategy.
I'm going to create a maven module for my code... with spark dependencies.
Then I'm going to get th
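As a rough sketch of that dependency setup (the message mentions Maven; the
sbt-style coordinates below are the equivalent, and the artifact versions are
assumptions, not taken from the thread):

  // build.sbt sketch: keep Spark itself "provided" and add the Cassandra connector.
  // Versions are illustrative and must match the Spark version actually deployed.
  libraryDependencies ++= Seq(
    "org.apache.spark"   %% "spark-core"                % "1.6.0" % "provided",
    "com.datastax.spark" %% "spark-cassandra-connector" % "1.5.0"
  )

Marking spark-core as provided keeps the cluster's own Spark jars authoritative,
which is one common way to reduce the kind of dependency clashes described above.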
... at org.apache.spark.rdd.RDD.foreach(RDD.scala:797)
Do you have any idea?
To conclude, I would like to put my map onto a Cassandra table from my
rddvalues: org.apache.spark.rdd.RDD[scala.collection.Map[String,Any]]
Best regards,
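For reference, a minimal sketch of one way to write such an RDD of maps to a
Cassandra table with the spark-cassandra-connector, by first reshaping each map
into a tuple matching the table's columns (the keyspace, table, and column names
below are made up, and spark.cassandra.connection.host is assumed to be set on
the SparkConf):

  import com.datastax.spark.connector._
  import org.apache.spark.rdd.RDD

  // rddvalues: RDD[Map[String, Any]] as in the message above
  def save(rddvalues: RDD[Map[String, Any]]): Unit = {
    rddvalues
      .map(m => (m("id").toString, m("value").toString)) // shape each Map into an (id, value) row
      .saveToCassandra("my_keyspace", "my_table", SomeColumns("id", "value"))
  }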
Thank you Cody!!
I am going to try with the two settings you have mentioned.
We are currently running with the Spark standalone cluster manager.
Thanks
Ankur
On Wed, Jan 7, 2015 at 1:20 PM, Cody Koeninger wrote:
> General ideas regarding too many open files:
>
> Make sure ulimit is actually being set.
General ideas regarding too many open files:
Make sure ulimit is actually being set, especially if you're on mesos
(because of https://issues.apache.org/jira/browse/MESOS-123). Find the pid
of the executor process, and cat /proc/<pid>/limits.
set spark.shuffle.consolidateFiles = true
try spark.shuffle.manager = sort
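A small sketch of how those two shuffle settings might be applied on a
Spark 1.x-era SparkConf (the app name is a placeholder):

  import org.apache.spark.{SparkConf, SparkContext}

  val conf = new SparkConf()
    .setAppName("cassandra-pipeline")              // placeholder name
    .set("spark.shuffle.consolidateFiles", "true") // fewer files per shuffle
    .set("spark.shuffle.manager", "sort")          // sort-based shuffle (Spark 1.x setting)

  val sc = new SparkContext(conf)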
Hello,
We are currently running our data pipeline on Spark, which uses Cassandra as
the data source.
We are currently facing an issue with the step where we create an RDD on data
in a Cassandra table and then try to run "flatMapToPair" to transform the
data, but we are running into "Too many open files" errors.
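For context, a rough Scala sketch of that step (the pipeline uses the Java API's
flatMapToPair; this is the analogous flatMap-to-pairs with the
spark-cassandra-connector, and the keyspace, table, and column names are invented):

  import com.datastax.spark.connector._
  import org.apache.spark.SparkContext

  def buildPairs(sc: SparkContext) =
    sc.cassandraTable("my_keyspace", "events")      // RDD[CassandraRow] over the source table
      .flatMap { row =>
        // emit zero or more (key, value) pairs per Cassandra row
        row.getList[String]("tags").map(tag => (tag, row.getString("id")))
      }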