from:"Jasleen Kaur"

Re: Spark Partition by Columns doesn't work properly

2016-06-08 Thread Jasleen Kaur

The GitHub repo is https://github.com/datastax/spark-cassandra-connector
The talk video and slides should be uploaded soon on the Spark Summit website.

On Wednesday, June 8, 2016, Chanh Le wrote:
> Thanks, I'll look into it. Any luck to get link related to.
>
> On Thu, Jun 9, 2016, 12

Re: Spark Partition by Columns doesn't work properly

2016-06-08 Thread Jasleen Kaur

Try using the DataStax package. There was a great talk at Spark Summit about it. It will take care of the boilerplate code so you can focus on real business value.

On Wednesday, June 8, 2016, Chanh Le wrote:
> Hi everyone,
> I tested the partition by columns of data frame but it’s not good I me

Writing to HDFS

2015-08-03 Thread Jasleen Kaur

I am executing a Spark job on a cluster in yarn-client mode (YARN cluster mode is not an option due to permission issues).
- num-executors 800
- spark.akka.frameSize=1024
- spark.default.parallelism=25600
- driver-memory=4G
- executor-memory=32G
My input size is around 1.5TB. My problem
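
For scale, the settings quoted in this post can be turned into rough per-partition and per-executor figures. This is only back-of-the-envelope arithmetic, not something from the original thread: it assumes the 1.5TB is decimal terabytes, that the data ends up evenly spread over exactly spark.default.parallelism partitions (in practice input partitions are often determined by the HDFS splits instead), and the variable names are made up for illustration.

```python
# Values quoted in the post; "around 1.5TB" is approximate.
input_bytes = 1.5 * 10**12       # assumption: decimal terabytes
num_executors = 800              # num-executors
default_parallelism = 25600      # spark.default.parallelism
executor_memory_gb = 32          # executor-memory

# Assumption: data is spread evenly over default_parallelism partitions.
partition_mb = input_bytes / default_parallelism / 10**6
partitions_per_executor = default_parallelism / num_executors
total_executor_memory_tb = num_executors * executor_memory_gb / 1000

print(f"approx. {partition_mb:.1f} MB per partition")
print(f"{partitions_per_executor:.0f} partitions per executor")
print(f"{total_executor_memory_tb:.1f} TB of executor memory requested in total")
```

Under these assumptions each partition is small relative to the 32G executor heap, so the numbers alone do not point to an obvious sizing problem.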

3 matches
