Can you share what error you are getting when the job fails?

On Thu, Feb 26, 2015 at 4:32 AM, Darin McBeath <ddmcbe...@yahoo.com.invalid> wrote:
> I'm using Spark 1.2, stand-alone cluster on ec2. I have a cluster of 8
> r3.8xlarge machines but limit the job to only 128 cores. I have also tried
> other things, such as setting 4 workers per r3.8xlarge with 67GB each, but
> this made no difference.
>
> The job frequently fails at the end in this step (saveAsHadoopFile). It
> will sometimes work.
>
> finalNewBaselinePairRDD is hash partitioned with 1024 partitions and has a
> total size of around 1TB. There are about 13.5M records in
> finalNewBaselinePairRDD. finalNewBaselinePairRDD is <String,String>.
>
> JavaPairRDD<Text, Text> finalBaselineRDDWritable =
>     finalNewBaselinePairRDD
>         .mapToPair(new ConvertToWritableTypes())
>         .persist(StorageLevel.MEMORY_AND_DISK_SER());
>
> // Save to hdfs (gzip)
> finalBaselineRDDWritable.saveAsHadoopFile("hdfs:///sparksync/",
>     Text.class, Text.class, SequenceFileOutputFormat.class,
>     org.apache.hadoop.io.compress.GzipCodec.class);
>
> If anyone has any tips for what I should look into, it would be appreciated.
>
> Thanks.
>
> Darin.

--
Arush Kharbanda || Technical Teamlead
ar...@sigmoidanalytics.com || www.sigmoidanalytics.com
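ConvertToWritableTypes isn't shown in the thread; assuming it simply wraps the String key/value pairs in hadoop Text writables so they can be written out as a SequenceFile, a minimal sketch would look like:

    import org.apache.hadoop.io.Text;
    import org.apache.spark.api.java.function.PairFunction;
    import scala.Tuple2;

    // Hypothetical sketch: converts each <String, String> record into a
    // hadoop-writable <Text, Text> pair for SequenceFile output.
    public class ConvertToWritableTypes
            implements PairFunction<Tuple2<String, String>, Text, Text> {
        @Override
        public Tuple2<Text, Text> call(Tuple2<String, String> record) {
            return new Tuple2<>(new Text(record._1()), new Text(record._2()));
        }
    }

With something like that in place, the mapToPair call above yields the JavaPairRDD<Text, Text> that saveAsHadoopFile then writes out as a gzip-compressed SequenceFile.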