Is speculative execution enabled? If it is, two attempts of the same task can end up writing and checkpointing the same output file, which would explain the rename failure in the stack trace below.
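To rule that out, here is a minimal sketch of how to check and disable it. This assumes only the standard spark.speculation key (off by default, but it may have been turned on in spark-defaults.conf); the app name is a placeholder:

    import org.apache.spark.{SparkConf, SparkContext}

    // Disable speculative execution so only one attempt of each task
    // writes (and checkpoints) its output file.
    val conf = new SparkConf()
      .setAppName("parquet-on-tachyon")   // hypothetical app name
      .set("spark.speculation", "false")
    val sc = new SparkContext(conf)

    // Verify what the running context actually picked up:
    println(sc.getConf.get("spark.speculation", "false"))

If the job still fails with speculation off, the rename conflict is coming from somewhere else (e.g. retried task attempts).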
Best,
Haoyuan

On Mon, Aug 11, 2014 at 8:08 AM, chutium <teng....@gmail.com> wrote:
> Sharing/reusing RDDs is always useful for many use cases. Is this possible
> by persisting an RDD on Tachyon?
>
> For example, off-heap persisting a named RDD into a given path (instead of
> /tmp_spark_tachyon/spark-xxx-xxx-xxx), or saveAsParquetFile on Tachyon.
>
> I tried to save a SchemaRDD on Tachyon:
>
>   val parquetFile = sqlContext.parquetFile(
>     "hdfs://test01.zala:8020/user/hive/warehouse/parquet_tables.db/some_table/")
>   parquetFile.saveAsParquetFile("tachyon://test01.zala:19998/parquet_1")
>
> but it always fails. The first error message is:
>
> 14/08/11 16:19:28 INFO storage.BlockManagerInfo: Added broadcast_6_piece0 in
> memory on test03.zala:37377 (size: 18.7 KB, free: 16.6 GB)
> 14/08/11 16:20:06 WARN scheduler.TaskSetManager: Lost task 1.0 in stage 3.0
> (TID 35, test04.zala): java.io.IOException:
> FailedToCheckpointException(message:Failed to rename
> hdfs://test01.zala:8020/tmp/tachyon/workers/1407760000003/31806/730 to
> hdfs://test01.zala:8020/tmp/tachyon/data/730)
>         tachyon.worker.WorkerClient.addCheckpoint(WorkerClient.java:112)
>         tachyon.client.TachyonFS.addCheckpoint(TachyonFS.java:168)
>         tachyon.client.FileOutStream.close(FileOutStream.java:104)
>         org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:70)
>         org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:103)
>         parquet.hadoop.ParquetFileWriter.end(ParquetFileWriter.java:321)
>         parquet.hadoop.InternalParquetRecordWriter.close(InternalParquetRecordWriter.java:111)
>         parquet.hadoop.ParquetRecordWriter.close(ParquetRecordWriter.java:73)
>         org.apache.spark.sql.parquet.InsertIntoParquetTable.org$apache$spark$sql$parquet$InsertIntoParquetTable$$writeShard$1(ParquetTableOperations.scala:259)
>         org.apache.spark.sql.parquet.InsertIntoParquetTable$$anonfun$saveAsHadoopFile$1.apply(ParquetTableOperations.scala:272)
>         org.apache.spark.sql.parquet.InsertIntoParquetTable$$anonfun$saveAsHadoopFile$1.apply(ParquetTableOperations.scala:272)
>         org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
>         org.apache.spark.scheduler.Task.run(Task.scala:54)
>         org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:199)
>         java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         java.lang.Thread.run(Thread.java:722)
>
> hdfs://test01.zala:8020/tmp/tachyon/ is already chmod'ed to 777, and both
> owner and group are the same as the Spark/Tachyon startup user.
>
> Off-heap persist, and saveAs* to a normal text file on Tachyon, both work
> fine.
>
> CDH 5.1.0, Spark 1.1.0 snapshot, Tachyon 0.6 snapshot

--
Haoyuan Li
AMPLab, EECS, UC Berkeley
http://www.cs.berkeley.edu/~haoyuan/
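As an aside on the named-path question in the quoted post: the off-heap store's root directory can be moved away from the /tmp_spark_tachyon default. A minimal sketch, assuming Spark 1.1-era settings (spark.tachyonStore.url and spark.tachyonStore.baseDir), the Tachyon master from the thread, and hypothetical input and base-dir paths; note that Spark still creates a per-application spark-* subdirectory under the base dir, so this moves the root but does not pin the full path:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.storage.StorageLevel

    val conf = new SparkConf()
      .setAppName("shared-rdds")                                     // hypothetical app name
      .set("spark.tachyonStore.url", "tachyon://test01.zala:19998")  // Tachyon master from the thread
      .set("spark.tachyonStore.baseDir", "/shared_rdds")             // hypothetical path, replaces /tmp_spark_tachyon
    val sc = new SparkContext(conf)

    // OFF_HEAP storage writes RDD blocks to the Tachyon store configured above.
    val rdd = sc.textFile("hdfs://test01.zala:8020/some/input")      // hypothetical input
    rdd.persist(StorageLevel.OFF_HEAP)
    rdd.count()  // materialize the RDD so its blocks are actually written to Tachyon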