Dear all,

When calling an external process with RDD.pipe I got the following error:
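For context, the call is roughly of this shape (an illustrative sketch with placeholder data and the `cat` command, not my actual job, which reads Parquet input):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object PipeSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("pipe-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)

    val lines = sc.parallelize(Seq("a", "b", "c"))

    // RDD.pipe writes each partition's elements to the external command's
    // stdin (one element per line) and turns its stdout lines into the
    // elements of the resulting RDD.
    val piped = lines.pipe("cat")
    piped.collect().foreach(println)

    // The InterruptedException below shows up during this shutdown /
    // context-cleaning phase, after the job itself has succeeded.
    sc.stop()
  }
}
```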
Not interrupting system thread Thread[process reaper,10,system]
Not interrupting system thread Thread[process reaper,10,system]
Not interrupting system thread Thread[process reaper,10,system]
14/09/01 10:10:51 ERROR Utils: Uncaught exception in thread SparkListenerBus
java.lang.InterruptedException
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:996)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1303)
        at java.util.concurrent.Semaphore.acquire(Semaphore.java:317)
        at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(LiveListenerBus.scala:48)
        at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply(LiveListenerBus.scala:47)
        at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply(LiveListenerBus.scala:47)
        at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1160)
        at org.apache.spark.scheduler.LiveListenerBus$$anon$1.run(LiveListenerBus.scala:46)
14/09/01 10:10:51 INFO ConnectionManager: Selector thread was interrupted!
14/09/01 10:10:51 ERROR ContextCleaner: Error in cleaning thread
java.lang.InterruptedException
        at java.lang.Object.wait(Native Method)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
        at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply$mcV$sp(ContextCleaner.scala:117)
        at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply(ContextCleaner.scala:115)
        at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply(ContextCleaner.scala:115)
        at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1160)
        at org.apache.spark.ContextCleaner.org$apache$spark$ContextCleaner$$keepCleaning(ContextCleaner.scala:114)
        at org.apache.spark.ContextCleaner$$anon$3.run(ContextCleaner.scala:65)
[success] Total time: 30 s, completed Sep 1, 2014 10:10:51
Sep 1, 2014 10:10:42 INFO: parquet.hadoop.ParquetInputFormat: Total input paths to process : 2
Sep 1, 2014 10:10:42 INFO: parquet.hadoop.ParquetFileReader: reading summary file: file:/home/jrabarisoa/wip/value-spark/data/videos.parquet/_metadata
Sep 1, 2014 10:10:42 WARNING: parquet.hadoop.ParquetRecordReader: Can not initialize counter due to context is not a instance of TaskInputOutputContext, but is org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl
Sep 1, 2014 10:10:42 WARNING: parquet.hadoop.ParquetRecordReader: Can not initialize counter due to context is not a instance of TaskInputOutputContext, but is org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl
Sep 1, 2014 10:10:42 INFO: parquet.hadoop.InternalParquetRecordReader: RecordReader initialized will read a total of 2 records.
Sep 1, 2014 10:10:42 INFO: parquet.hadoop.InternalParquetRecordReader: RecordReader initialized will read a total of 2 records.
Sep 1, 2014 10:10:42 INFO: parquet.hadoop.InternalParquetRecordReader: at row 0. reading next block
Sep 1, 2014 10:10:42 INFO: parquet.hadoop.InternalParquetRecordReader: at row 0. reading next block
Sep 1, 2014 10:10:42 INFO: parquet.hadoop.InternalParquetRecordReader: block read in memory in 21 ms. row count = 2
Sep 1, 2014 10:10:42 INFO: parquet.hadoop.InternalParquetRecordReader: block read in memory in 21 ms. row count = 2

Everything seems to work properly except the context cleaning step. Any ideas?

Best regards,

Jaonary