Hi,

I'm running Spark 1.1.0 in yarn-client mode (CDH 5.2.1) on a Xen-based cloud,
and my executors randomly fail with errors like the one below. I suspect a
cloud networking issue (a Xen driver bug?), but I'm wondering whether there is
any Spark/YARN workaround I could use to mitigate it.

Thanks,
Antony

15/01/14 10:36:44 ERROR storage.BlockFetcherIterator$BasicBlockFetcherIterator: Could not get block(s) from ConnectionManagerId(node02,40868)
java.io.IOException: sendMessageReliably failed without being ACK'd
        at org.apache.spark.network.ConnectionManager$$anonfun$14.apply(ConnectionManager.scala:865)
        at org.apache.spark.network.ConnectionManager$$anonfun$14.apply(ConnectionManager.scala:861)
        at org.apache.spark.network.ConnectionManager$MessageStatus.markDone(ConnectionManager.scala:66)
        at org.apache.spark.network.ConnectionManager$$anonfun$removeConnection$3.apply(ConnectionManager.scala:457)
        at org.apache.spark.network.ConnectionManager$$anonfun$removeConnection$3.apply(ConnectionManager.scala:455)
        at scala.collection.immutable.List.foreach(List.scala:318)
        at org.apache.spark.network.ConnectionManager.removeConnection(ConnectionManager.scala:455)
        at org.apache.spark.network.ConnectionManager$$anonfun$addListeners$3.apply(ConnectionManager.scala:434)
        at org.apache.spark.network.ConnectionManager$$anonfun$addListeners$3.apply(ConnectionManager.scala:434)
        at org.apache.spark.network.Connection.callOnCloseCallback(Connection.scala:156)
        at org.apache.spark.network.Connection.close(Connection.scala:128)
        at org.apache.spark.network.ConnectionManager.removeConnection(ConnectionManager.scala:476)
        at org.apache.spark.network.ConnectionManager$$anonfun$addListeners$3.apply(ConnectionManager.scala:434)
        at org.apache.spark.network.ConnectionManager$$anonfun$addListeners$3.apply(ConnectionManager.scala:434)
        at org.apache.spark.network.Connection.callOnCloseCallback(Connection.scala:156)
        at org.apache.spark.network.Connection.close(Connection.scala:128)
        at org.apache.spark.network.ReceivingConnection.read(Connection.scala:491)
        at org.apache.spark.network.ConnectionManager$$anon$7.run(ConnectionManager.scala:199)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
        at java.lang.Thread.run(Thread.java:662)
