I upped the ulimit to 128k files on all nodes. The job crashed again with
"DAGScheduler: Failed to run runJob at ReceiverTracker.scala:275".
I couldn't get the logs because I killed the job, and it looks like YARN
wiped the container logs (not sure why it wipes the logs under
/var/log/hadoop).
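
One thing that might help next time (a sketch, assuming log aggregation
isn't already enabled on this CDH5 cluster): turn on YARN log aggregation
so container logs are copied to HDFS when the application ends, then pull
them back with the yarn CLI even after the job has been killed:

  <!-- yarn-site.xml -->
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>

  # after the application exits (or is killed); <application_id> is a
  # placeholder for the real app id:
  yarn logs -applicationId <application_id>

With aggregation off, the NodeManager deletes the local container logs
shortly after the application finishes, which would explain the wiped
logs.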
On Thu, Aug 28, 2014 at 3:54 PM, Tim Smith wrote:
> Hi,
>
> Have a Spark-1.0.0 (CDH5) streaming job reading from Kafka that died with:
>
> 14/08/28 22:28:15 INFO DAGScheduler: Failed to run runJob at
> ReceiverTracker.scala:275
> Exception in thread "Thread-59" 14/08/28 22:28:15 INFO
> YarnClientClusterScheduler: Cancelling stage 2
Check the executor logs to find the error that caused the executor to fail.

TD
On Thu, Aug 28, 2014 at 3:54 PM, Tim Smith wrote:
> Hi,
>
> Have a Spark-1.0.0 (CDH5) streaming job reading from Kafka that died with:
>
> 14/08/28 22:28:15 INFO DAGScheduler: Failed to run runJob at
> ReceiverTracker.scala:275
> Exception in thread "Thread-59" 14/08/28 22:28:15 INFO
> YarnClientClusterScheduler: Cancelling stage 2
Hi,

Have a Spark-1.0.0 (CDH5) streaming job reading from Kafka that died with:

14/08/28 22:28:15 INFO DAGScheduler: Failed to run runJob at
ReceiverTracker.scala:275
Exception in thread "Thread-59" 14/08/28 22:28:15 INFO
YarnClientClusterScheduler: Cancelling stage 2
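
For reference, the job skeleton looks roughly like the sketch below (a
minimal reconstruction; the app name, ZooKeeper quorum, consumer group,
and topic map are placeholders, not the real ones). As I understand it,
KafkaUtils.createStream launches a long-running receiver task on an
executor through a runJob call inside ReceiverTracker, which is the call
site the "Failed to run runJob at ReceiverTracker.scala:275" message
points at when that executor dies.

  import org.apache.spark.SparkConf
  import org.apache.spark.storage.StorageLevel
  import org.apache.spark.streaming.{Seconds, StreamingContext}
  import org.apache.spark.streaming.kafka.KafkaUtils

  object KafkaStreamJob {
    def main(args: Array[String]) {
      val conf = new SparkConf().setAppName("kafka-stream-test")
      val ssc = new StreamingContext(conf, Seconds(10))

      // Placeholder ZK quorum and topic -> receiver-thread count map.
      val zkQuorum = "zk1:2181,zk2:2181"
      val topics = Map("events" -> 1)

      // createStream starts a Kafka receiver on one executor; if that
      // executor dies, the receiver job fails and the ReceiverTracker
      // error above is what the driver logs.
      val stream = KafkaUtils.createStream(ssc, zkQuorum, "test-group",
        topics, StorageLevel.MEMORY_AND_DISK_SER)

      stream.count().print()

      ssc.start()
      ssc.awaitTermination()
    }
  }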