Re: a question about fault recovery

2014-03-04 Thread dachuan

Thanks for the clarification. I now understand that only failed tasks will be re-scheduled, and only the input partitions of those tasks will be re-computed. Another confusing point from the paper is: "Because of these properties, D-Streams can parallelize recovery over hundreds of cores and recover

Re: a question about fault recovery

2014-03-04 Thread Reynold Xin

BlockManager is only responsible for in-memory/on-disk storage. It has nothing to do with re-computation. All of the recomputation / retry logic is handled in the DAGScheduler. Note that when a node crashes, because of lazy evaluation, there is no task that needs to be re-run. Those tasks are re-run only wh
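
To illustrate the split between storage and recomputation described in this reply, here is a minimal pure-Python sketch. It is not Spark code: ToyRDD, map_partitions, get_partition and lose_partitions are hypothetical names invented for this example, and the cache dictionary only stands in for block storage. The assumption is that a lost partition is rebuilt from its lineage the next time something asks for it, and not at the moment it is lost.

```python
class ToyRDD:
    """Toy stand-in for an RDD (hypothetical, not Spark's API)."""

    def __init__(self, num_partitions, parent=None, fn=None, source=None):
        self.num_partitions = num_partitions
        self.parent = parent      # lineage: the dataset this one is derived from
        self.fn = fn              # per-partition transformation applied to the parent
        self.source = source      # root datasets only: partition index -> records
        self.cache = {}           # stands in for in-memory/on-disk block storage
        self.compute_count = 0    # number of partition computations that have run

    def map_partitions(self, fn):
        # Lazy: only records the lineage, computes nothing.
        return ToyRDD(self.num_partitions, parent=self, fn=fn)

    def get_partition(self, i):
        # Storage lookup first; fall back to recomputation from lineage.
        if i in self.cache:
            return self.cache[i]
        self.compute_count += 1
        if self.parent is None:
            data = list(self.source(i))
        else:
            data = self.fn(self.parent.get_partition(i))
        self.cache[i] = data
        return data

    def collect(self):
        # An "action": forces every partition to be available.
        out = []
        for i in range(self.num_partitions):
            out.extend(self.get_partition(i))
        return out

    def lose_partitions(self, lost):
        # Simulates a node crash: stored blocks vanish, nothing is re-run here.
        for i in lost:
            self.cache.pop(i, None)


base = ToyRDD(4, source=lambda i: range(i * 3, i * 3 + 3))
doubled = base.map_partitions(lambda part: [x * 2 for x in part])

first = doubled.collect()                  # computes all 4 partitions
doubled.lose_partitions([1, 2])            # "crash": two partitions are lost
count_after_crash = doubled.compute_count  # unchanged by the crash itself
second = doubled.collect()                 # only the 2 lost partitions are rebuilt
```

In this toy, the crash itself triggers no work; the two missing partitions are recomputed only when the second collect() asks for them, and the surviving partitions are served from the cache.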