Do we not have an option to store the map results in HDFS?

Billy

"Owen O'Malley" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
It isn't optimal, but it is the expected behavior. In general, when we lose a TaskTracker, we want its map outputs regenerated so that any reduces that need to re-run (including speculatively executed ones) can fetch them. We could handle it as a special case if:
  1. We didn't lose any running reduces.
  2. All of the reduces (including speculative tasks) are done with shuffling.
  3. We don't plan on launching any more speculative reduces.
If all three hold, we don't need to re-run the map tasks. Actually implementing that special case would be a pretty involved patch to the JobTracker/Schedulers.
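
The three conditions above can be read as a single predicate. What follows is only a rough sketch of that check, not JobTracker code: ReduceAttempt, its fields, and can_skip_map_rerun are hypothetical names made up for illustration, and the real scheduler state is considerably more involved.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ReduceAttempt:
    """Hypothetical view of one reduce attempt (regular or speculative)."""
    on_lost_tracker: bool  # was this attempt running on the lost TaskTracker?
    shuffle_done: bool     # has it finished fetching all of its map outputs?


def can_skip_map_rerun(reduces: List[ReduceAttempt],
                       more_speculative_planned: bool) -> bool:
    """Return True only if all three conditions from the list hold."""
    # 1. No running reduce was lost along with the TaskTracker.
    no_reduce_lost = not any(r.on_lost_tracker for r in reduces)
    # 2. Every reduce attempt, speculative ones included, is done shuffling.
    all_shuffled = all(r.shuffle_done for r in reduces)
    # 3. No further speculative reduces will be launched.
    return no_reduce_lost and all_shuffled and not more_speculative_planned
```

If any one condition fails, some reduce may still need to fetch the lost map outputs, so the maps have to be re-run.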

-- Owen


