Re: Delaying failed task retries + giving failing tasks to different nodes

2015-04-02 Thread Akhil Das
I think this is the configuration you are looking for: *spark.locality.wait*: Number of milliseconds to wait to launch a data-local task before giving up and launching it on a less-local node. The same wait will be used to step through multiple locality levels (process-local, node-local, rack-local and then any).
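If it helps, here is a sketch of setting that wait at submit time. The 10000 ms value and the `your_job.py` script name are illustrative, not from the thread; Spark also accepts per-level overrides such as `spark.locality.wait.node`:

```shell
# Raise the locality wait so the scheduler holds out longer for a
# data-local slot before launching the task on a less-local node.
# Default is 3000 ms; 10000 here is just an example value.
spark-submit \
  --conf spark.locality.wait=10000 \
  --conf spark.locality.wait.node=10000 \
  your_job.py
```

The same keys can equally be set on a `SparkConf` in code or in `spark-defaults.conf`.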

Delaying failed task retries + giving failing tasks to different nodes

2015-04-02 Thread Stephen Merity
Hi there, I've been using Spark to process 33,000 gzipped files that contain billions of JSON records (the metadata [WAT] dataset from Common Crawl). I've hit a few issues and have not yet found the answers in the documentation or via search. This may well just be me not finding the right pages t
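For the retry half of the subject line, one setting worth knowing about (the value of 8 and the script name below are illustrative, not from the thread) is `spark.task.maxFailures`, which bounds how many times any one task may fail before the whole job is aborted; retried attempts are scheduled like fresh tasks and can land on different executors:

```shell
# Allow each task up to 8 failed attempts (default is 4) before the
# job as a whole is failed.
spark-submit \
  --conf spark.task.maxFailures=8 \
  your_job.py
```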