I think these are the configurations you are looking for:
*spark.locality.wait*: Number of milliseconds to wait to launch a
data-local task before giving up and launching it on a less-local node. The
same wait is used to step through multiple locality levels
(process-local, node-local, rack-local and then any).
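
As a minimal sketch, these settings could be tuned in spark-defaults.conf; the values below are illustrative assumptions, not recommendations, and the per-level keys override the global wait for their level:

```
# spark-defaults.conf -- illustrative values only
spark.locality.wait          3s    # global fallback wait per locality level
spark.locality.wait.process  3s    # wait for a process-local slot
spark.locality.wait.node     3s    # wait for a node-local slot
spark.locality.wait.rack     3s    # wait for a rack-local slot
```

Setting a level's wait to 0 makes the scheduler skip waiting at that level entirely, which can help when tasks are short and data transfer is cheap.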
Hi there,
I've been using Spark for processing 33,000 gzipped files that contain
billions of JSON records (the metadata [WAT] dataset from Common Crawl).
I've hit a few issues and haven't yet found answers in the
documentation or by searching. This may well just be me not finding the
right pages to read.