[ https://issues.apache.org/jira/browse/SPARK-21867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16288298#comment-16288298 ]

Eric Vandenberg commented on SPARK-21867:
-----------------------------------------

1. The default would be 1, so this does not change default behavior.  It is 
currently configurable to a higher value (spark.shuffle.async.num.sorter=2).
2. Yes, the number of spill files could increase; that's one reason this is not 
on by default.  It could be an issue if it hits file system limits, etc., in 
extreme cases.  For the jobs we've tested, this wasn't a problem.  We think 
this improvement has the biggest impact on larger jobs (we've seen CPU reduction 
of ~30% in some large jobs); it may not help as much for smaller jobs with fewer 
spills.
3. When the sorter hits the threshold, it kicks off an asynchronous spill and 
then continues inserting into another sorter (assuming one is available).  It 
could make sense to raise the threshold, which would result in larger spill 
files.  There is some risk that raising it too high could cause an OOM, 
requiring it to be lowered again.  I'm thinking the algorithm could be improved 
by more accurately calculating and enforcing the threshold based on available 
memory over time; however, doing so would require exposing memory allocation 
metrics that are not currently available (in the memory manager), so we opted 
not to do that for now.
4. Yes, too many open files/buffers could be an issue.  So for now this is 
something to look at enabling case by case as part of performance tuning.
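The threshold-triggered handoff described in point 3 can be sketched roughly as
follows.  This is an illustrative toy, not Spark's actual implementation: the
class name, the list-based "sorters", and the threshold-by-record-count are all
stand-ins (the real code tracks memory, and spark.shuffle.async.num.sorter
controls the pool size).  When the active sorter fills, it is handed to a
background thread for sorting/spilling and inserting continues into a spare
sorter from the pool, blocking only if none is free:

```python
import threading
import queue

class AsyncSpillWriter:
    """Toy sketch of threshold-triggered async spilling (names illustrative,
    not Spark APIs).  A pool of sorters lets record insertion overlap with
    sorting/spilling of a full sorter on a background thread."""

    def __init__(self, num_sorters=2, threshold=4):
        self.free = queue.Queue()          # sorters available for inserting
        for _ in range(num_sorters):
            self.free.put([])              # each "sorter" is just a list here
        self.threshold = threshold         # stand-in for the memory threshold
        self.active = self.free.get()
        self.spills = []                   # completed spill "files"
        self.lock = threading.Lock()
        self.pending = []                  # in-flight background spills

    def insert(self, record):
        self.active.append(record)
        if len(self.active) >= self.threshold:
            # Hand the full sorter to a background thread; grab a spare.
            # free.get() blocks if no sorter is available (backpressure).
            full, self.active = self.active, self.free.get()
            t = threading.Thread(target=self._spill, args=(full,))
            t.start()
            self.pending.append(t)

    def _spill(self, sorter):
        data = sorted(sorter)              # stand-in for sort + write to disk
        with self.lock:
            self.spills.append(data)
        sorter.clear()
        self.free.put(sorter)              # recycle the sorter for reuse

    def close(self):
        # Spill whatever remains, then wait for background spills to finish.
        if self.active:
            full, self.active = self.active, None
            self._spill(full)
        for t in self.pending:
            t.join()
```

Note how the pool size bounds memory: with num_sorters=2 at most one spill runs
while one sorter accepts records, which is also why the spill-file count grows
as described in point 2.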


> Support async spilling in UnsafeShuffleWriter
> ---------------------------------------------
>
>                 Key: SPARK-21867
>                 URL: https://issues.apache.org/jira/browse/SPARK-21867
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 2.2.0
>            Reporter: Sital Kedia
>            Priority: Minor
>         Attachments: Async ShuffleExternalSorter.pdf
>
>
> Currently, Spark tasks are single-threaded, but we have seen that 
> multi-threading parts of a task could greatly improve job performance. For 
> example, profiling our map tasks, which read large amounts of data from HDFS 
> and spill to disk, we see that they are blocked on HDFS reads and spilling 
> for the majority of the time. Since both of these operations are IO 
> intensive, the average CPU consumption during the map phase is significantly 
> low. In theory, the HDFS reads and the spilling could be done in parallel if 
> we had additional memory to store data read from HDFS while we are spilling 
> the last batch read.
> Let's say we have 1G of shuffle memory available per task. Currently, a map 
> task reads from HDFS and stores the records in the available memory buffer. 
> Once we hit the memory limit and there is no more space to store records, we 
> sort and spill the content to disk. While we are spilling to disk, since we 
> do not have any available memory, we cannot read from HDFS concurrently. 
> Here we propose supporting async spilling in UnsafeShuffleWriter, so that we 
> can keep reading from HDFS while sort and spill happen asynchronously. Let's 
> say the total 1G of shuffle memory is split into two regions - an active 
> region and a spilling region - each of size 500 MB. We start by reading from 
> HDFS and filling the active region. Once we hit the limit of the active 
> region, we issue an asynchronous spill while flipping the active region and 
> the spilling region. While the spill is happening asynchronously, we still 
> have 500 MB of memory available to read data from HDFS. This way we can 
> amortize the high disk/network IO cost of spilling.
> We made a prototype hack to implement this feature and saw our map tasks run 
> as much as 40% faster. 
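The two-region scheme in the quoted proposal amounts to classic double
buffering.  A minimal sketch, assuming the same setup (all names are
illustrative, not Spark code; region capacity stands in for the 500 MB halves,
and the input iterable stands in for the HDFS read):

```python
import threading

def double_buffered_spill(records, region_capacity):
    """Sketch of the proposed active/spilling region flip: fill the active
    region while the other region is sorted and 'spilled' on a background
    thread, flipping the two when the active region fills."""
    regions = [[], []]
    active = 0
    spill_done = threading.Event()
    spill_done.set()                    # no spill in flight initially
    spills = []                         # completed spill "files"

    def spill(region):
        spills.append(sorted(region))   # stand-in for sort + write to disk
        region.clear()
        spill_done.set()                # region is free to become active again

    for rec in records:                 # stand-in for reading from HDFS
        regions[active].append(rec)
        if len(regions[active]) >= region_capacity:
            spill_done.wait()           # previous async spill must be done
            spill_done.clear()
            threading.Thread(target=spill, args=(regions[active],)).start()
            active = 1 - active         # flip active and spilling regions

    spill_done.wait()                   # drain the in-flight spill
    if regions[active]:
        spill(regions[active])          # final partial region
    return spills
```

For example, `double_buffered_spill([3, 1, 2, 6, 5, 4, 7], 3)` produces three
spills while the "reads" of the second and third batches overlap the spilling
of the previous one, which is exactly the overlap that recovers the otherwise
idle CPU time described above.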



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
