[ 
https://issues.apache.org/jira/browse/HIVE-20113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17904833#comment-17904833
 ] 

Sungwoo Park commented on HIVE-20113:
-------------------------------------

For the record: with tez.runtime.pipelined-shuffle.enabled=false and 
tez.runtime.enable.final-merge.in.output=true, every task is guaranteed to 
produce a single output file in the end, so we can revert this commit and 
take advantage of one-to-one edges.
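
As a minimal sketch, the two settings named above could be applied per 
session as shown here. This assumes a Hive-on-Tez session in which 
tez.runtime.* properties can be overridden with SET; some deployments may 
need them in tez-site.xml instead. The property names and values are taken 
directly from this comment.

```sql
-- Assumption: run in a Hive-on-Tez session before the query of interest.
-- Turn off pipelined shuffle for task output.
SET tez.runtime.pipelined-shuffle.enabled=false;
-- Keep the final merge so that each task ends with a single output file.
SET tez.runtime.enable.final-merge.in.output=true;
```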


> Shuffle avoidance: Disable 1-1 edges for sorted shuffle 
> --------------------------------------------------------
>
>                 Key: HIVE-20113
>                 URL: https://issues.apache.org/jira/browse/HIVE-20113
>             Project: Hive
>          Issue Type: Bug
>          Components: Tez
>            Reporter: Gopal Vijayaraghavan
>            Assignee: Vineet Garg
>            Priority: Major
>              Labels: Branch3Candidate
>             Fix For: 4.0.0-alpha-1
>
>         Attachments: HIVE-20113.1.patch, HIVE-20113.10.patch, 
> HIVE-20113.10.patch, HIVE-20113.2.patch, HIVE-20113.3.patch, 
> HIVE-20113.4.patch, HIVE-20113.4.patch, HIVE-20113.5.patch, 
> HIVE-20113.6.patch, HIVE-20113.7.patch, HIVE-20113.8.patch, HIVE-20113.9.patch
>
>
> The sorted shuffle avoidance can have some issues when the shuffle data gets 
> broken up into multiple chunks on disk.
> The 1-1 edge cannot skip the Tez final merge. There is no reason for a 1-1 
> edge to have a final merge at all: it should open a single compressed file 
> and write a single index entry.
> Until the shuffle issue is resolved and much more testing has been done, it 
> is prudent to disable the optimization for sorted shuffle edges and stop 
> rewriting RS(sorted) = = = RS(sorted) into RS(sorted) = = = RS(FORWARD).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
