[ https://issues.apache.org/jira/browse/HIVE-20113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17904833#comment-17904833 ]
Sungwoo Park commented on HIVE-20113:
-------------------------------------

For the record, by setting tez.runtime.pipelined-shuffle.enabled=false and tez.runtime.enable.final-merge.in.output=true, we can revert this commit and take advantage of one-to-one edges because every task is guaranteed to produce a single output file in the end.

> Shuffle avoidance: Disable 1-1 edges for sorted shuffle
> --------------------------------------------------------
>
>                 Key: HIVE-20113
>                 URL: https://issues.apache.org/jira/browse/HIVE-20113
>             Project: Hive
>          Issue Type: Bug
>          Components: Tez
>            Reporter: Gopal Vijayaraghavan
>            Assignee: Vineet Garg
>            Priority: Major
>              Labels: Branch3Candidate
>             Fix For: 4.0.0-alpha-1
>
>         Attachments: HIVE-20113.1.patch, HIVE-20113.10.patch,
> HIVE-20113.10.patch, HIVE-20113.2.patch, HIVE-20113.3.patch,
> HIVE-20113.4.patch, HIVE-20113.4.patch, HIVE-20113.5.patch,
> HIVE-20113.6.patch, HIVE-20113.7.patch, HIVE-20113.8.patch, HIVE-20113.9.patch
>
>
> The sorted shuffle avoidance can have some issues when the shuffle data gets
> broken up into multiple chunks on disk.
> The 1-1 edge cannot skip the tez final merge - there's no reason for 1-1 to
> have a final merge at all, it should open a single compressed file and write
> a single index entry.
> Until the shuffle issue is resolved & a lot more testing, it is prudent to
> disable the optimization for sorted shuffle edges and stop rewriting the
> RS(sorted) = = = RS(sorted) into RS(sorted) = = = RS(FORWARD).

--
This message was sent by Atlassian Jira
(v8.20.10#820010)