[ https://issues.apache.org/jira/browse/HIVE-11267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14629069#comment-14629069 ]
Chengxiang Li commented on HIVE-11267:
--------------------------------------

[~xuefuz], I looked at the FileSinkOperator implementation earlier. Its write logic is quite complicated, and writing multiple times would break several of its design rules. I don't want to change FileSinkOperator much for this special-case optimization. Fetching twice would need only a few lines of code and would be more efficient (SparkWork writes only once). We can also check whether a FetchTask exists; if it does not, we can skip this optimization.

> Combine equavilent leaf works in SparkWork[Spark Branch]
> --------------------------------------------------------
>
>                 Key: HIVE-11267
>                 URL: https://issues.apache.org/jira/browse/HIVE-11267
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Chengxiang Li
>            Assignee: Chengxiang Li
>            Priority: Minor
>
> There can be multiple leaf works in a SparkWork, for example in a self-union
> query. If the subqueries are identical, we may combine them, execute the
> combined work just once, and then fetch twice in FetchTask.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)