[ 
https://issues.apache.org/jira/browse/HIVE-11267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14629069#comment-14629069
 ] 

Chengxiang Li commented on HIVE-11267:
--------------------------------------

[~xuefuz], I looked at the FileSinkOperator implementation earlier. Its write
logic is quite complicated, and writing multiple times would break several of
its design rules. I don't want to change FileSinkOperator much for this
special-case optimization. Fetching twice would take just a few lines of code
and would be more efficient (SparkWork only writes once). We can also check
whether a FetchTask exists and skip this optimization if it does not.
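
A minimal sketch of the idea, written in Python for brevity and not taken from
Hive: leaf works with the same plan are run once, and the fetch side reads the
surviving output once per original leaf. LeafWork, plan_signature and
combine_equivalent_leaves are hypothetical names used only for illustration;
how two works would actually be compared for equivalence is left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class LeafWork:
    # Hypothetical stand-in for a leaf work in a SparkWork.
    name: str
    # Hypothetical stand-in for "these two works have equivalent plans".
    plan_signature: str


def combine_equivalent_leaves(
    leaves: List[LeafWork], has_fetch_task: bool
) -> Tuple[List[LeafWork], List[str]]:
    """Return (works to execute, output to fetch for each original leaf)."""
    if not has_fetch_task:
        # No FetchTask to read an output twice, so skip the optimization.
        return list(leaves), [leaf.name for leaf in leaves]

    representative: Dict[str, LeafWork] = {}
    works_to_run: List[LeafWork] = []
    fetch_sources: List[str] = []
    for leaf in leaves:
        if leaf.plan_signature not in representative:
            # First work with this plan: keep it and execute it once.
            representative[leaf.plan_signature] = leaf
            works_to_run.append(leaf)
        # Every original leaf still contributes one fetch, possibly from
        # the output of an earlier, equivalent work.
        fetch_sources.append(representative[leaf.plan_signature].name)
    return works_to_run, fetch_sources
```

For a self-union, the two leaves share a signature, so one work runs and its
output is listed twice in the fetch sources.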


> Combine equivalent leaf works in SparkWork[Spark Branch]
> --------------------------------------------------------
>
>                 Key: HIVE-11267
>                 URL: https://issues.apache.org/jira/browse/HIVE-11267
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Chengxiang Li
>            Assignee: Chengxiang Li
>            Priority: Minor
>
> A SparkWork can contain multiple leaf works, for example in a self-union
> query. If the subqueries are identical, we can combine them, execute the
> combined work once, and then fetch its result twice in FetchTask.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
