[ 
https://issues.apache.org/jira/browse/HIVE-7873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169837#comment-14169837
 ] 

Xuefu Zhang commented on HIVE-7873:
-----------------------------------

Re: explanation for the numbers

The higher numbers are all dominated by disk access. #1 causes data spill 
because data is not emitted until the close() call. The spill happens outside 
RowContainer, though. 

When there is no lazy exec, all rows are produced before they can be consumed. 
Thus, all numbers are high. The same theory applies to lazy exec with disk 
spill (#3 and #5 in the second set). 
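
To illustrate the point about rows being produced before they can be consumed, 
here is a minimal sketch in Python. It is not the actual 
HiveBaseFunctionResultList code. The names process, eager_results and 
LazyResults are hypothetical, and the row counts are made up for illustration, 
not taken from the benchmark. It only compares how many output rows have to be 
held at once in each mode.

```python
from collections import deque


def process(row):
    # Hypothetical per-row function: emits two output rows per input row.
    return [(row, "a"), (row, "b")]


def eager_results(rows):
    """No lazy exec: produce every output row before the consumer sees any."""
    buffer = []
    for row in rows:
        buffer.extend(process(row))
    # The whole result is held at once; a real system would spill it to disk.
    return iter(buffer), len(buffer)


class LazyResults:
    """Lazy exec: process the next input row only when the consumer needs output."""

    def __init__(self, rows):
        self._rows = iter(rows)
        self._pending = deque()
        self.peak = 0  # largest number of rows buffered at any one time

    def __iter__(self):
        return self

    def __next__(self):
        while not self._pending:
            row = next(self._rows)  # StopIteration here ends the iteration
            self._pending.extend(process(row))
            self.peak = max(self.peak, len(self._pending))
        return self._pending.popleft()


eager_iter, eager_peak = eager_results(range(1000))
eager_out = list(eager_iter)

lazy = LazyResults(range(1000))
lazy_out = list(lazy)

print("eager peak buffered rows:", eager_peak)
print("lazy peak buffered rows:", lazy.peak)
```

Both modes yield the same rows in the same order. The eager version has to 
hold the full output before the first row is consumed, while the lazy version 
holds only the output of a single input row at a time.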

The true synchronization cost is the diff between #2 and #4 in the second set, 
which seems acceptable.

> Re-enable lazy HiveBaseFunctionResultList
> -----------------------------------------
>
>                 Key: HIVE-7873
>                 URL: https://issues.apache.org/jira/browse/HIVE-7873
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Brock Noland
>            Assignee: Jimmy Xiang
>              Labels: Spark-M4, spark
>         Attachments: HIVE-7873.1-spark.patch, HIVE-7873.2-spark.patch
>
>
> We removed this optimization in HIVE-7799.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
