[ 
https://issues.apache.org/jira/browse/HIVE-17545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16180152#comment-16180152
 ] 

Rui Li commented on HIVE-17545:
-------------------------------

[~kellyzly], if you apply two different transformations to an RDD, that RDD 
will be evaluated twice when we compute the child RDDs. To avoid this, you need 
to cache the RDD. So if we combine equivalent works without caching them, we 
can't get rid of the duplicated computation. The descriptions of HIVE-10550 and 
HIVE-10844 also mention how combining works depends on RDD caching.
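
To illustrate the point, here is a rough sketch in plain Python. It is a toy 
model of lazy lineage and not Spark's or Hive's actual API: LazyDataset, 
expensive_scan and count_parent_evaluations are hypothetical names made up for 
this example. It only shows that two children of one uncached parent trigger 
two evaluations of that parent, while caching the parent brings it down to one.

```python
# Toy model of lazy lineage, NOT Spark's API. A LazyDataset recomputes its
# parent on every evaluation unless cache() has been called on that parent.
class LazyDataset:
    def __init__(self, compute):
        self._compute = compute    # zero-argument function producing a list
        self._use_cache = False
        self._cached = None

    def map(self, fn):
        # The child pulls from this dataset each time the child is evaluated.
        return LazyDataset(lambda: [fn(x) for x in self.collect()])

    def cache(self):
        # Mark the dataset so its first result is kept and reused.
        self._use_cache = True
        return self

    def collect(self):
        if not self._use_cache:
            return self._compute()
        if self._cached is None:
            self._cached = self._compute()
        return self._cached


def count_parent_evaluations(use_cache):
    evaluations = []  # one entry is appended per evaluation of the parent

    def expensive_scan():
        evaluations.append(1)
        return [1, 2, 3]

    parent = LazyDataset(expensive_scan)
    if use_cache:
        parent.cache()

    # Two different transformations applied to the same parent.
    doubled = parent.map(lambda x: x * 2)
    squared = parent.map(lambda x: x * x)
    doubled.collect()
    squared.collect()
    return len(evaluations)
```

In this model, count_parent_evaluations(False) evaluates the shared parent once 
per child, and count_parent_evaluations(True) evaluates it only once. That is 
the same reason combining equivalent works only pays off when the combined 
work's RDD is cached.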

> Make HoS RDD Cacheing Optimization Configurable
> -----------------------------------------------
>
>                 Key: HIVE-17545
>                 URL: https://issues.apache.org/jira/browse/HIVE-17545
>             Project: Hive
>          Issue Type: Improvement
>          Components: Physical Optimizer, Spark
>            Reporter: Sahil Takiar
>            Assignee: Sahil Takiar
>         Attachments: HIVE-17545.1.patch, HIVE-17545.2.patch
>
>
> The RDD caching optimization added in HIVE-10550 is enabled by default. We 
> should make it configurable in case users want to disable it. We can leave it 
> on by default to preserve backwards compatibility.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
