[ 
https://issues.apache.org/jira/browse/HIVE-28609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-28609:
----------------------------------
    Labels: pull-request-available  (was: )

> HiveSequenceFileInputFormat should be cloned or not be cached
> -------------------------------------------------------------
>
>                 Key: HIVE-28609
>                 URL: https://issues.apache.org/jira/browse/HIVE-28609
>             Project: Hive
>          Issue Type: Improvement
>      Security Level: Public(Viewable by anyone) 
>            Reporter: László Bodor
>            Priority: Major
>              Labels: pull-request-available
>
> HIVE-28530 introduced a ThreadLocal for storing files in 
> HiveSequenceFileInputFormat because there was contention when accessing 
> the files in a shared/cached instance. I feel we fixed the problem in the 
> wrong place. Instead of preventing this instance from being cached, that 
> change introduced a ThreadLocal, which seems weird and hacky and makes the 
> code reader think that the input format instance must be cached, which is 
> not the case. This format class is instantiated by 
> [reflection|https://github.com/apache/hive/blob/18f34e75da0141d37d9a8f1cef4f7f64ba09fadb/ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java#L229],
>  and such reflectively created instances are quite often cached for 
> performance reasons. We can still cache an instance and clone it (maybe by 
> implementing some interface) to keep the performance benefit.
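
The "cache an instance and clone it" idea above could look roughly like the following Java sketch. It is an illustration only, not the change proposed in the pull request: CopyableInputFormat, SketchSequenceFileInputFormat and InputFormatCache are hypothetical names that do not exist in Hive or Hadoop, and the real cache in FetchOperator may be structured differently. The sketch pays the reflection cost once per class and then hands every caller its own copy, so per-use state (the file list) is never shared and no ThreadLocal is needed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical marker interface (not part of Hive or Hadoop): formats that
// hold per-use mutable state implement it so a cache can hand out copies.
interface CopyableInputFormat {
    CopyableInputFormat copy();
}

// Hypothetical stand-in for an input format that keeps a mutable file list.
class SketchSequenceFileInputFormat implements CopyableInputFormat {
    private final List<String> files = new ArrayList<>();

    public SketchSequenceFileInputFormat() {
    }

    void addFile(String path) {
        files.add(path);
    }

    List<String> getFiles() {
        return files;
    }

    @Override
    public CopyableInputFormat copy() {
        // Plain constructor call: no reflection, and no state carried over.
        return new SketchSequenceFileInputFormat();
    }
}

// Hypothetical cache: one reflectively created prototype per class.
class InputFormatCache {
    private static final Map<Class<?>, Object> PROTOTYPES = new ConcurrentHashMap<>();

    static <T> T get(Class<T> clazz) {
        Object prototype = PROTOTYPES.computeIfAbsent(clazz, InputFormatCache::instantiate);
        if (prototype instanceof CopyableInputFormat) {
            // Stateful formats: every caller receives its own fresh copy.
            return clazz.cast(((CopyableInputFormat) prototype).copy());
        }
        // Formats that do not opt in keep the shared cached instance.
        return clazz.cast(prototype);
    }

    private static Object instantiate(Class<?> clazz) {
        try {
            return clazz.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + clazz.getName(), e);
        }
    }
}
```

With this shape, two calls to InputFormatCache.get(SketchSequenceFileInputFormat.class) return distinct objects, so files added through one do not appear in the other. Whether copy() should return a blank instance or carry over configuration is a design choice the issue leaves open.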



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
