[ https://issues.apache.org/jira/browse/HIVE-25025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Work on HIVE-25025 started by WangHualei.
-----------------------------------------
> Distcp In MoveTask may cause stats info lost
> --------------------------------------------
>
>                 Key: HIVE-25025
>                 URL: https://issues.apache.org/jira/browse/HIVE-25025
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>            Reporter: WangHualei
>            Assignee: WangHualei
>            Priority: Major
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> After enabling _Run as end user instead of Hive user_, when an insert
> overwrite is executed, MoveTask uses the distcp method if the source size in
> bytes is greater than HIVE_EXEC_COPYFILE_MAXSIZE and the source file count is
> greater than HIVE_EXEC_COPYFILE_MAXNUMFILES. This may cause the tmp stats
> file to be lost.
>
> Example:
> set hive.exec.copyfile.maxsize=0;
> set hive.exec.copyfile.maxnumfiles=0;
> insert overwrite table abc_new select * from abc;
> select count(1) from abc_new;
> select * from abc_new;
>
> The count(1) result will then be 0, but select * will display the real data,
> because the stats info was lost.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)