[ https://issues.apache.org/jira/browse/HIVE-8062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14131174#comment-14131174 ]
Pengcheng Xiong commented on HIVE-8062:
---------------------------------------

+1

> Stats collection for columns fails on a partitioned table with null values in
> partitioning column
> -------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-8062
>                 URL: https://issues.apache.org/jira/browse/HIVE-8062
>             Project: Hive
>          Issue Type: Bug
>          Components: Statistics
>    Affects Versions: 0.14.0
>            Reporter: Deepesh Khandelwal
>            Assignee: Ashutosh Chauhan
>         Attachments: HIVE-8062.patch
>
>
> Steps to reproduce:
> 1. Create a data file abc.txt with the following contents:
> {noformat}
> a,1
> b,
> {noformat}
> 2. Use the Hive CLI to create and load the partitioned table:
> {noformat}
> hive> create table abc(a string, b int);
> OK
> Time taken: 0.272 seconds
> hive> load data local inpath 'abc.txt' into table abc;
> Loading data to table default.abc
> Table default.abc stats: [numFiles=1, numRows=0, totalSize=7, rawDataSize=0]
> OK
> Time taken: 0.463 seconds
> hive> create table abc1(a string) partitioned by (b int);
> OK
> Time taken: 0.098 seconds
> hive> set hive.exec.dynamic.partition.mode=nonstrict;
> hive> insert overwrite table abc1 partition (b) select a, b from abc;
> Query ID = hrt_qa_20140911210909_1200fae7-1e18-4e0d-b74f-040453c27cff
> Total jobs = 1
> Launching Job 1 out of 1
> Status: Running (application id: Executing on YARN cluster with App id
> application_1410457588978_0063)
> Map 1: -/- Reducer 2: 0/1
> Map 1: 0/1 Reducer 2: 0/1
> Map 1: 0(+1)/1 Reducer 2: 0/1
> Map 1: 1/1 Reducer 2: 0(+1)/1
> Map 1: 1/1 Reducer 2: 0/1
> Map 1: 1/1 Reducer 2: 1/1
> Status: Finished successfully
> Loading data to table default.abc1 partition (b=null)
> Loading partition {b=__HIVE_DEFAULT_PARTITION__}
> Partition default.abc1{b=__HIVE_DEFAULT_PARTITION__} stats: [numFiles=1,
> numRows=2, totalSize=7, rawDataSize=5]
> OK
> Time taken: 7.49 seconds
> {noformat}
> 3. Now run the analyze statistics command for columns:
> {noformat}
> hive> analyze table abc1 partition (b) compute statistics for columns;
> Query ID = hrt_qa_20140911211010_440bdb4a-6a0d-496b-9d2e-5fc84db3d0ee
> Total jobs = 1
> Launching Job 1 out of 1
> Status: Running (application id: Executing on YARN cluster with App id
> application_1410457588978_0063)
> Map 1: 0(+1)/1 Reducer 2: 0/1
> Map 1: 1/1 Reducer 2: 0(+1)/1
> Map 1: 1/1 Reducer 2: 1/1
> Status: Finished successfully
> FAILED: Execution Error, return code 1 from
> org.apache.hadoop.hive.ql.exec.ColumnStatsTask
> {noformat}
> The analyze statistics for columns fails.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)