[ https://issues.apache.org/jira/browse/HIVE-12541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
chenliang reassigned HIVE-12541:
--------------------------------

    Assignee: chenliang  (was: Xiaowei Wang)

> SymlinkTextInputFormat should support paths with regex
> ------------------------------------------------------
>
>                 Key: HIVE-12541
>                 URL: https://issues.apache.org/jira/browse/HIVE-12541
>             Project: Hive
>          Issue Type: Improvement
>    Affects Versions: 0.14.0, 1.2.0, 1.2.1
>            Reporter: Xiaowei Wang
>            Assignee: chenliang
>            Priority: Major
>             Fix For: 2.1.0
>
>         Attachments: HIVE-12541.1.patch, HIVE-12541.2.patch,
>                      HIVE-12541.3.patch, HIVE-12541.4.patch
>
>
> 1. In fact, SymlinkTextInputFormat already supports paths with a regex. I added some
> test SQL for this.
> 2. However, when CombineHiveInputFormat is used to combine input files, it cannot
> resolve a path with a regex, so the query returns a wrong result. I give an example
> below and fix the problem.
> Table description:
> {noformat}
> CREATE EXTERNAL TABLE `symlink_text_input_format`(
>   `key` string,
>   `value` string)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.SymlinkTextInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> LOCATION
>   'viewfs://nsX/user/hive/warehouse/symlink_text_input_format'
> {noformat}
> There is a link file in the directory
> '/user/hive/warehouse/symlink_text_input_format'. The content of the link
> file is:
> {noformat}
> viewfs://nsx/tmp/symlink*
> {noformat}
> It contains a single path, and that path contains a regex.
> Execute the SQL:
> {noformat}
> set hive.rework.mapredwork = true ;
> set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
> set mapred.min.split.size.per.rack= 0 ;
> set mapred.min.split.size.per.node= 0 ;
> set mapred.max.split.size= 0 ;
> select count(*) from symlink_text_input_format ;
> {noformat}
> It returns a wrong result: 0



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
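
The failure mode described in the issue can be illustrated with a small, self-contained sketch. This is not Hive's code and not the attached patch; it only models the difference between treating a symlink-file entry as a literal path and expanding it as a wildcard pattern. The function names and the file names under `viewfs://nsx/tmp/` are hypothetical, and Python's `fnmatch` is used as a stand-in for Hadoop's path globbing, whose matching rules differ in detail (for example, `fnmatch`'s `*` also matches `/`).

```python
from fnmatch import fnmatchcase

# Hypothetical directory listing; these file names are assumptions for illustration.
EXAMPLE_PATHS = [
    "viewfs://nsx/tmp/symlink1.txt",
    "viewfs://nsx/tmp/symlink2.txt",
    "viewfs://nsx/tmp/other.txt",
]


def literal_lookup(entries, existing_paths):
    """Treat every symlink-file entry as an exact path (no pattern expansion)."""
    wanted = {entry.strip() for entry in entries if entry.strip()}
    return sorted(p for p in existing_paths if p in wanted)


def expand_symlink_entries(entries, existing_paths):
    """Expand every symlink-file entry as a wildcard pattern against known paths."""
    resolved = []
    for entry in entries:
        entry = entry.strip()
        if not entry:
            continue  # skip blank lines in the link file
        # An entry without wildcard characters only matches itself.
        resolved.extend(sorted(p for p in existing_paths if fnmatchcase(p, entry)))
    return resolved
```

With the link-file content from the report, `literal_lookup(["viewfs://nsx/tmp/symlink*"], EXAMPLE_PATHS)` finds no input files, which is consistent with a row count of 0, while `expand_symlink_entries` resolves the two hypothetical `symlink*` files.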