Incorrect regular expression for extracting task id from filename
-----------------------------------------------------------------

Key: HIVE-2309
URL: https://issues.apache.org/jira/browse/HIVE-2309
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.7.1
Reporter: Paul Yang
Priority: Minor

To produce the correct filenames for bucketed tables, there is a method in Utilities.java that extracts the task id from the filename and replaces it with the bucket number. The regex used to extract this value has a bug: it returns the wrong value for attempt numbers >= 10:

{code}
>>> re.match("^.*?([0-9]+)(_[0-9])?(\\..*)?$",
...          'attempt_201107090429_64965_m_001210_10').group(1)
'10'
>>> re.match("^.*?([0-9]+)(_[0-9])?(\\..*)?$",
...          'attempt_201107090429_64965_m_001210_9').group(1)
'001210'
{code}
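
The failure comes from the optional attempt suffix `(_[0-9])?`, which accepts only a single digit. With a two-digit attempt number the suffix cannot match, so the lazy prefix keeps expanding until the attempt number itself is captured as the task id. The sketch below reproduces this and compares it with one possible adjustment that allows a multi-digit suffix. The `CANDIDATE` pattern and the `task_id` helper are assumptions made for illustration; they are not necessarily the change committed for this issue.

```python
import re

# Pattern quoted in the report: the optional attempt suffix allows exactly one digit.
ORIGINAL = re.compile(r"^.*?([0-9]+)(_[0-9])?(\..*)?$")

# Hypothetical adjustment (an assumption, not the committed patch):
# allow one or more digits in the attempt suffix.
CANDIDATE = re.compile(r"^.*?([0-9]+)(_[0-9]+)?(\..*)?$")


def task_id(pattern, filename):
    """Return the first capture group (the task id), or None if there is no match."""
    match = pattern.match(filename)
    return match.group(1) if match else None


if __name__ == "__main__":
    names = [
        "attempt_201107090429_64965_m_001210_9",
        "attempt_201107090429_64965_m_001210_10",
        "attempt_201107090429_64965_m_001210_10.gz",
    ]
    for name in names:
        print(name, task_id(ORIGINAL, name), task_id(CANDIDATE, name))
```

One trade-off of the unbounded suffix: a name that ends in two numeric fields and has no attempt suffix would have its last field treated as the attempt number, so bounding the digit count may be preferable if such names can occur.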