Incorrect regular expression for extracting task id from filename
-----------------------------------------------------------------

                 Key: HIVE-2309
                 URL: https://issues.apache.org/jira/browse/HIVE-2309
             Project: Hive
          Issue Type: Bug
          Components: Query Processor
    Affects Versions: 0.7.1
            Reporter: Paul Yang
            Priority: Minor


To produce the correct filenames for bucketed tables, a method in 
Utilities.java extracts the task id from the filename and replaces it with 
the bucket number. The regex used to extract this value is wrong for attempt 
numbers >= 10: the optional attempt suffix group (_[0-9])? matches only a 
single digit, so the attempt number is returned instead of the task id:

{code}
>>> re.match("^.*?([0-9]+)(_[0-9])?(\\..*)?$", 
...     'attempt_201107090429_64965_m_001210_10').group(1)
'10'
>>> re.match("^.*?([0-9]+)(_[0-9])?(\\..*)?$", 
...     'attempt_201107090429_64965_m_001210_9').group(1)
'001210'
{code}
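
One possible direction for a fix, sketched in Python to match the session 
above: let the attempt suffix accept more than one digit. The name CANDIDATE, 
the helper extract_task_id, and the choice of the "+" quantifier are 
assumptions made for illustration; this is not necessarily the change that 
should go into Utilities.java.

```python
import re

# Pattern quoted in the report: the attempt suffix accepts a single digit only.
ORIGINAL = re.compile(r"^.*?([0-9]+)(_[0-9])?(\..*)?$")

# Hypothetical variant (an assumption, not the committed fix): accept one or
# more digits in the attempt suffix.
CANDIDATE = re.compile(r"^.*?([0-9]+)(_[0-9]+)?(\..*)?$")


def extract_task_id(filename, pattern):
    """Return capture group 1 (the presumed task id), or None if no match."""
    match = pattern.match(filename)
    return match.group(1) if match else None


for name in ("attempt_201107090429_64965_m_001210_9",
             "attempt_201107090429_64965_m_001210_10"):
    print(name, extract_task_id(name, ORIGINAL), extract_task_id(name, CANDIDATE))
```

With the candidate pattern both filenames yield '001210'. An unbounded "+" 
can change the result for names whose last two components are both 
multi-digit numbers (for example '64965_001210' would yield '64965'), so a 
bounded quantifier may be the safer choice.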


