[ https://issues.apache.org/jira/browse/HIVE-14511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421763#comment-15421763 ]
Subramanyam Pattipaka edited comment on HIVE-14511 at 8/15/16 10:07 PM:
------------------------------------------------------------------------

[~sershe],

Even if we introduce another, more flexible command to cater to this scenario, what if the directory structure of the user's data has changed? Why should the user have to recreate all tables again? Why not make repair table flexible as well (with this patch), so that the configs mapred.input.dir.recursive and hive.mapred.supports.subdirectories are supported and the relevant partitions are added? Further, having two commands may be confusing.

I don't mean that a file such as a=1/000000_0 should be added here. I mean only that such files should be ignored and listed in the error log, if a config is enabled, so that users can act on them. An error is better than a debug message. This way, all configurations would give these details.

For example, suppose we have the following files

tbldir/a=1/file1.txt
tbldir/a=2/b=1/file2.txt
tbldir/a=2/b=1/c=1/file3.txt

and we are trying to create a partitioned table with partitions on a and b and root directory tbldir. Here the ERROR log would say "ignoring file tbldir/a=1/file1.txt due to incorrect structure" if the ignore config is set. Otherwise, the operation fails. We add only one partition, with values (2, 1).

msck is still restrictive; the ask here is only to support the configs mapred.input.dir.recursive and hive.mapred.supports.subdirectories.

was (Author: pattipaka):
[~sershe],

Even if we introduce another, more flexible command to cater to this scenario, what if the directory structure of the user's data has changed? Why should the user have to recreate all tables again? Why not make repair table flexible as well (with this patch), so that the configs mapred.input.dir.recursive and hive.mapred.supports.subdirectories are supported and the relevant partitions are added? Further, having two commands may be confusing.

I don't mean that a file such as a=1/000000_0 should be added here. I mean only that such files should be ignored and listed in the error log, if a config is enabled, so that users can act on them.
An error is better than a debug message. This way, all configurations would give these details.

For example, suppose we have the following files

tbldir/a=1/file1.txt
tbldir/a=2/b=1/file2.txt

and we are trying to create a partitioned table with partitions on a and b and root directory tbldir. Here the ERROR log would say "ignoring file tbldir/a=1/file1.txt due to incorrect structure" if the ignore config is set. Otherwise, the operation fails. We add only one partition, with values (2, 1).

msck is still restrictive; the ask here is only to support the configs mapred.input.dir.recursive and hive.mapred.supports.subdirectories.

> Improve MSCK for partitioned table to deal with special cases
> -------------------------------------------------------------
>
>                 Key: HIVE-14511
>                 URL: https://issues.apache.org/jira/browse/HIVE-14511
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Pengcheng Xiong
>            Assignee: Pengcheng Xiong
>         Attachments: HIVE-14511.01.patch
>
>
> Some users will have a folder rather than a file under the last partition
> folder. However, msck is going to search for the leaf folder rather than the
> last partition folder. We need to improve that.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
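
The behaviour proposed in the edited comment above (three files under tbldir, partitions on a and b) can be sketched roughly as follows. This is only an illustration in Python, not Hive's msck code: the function name discover_partitions and the ignore_invalid flag are hypothetical, and ignore_invalid merely stands in for the unnamed "ignore" config mentioned in the comment. Directories below the last partition column are assumed to be accepted as plain subdirectories, which is what supporting mapred.input.dir.recursive and hive.mapred.supports.subdirectories would imply.

```python
import logging
from pathlib import PurePosixPath

log = logging.getLogger("msck_sketch")


def discover_partitions(table_root, partition_cols, file_paths, ignore_invalid=True):
    """Return (partitions, ignored) for the files found under table_root.

    Hypothetical helper for illustration only; not part of Hive.
    """
    root = PurePosixPath(table_root)
    partitions = set()
    ignored = []
    for path in file_paths:
        # Directory components between the table root and the file itself.
        dirs = PurePosixPath(path).relative_to(root).parts[:-1]
        values = []
        for col, part in zip(partition_cols, dirs):
            name, sep, value = part.partition("=")
            if sep != "=" or name != col:
                break
            values.append(value)
        if len(values) == len(partition_cols):
            # Anything deeper than the last partition column is treated as a
            # plain subdirectory of that partition (assumption, see lead-in).
            partitions.add(tuple(values))
        elif ignore_invalid:
            log.error("ignoring file %s due to incorrect structure", path)
            ignored.append(path)
        else:
            raise ValueError(f"incorrect partition structure: {path}")
    return partitions, ignored


files = [
    "tbldir/a=1/file1.txt",
    "tbldir/a=2/b=1/file2.txt",
    "tbldir/a=2/b=1/c=1/file3.txt",
]
partitions, ignored = discover_partitions("tbldir", ["a", "b"], files)
print(partitions)  # {('2', '1')}
print(ignored)     # ['tbldir/a=1/file1.txt']
```

With ignore_invalid=False the same input raises an error instead, matching the "otherwise, the operation fails" case in the comment.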