Sergey Shelukhin created HIVE-10722:
---------------------------------------

             Summary: external table creation in Hive can create unusable partition
                 Key: HIVE-10722
                 URL: https://issues.apache.org/jira/browse/HIVE-10722
             Project: Hive
          Issue Type: Bug
            Reporter: Sergey Shelukhin
            Assignee: Sergey Shelukhin
            Priority: Critical


HDFS can contain directories whose names include unprintable characters. These 
characters are not visible in the output of hadoop fs -ls and can only be seen 
if, for example, the output is piped through od.
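
One way to surface such names without resorting to od is to scan a listing for non-printable characters and print an escaped form. The following is a minimal sketch only: the function name and the sample directory names are hypothetical, and it works on a plain list of strings rather than on a real HDFS listing.

```python
def find_unprintable(names):
    """Return (index, escaped_name) pairs for names with non-printable characters."""
    flagged = []
    for i, name in enumerate(names):
        # str.isprintable() is False for control characters such as \x01.
        if any(not ch.isprintable() for ch in name):
            # repr() shows the hidden character as an escape sequence.
            flagged.append((i, repr(name)))
    return flagged


# Hypothetical directory names, standing in for the output of a listing.
listing = ["ds=2015-05-01", "ds=2015-05-\x01", "ds=ok"]
flagged = find_unprintable(listing)
for index, shown in flagged:
    print(index, shown)
```
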
When such directories are loaded as partitions, the unprintable characters are 
stored in the metastore database (e.g. MySQL) as "?" (a literal question mark, 
findable via LIKE '%?%' in the db), and Hive displays them accordingly.
However, DataNucleus appears to encode the character as %3F, which makes the 
partition unusable: it cannot be dropped, and other operations such as drop 
table get stuck (I didn't investigate in detail why; drop table got unstuck as 
soon as the partition was removed from the metastore).
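
The mismatch between a stored "?" and an encoded %3F can be illustrated with ordinary percent-encoding. This sketch uses Python's urllib.parse.quote purely as an illustration; it is not the DataNucleus code path, and the partition name is a hypothetical example.

```python
from urllib.parse import quote, unquote

# Hypothetical partition name as stored in the metastore db.
stored = "ds=a?b"

# Percent-encode everything except '=' and the always-safe characters.
encoded = quote(stored, safe="=")
print(encoded)

# A lookup that compares the encoded form against the stored form will miss.
print(encoded == stored)

# Decoding recovers the stored name.
print(unquote(encoded) == stored)
```

If one layer looks up the partition by the encoded name while the database holds the literal "?", the two never match, which would be consistent with the partition being impossible to drop.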

We should probably have a 2-way option for such cases: either error out on 
load (the default), or convert such characters to '?' (which should actually 
work) or drop them.
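
The proposed behavior could look roughly like the sketch below. It is an illustration under assumptions, not Hive code: the function name, the policy names "error", "replace" and "drop", and the use of str.isprintable() to decide what counts as unprintable are all hypothetical.

```python
def sanitize_partition_name(name, policy="error"):
    """Apply a policy to a partition name that may hold unprintable characters."""
    if policy not in ("error", "replace", "drop"):
        raise ValueError(f"unknown policy: {policy!r}")

    # Names without unprintable characters pass through unchanged.
    if all(ch.isprintable() for ch in name):
        return name

    if policy == "error":
        # Default: refuse to load the partition at all.
        raise ValueError(f"unprintable characters in partition name: {name!r}")
    if policy == "replace":
        # Convert each unprintable character to a literal '?'.
        return "".join(ch if ch.isprintable() else "?" for ch in name)
    # policy == "drop": remove the unprintable characters.
    return "".join(ch for ch in name if ch.isprintable())


print(sanitize_partition_name("ds=a\x01b", policy="replace"))
print(sanitize_partition_name("ds=a\x01b", policy="drop"))
```

Note that "replace" and "drop" can both map two distinct directory names to the same partition name, so a real implementation would also need to handle collisions.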

We should also check whether partitions with an explicitly inserted '?' work 
at all with DataNucleus.


