LazySimpleSerDe does not properly handle arrays / escape control characters
---------------------------------------------------------------------------
                 Key: HIVE-2333
                 URL: https://issues.apache.org/jira/browse/HIVE-2333
             Project: Hive
          Issue Type: Bug
            Reporter: Jonathan Chang

LazySimpleSerDe, the default SerDe for Hive, is severely broken:

* Empty arrays are serialized as an empty string. Hence array(array()) is indistinguishable from array(array(array())) and from array().
* Similarly, empty strings are serialized as an empty string. Hence array('') is also indistinguishable from an empty array.
* If the serialized string equals the null sequence, it is ambiguous whether it represents an array with a single null element or a null array.

It also does not do well with control characters, because string contents are not escaped before the delimiters are applied:

> select array('foo\002bar') from tmp;
...
["foo","bar"]
> select array('foo\001bar') from tmp;
...
["foo"]
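The collisions described above can be reproduced with a small toy model. This is only a sketch, not the actual LazySimpleSerDe code: it assumes the commonly documented defaults (field delimiter \001, collection delimiters \002, \003, ... by nesting depth, and null sequence \N), and the names serialize and first_column_array are hypothetical helpers introduced for illustration.

```python
# Toy model of delimiter-based serialization WITHOUT escaping.
# Assumption: defaults of \001 (field), \002/\003/\004 (collection items
# by nesting depth) and "\N" (null sequence). Not Hive's real code.

FIELD_SEP = "\x01"
COLLECTION_SEPS = ["\x02", "\x03", "\x04"]
NULL_SEQ = "\\N"


def serialize(value, depth=0):
    """Serialize a string, None, or (nested) list into one column value."""
    if value is None:
        return NULL_SEQ
    if isinstance(value, list):
        sep = COLLECTION_SEPS[depth]
        # An empty list joins to "", the same bytes as an empty string.
        return sep.join(serialize(v, depth + 1) for v in value)
    return str(value)  # no escaping of control characters


def first_column_array(row):
    """Read the first column of a row back as a flat array of strings."""
    column = row.split(FIELD_SEP)[0]
    return column.split(COLLECTION_SEPS[0])


# Empty arrays at any nesting depth, and array(''), all collapse to "".
assert serialize([]) == serialize([[]]) == serialize([[[]]]) == ""
assert serialize([""]) == serialize([])

# A null array and an array holding one null produce the same bytes.
assert serialize(None) == serialize([None]) == NULL_SEQ

# Unescaped control characters are read back as delimiters.
assert first_column_array(serialize(["foo\x02bar"])) == ["foo", "bar"]
assert first_column_array(serialize(["foo\x01bar"])) == ["foo"]
```

In this model, every failure comes from the same cause: the serialized form carries no length, type marker, or escaping, so distinct values map to identical byte strings and cannot be told apart on read.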