[ https://issues.apache.org/jira/browse/HIVE-2333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13702630#comment-13702630 ]
Jonathan Chang commented on HIVE-2333:
--------------------------------------

Fair enough. I'd still argue for putting together a deprecation plan for LazySimpleSerDe, though, since IMO the default SerDe should just work.

> LazySimpleSerDe does not properly handle arrays / escape control characters
> ---------------------------------------------------------------------------
>
>                 Key: HIVE-2333
>                 URL: https://issues.apache.org/jira/browse/HIVE-2333
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Jonathan Chang
>            Priority: Critical
>
> LazySimpleSerDe, the default SerDe for Hive, is severely broken:
> * Empty arrays are serialized as an empty string. Hence array(array()) is
> indistinguishable from array(array(array())) and from array().
> * Similarly, empty strings are serialized as an empty string. Hence array('')
> is also indistinguishable from an empty array.
> * If the serialized string equals the null sequence, then it is ambiguous as
> to whether it is an array with a single null element or a null array.
> It also does not do well with control characters:
> > select array('foo\002bar') from tmp;
> ...
> ["foo","bar"]
> > select array('foo\001bar') from tmp;
> ...
> ["foo"]
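For anyone who wants to see the collisions without a cluster, here is a rough Python sketch of a delimiter-joined text format with no escaping. It is a toy model and not Hive's code: the function names (`serialize`, `read_first_column`, `deserialize_string_array`) are made up for illustration, and the separator values (\001 between fields, \002 and up between collection items by nesting depth, \N for null) are assumed to mirror the usual defaults.

```python
# Toy model of a delimited text format without escaping.
# NOT Hive's implementation; names and separator values are assumptions.
FIELD_SEP = "\x01"                      # assumed separator between columns
ITEM_SEPS = ["\x02", "\x03", "\x04"]    # assumed item separators per nesting depth
NULL_SEQUENCE = "\\N"                   # assumed null marker (backslash + N)


def serialize(value, depth=0):
    """Join list elements with the separator for this depth; no escaping."""
    if value is None:
        return NULL_SEQUENCE
    if isinstance(value, list):
        return ITEM_SEPS[depth].join(serialize(v, depth + 1) for v in value)
    return str(value)


def read_first_column(row):
    """Split the row on the field separator and keep the first column."""
    return row.split(FIELD_SEP)[0]


def deserialize_string_array(text):
    """Read a column back as array<string>, guessing in the ambiguous cases."""
    if text == NULL_SEQUENCE:
        return None     # could equally have been [None]
    if text == "":
        return []       # could equally have been [''] or [[]]
    return [None if item == NULL_SEQUENCE else item
            for item in text.split(ITEM_SEPS[0])]


# Empty arrays and empty strings all collapse to the empty string.
collapsed = {serialize([]), serialize([[]]), serialize([[[]]]), serialize([""])}

# A null array and an array holding a single null serialize identically.
null_clash = serialize(None) == serialize([None])

# An embedded \002 splits one element into two.
split_item = deserialize_string_array(read_first_column(serialize(["foo\x02bar"])))

# An embedded \001 ends the column early, so the tail is lost.
truncated = deserialize_string_array(read_first_column(serialize(["foo\x01bar"])))
```

In this model `collapsed` ends up holding a single value, and `split_item` and `truncated` line up with the two query results quoted in the issue. Any fix has to either escape the separators and the null sequence inside values or add an explicit marker for empty collections; the sketch only shows why plain joining cannot round-trip.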