[ https://issues.apache.org/jira/browse/HIVE-1898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13449978#comment-13449978 ]
Brian Bloniarz commented on HIVE-1898:
--------------------------------------
I think Luke is right -- maybe the bug title should be changed to simply say
"data with newlines won't work in Text/LazySimpleSerDe tables"?
I haven't tested it, but would STORED AS SEQUENCEFILE tables be immune to this
problem?
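
To make the failure mode concrete, here is a minimal sketch of the behavior
described in the issue: records are cut at every newline first, and escape
handling only happens afterwards, per record. This is a simplified model, not
Hive or Hadoop code; the names split_fields and read_rows are made up for
illustration.

```python
# Simplified model only: this is not Hive or Hadoop code.
def split_fields(line, delim=",", escape="\\"):
    """Split one record into fields, honoring the escape character."""
    fields, current, i = [], [], 0
    while i < len(line):
        ch = line[i]
        if ch == escape and i + 1 < len(line):
            current.append(line[i + 1])  # keep the escaped character literally
            i += 2
        elif ch == delim:
            fields.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    fields.append("".join(current))
    return fields


def read_rows(data):
    """Model of a text table: cut records at every newline, then split fields."""
    return [split_fields(line) for line in data.split("\n")]


# One logical row: the second column holds an escaped comma and an escaped newline.
raw = "1,hello\\, world\\\nsecond line,end"
rows = read_rows(raw)
print(rows)
```

In this model the escaped comma survives, but the escaped newline still ends
the record, so the single logical row comes back as two rows. The escape
character never gets a chance to protect the newline because the record
boundary is decided before any field is parsed.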
> The ESCAPED BY clause does not seem to pick up newlines in columns and the
> line terminator cannot be changed
> -----------------------------------------------------------------------------------------------------------
>
> Key: HIVE-1898
> URL: https://issues.apache.org/jira/browse/HIVE-1898
> Project: Hive
> Issue Type: Bug
> Components: Serializers/Deserializers
> Affects Versions: 0.5.0
> Reporter: Josh Patterson
> Priority: Minor
>
> If I want to preserve data in columns that contain a newline (web crawling,
> for instance), I cannot use the ESCAPED BY clause to escape them (other
> characters, such as commas, escape fine, however). This may be because the
> line terminators, which are locked to newlines, are picked up first, and
> only then are the fields processed.
> This seems to be related to:
> "SerDe should escape some special characters"
> https://issues.apache.org/jira/browse/HIVE-136
> and
> "Implement "LINES TERMINATED BY""
> https://issues.apache.org/jira/browse/HIVE-302
> where at comment:
> https://issues.apache.org/jira/browse/HIVE-302?focusedCommentId=12793435&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12793435
> "This is not fixable currently because the line terminator is determined by
> LineRecordReader.LineReader which is in the Hadoop land."