[ https://issues.apache.org/jira/browse/HIVE-1898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13449978#comment-13449978 ]
Brian Bloniarz commented on HIVE-1898:
--------------------------------------

I think Luke is right -- maybe the bug title should be changed to simply say "data with newlines won't work in Text/LazySimpleSerDe tables"?

I haven't tested it, but would STORED AS SEQUENCEFILE tables be immune to this problem?

> The ESCAPED BY clause does not seem to pick up newlines in columns and the line terminator cannot be changed
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-1898
>                 URL: https://issues.apache.org/jira/browse/HIVE-1898
>             Project: Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>   Affects Versions: 0.5.0
>            Reporter: Josh Patterson
>            Priority: Minor
>
> If I want to preserve data in columns that contain a newline (webcrawling
> for instance), I cannot set the ESCAPED BY clause to escape these out (other
> characters such as commas escape fine, however). This may be because the line
> terminators, which are locked to be newlines, are picked up first, and only
> then are the fields processed.
> This seems to be related to:
> "SerDe should escape some special characters"
> https://issues.apache.org/jira/browse/HIVE-136
> and
> "Implement "LINES TERMINATED BY""
> https://issues.apache.org/jira/browse/HIVE-302
> where at comment:
> https://issues.apache.org/jira/browse/HIVE-302?focusedCommentId=12793435&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12793435
> "This is not fixable currently because the line terminator is determined by
> LineRecordReader.LineReader which is in the Hadoop land."
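
For illustration only, here is a toy Python model of the ordering the issue describes: records are cut on raw newlines first, and escape handling happens only afterwards, per record. This is not Hive's or Hadoop's actual code. The function names (split_records, split_fields), the comma delimiter, and the backslash escape character are all assumptions made for the sketch.

```python
def split_records(data: str) -> list[str]:
    """Stage 1 (hypothetical): cut the input on raw newlines, with no escape awareness."""
    records = data.split("\n")
    # Drop the empty string left behind by a trailing newline.
    if records and records[-1] == "":
        records.pop()
    return records


def split_fields(record: str, delim: str = ",", escape: str = "\\") -> list[str]:
    """Stage 2 (hypothetical): split one record into fields, honouring the escape character."""
    fields: list[str] = []
    current: list[str] = []
    i = 0
    while i < len(record):
        ch = record[i]
        if ch == escape and i + 1 < len(record):
            # Escaped character: keep it literally and skip over the pair.
            current.append(record[i + 1])
            i += 2
            continue
        if ch == delim:
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    fields.append("".join(current))
    return fields


# Two logical rows; the second contains an escaped newline inside a column.
raw = "1,hello\\, world\n2,line one\\\nline two\n"

rows = [split_fields(r) for r in split_records(raw)]
for row in rows:
    print(row)
```

In this model the escaped comma in the first row survives, but the escaped newline in the second row has already been consumed as a record boundary before field parsing runs, so two logical rows come out as three.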