[ https://issues.apache.org/jira/browse/HIVE-9201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14264965#comment-14264965 ]
Yongzhi Chen commented on HIVE-9201:
------------------------------------

[~ashutoshgupt...@gmail.com], are you suggesting that we start implementing "LINES TERMINATED BY" for Hive? That is treated as not fixable in https://issues.apache.org/jira/browse/HIVE-302

In the current Hive code, it seems we simply raise an error for any line terminator other than \n, and many places assume that \n is the only line terminator:

    case HiveParser.TOK_TABLEROWFORMATLINES:
      String lineDelim = unescapeSQLString(rowChild.getChild(0).getText());
      tblDesc.getProperties().setProperty(serdeConstants.LINE_DELIM, lineDelim);
      if (!lineDelim.equals("\n") && !lineDelim.equals("10")) {
        throw new SemanticException(generateErrorMessage(rowChild,
            ErrorMsg.LINES_TERMINATED_BY_NON_NEWLINE.getMsg()));
      }
      break;

But with MAPREDUCE-2602 fixed, it is possible for Hive to support changing the line terminator. I just suspect it may not be an easy change.

Thanks.

> Lazy functions do not handle newlines and carriage returns properly
> -------------------------------------------------------------------
>
>                 Key: HIVE-9201
>                 URL: https://issues.apache.org/jira/browse/HIVE-9201
>             Project: Hive
>          Issue Type: Bug
>   Affects Versions: 0.14.0, 0.13.1
>            Reporter: Yongzhi Chen
>            Assignee: Yongzhi Chen
>         Attachments: HIVE-9201.1.patch
>
>
> Hive returns a wrong result when a returned string contains \r or \n.
> This happens when the query triggers MapReduce jobs.
> For example, take a table named strsim with only one row.
> As shown below, query 1 returns 1 row while query 2 returns 3 rows.
> Query 1:
> select "abc", narray from strsim LATERAL VIEW explode(array(1)) C AS narray;
> Query 2:
> select "a\rb\nc", narray from strsim LATERAL VIEW explode(array(1)) C AS narray;
>
> select "abc", narray from strsim LATERAL VIEW explode(array(1)) C AS narray;
> INFO  : Number of reduce tasks is set to 0 since there's no reduce operator
> INFO  : Job running in-process (local Hadoop)
> INFO  : 2014-12-23 15:00:08,958 Stage-1 map = 0%,  reduce = 0%
> INFO  : Ended Job = job_local1178499218_0015
> +------+---------+--+
> 1 row selected (1.283 seconds)
> | _c0  | narray  |
> +------+---------+--+
> | abc  | 1       |
> +------+---------+--+
>
> select "a\rb\nc", narray from strsim LATERAL VIEW explode(array(1)) C AS narray;
> INFO  : Number of reduce tasks is set to 0 since there's no reduce operator
> INFO  : Job running in-process (local Hadoop)
> INFO  : 2014-12-23 15:04:35,441 Stage-1 map = 0%,  reduce = 0%
> INFO  : Ended Job = job_local1816711099_0016
> +------+---------+--+
> 3 rows selected (1.135 seconds)
> | _c0  | narray  |
> +------+---------+--+
> | a    | NULL    |
> | b    | NULL    |
> | c    | 1       |
> +------+---------+--+

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
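
For readers following the thread, here is a minimal Java sketch of how the three-row result in the description could arise. It is not Hive code. It assumes the single result row is written as delimited text and read back by a reader that treats \r, \n, and \r\n as record terminators. The class name LineSplitDemo, the helpers splitRecords and parseRow, and the choice of \u0001 as the field delimiter are all hypothetical and used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: not taken from the Hive or Hadoop code base.
class LineSplitDemo {

    // Split text into records, treating \r, \n, and \r\n as terminators (assumed reader behavior).
    static List<String> splitRecords(String data) {
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < data.length(); i++) {
            char ch = data.charAt(i);
            if (ch == '\r' || ch == '\n') {
                records.add(current.toString());
                current.setLength(0);
                // Consume the \n of a \r\n pair so it counts as one terminator.
                if (ch == '\r' && i + 1 < data.length() && data.charAt(i + 1) == '\n') {
                    i++;
                }
            } else {
                current.append(ch);
            }
        }
        if (current.length() > 0) {
            records.add(current.toString());
        }
        return records;
    }

    // Parse a two-column record; a missing second column becomes null.
    static String[] parseRow(String record, char fieldDelim) {
        int idx = record.indexOf(fieldDelim);
        if (idx < 0) {
            return new String[] { record, null };
        }
        return new String[] { record.substring(0, idx), record.substring(idx + 1) };
    }

    public static void main(String[] args) {
        char delim = '\u0001'; // assumed field delimiter
        // One logical row: the string "a\rb\nc" followed by the integer 1.
        String serialized = "a\rb\nc" + delim + "1" + "\n";
        for (String record : splitRecords(serialized)) {
            String[] cols = parseRow(record, delim);
            System.out.println(cols[0] + " | " + (cols[1] == null ? "NULL" : cols[1]));
        }
    }
}
```

Under these assumptions the one logical row is read back as three records, and the first two have no second column, which matches the a/NULL, b/NULL, c/1 pattern reported for query 2.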