Re: READING STRING, CONTAINS \R\N, FROM ORC FILES VIA JDBC DRIVER PRODUCES DIRTY DATA

Owen O'Malley Thu, 02 Nov 2017 08:21:58 -0700

ORC stores the data in UTF-8 with the length of the value stored
explicitly. Therefore, it doesn't do any parsing of newlines.
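
To illustrate the point, here is a minimal sketch of length-prefixed string storage. It is not ORC's actual on-disk layout; the helper names and the 4-byte length prefix are assumptions made for the example. It only shows why a reader that knows each value's byte length never has to treat \r\n as a delimiter:

```python
import struct


def encode_strings(values):
    """Pack each string as a 4-byte big-endian length plus its UTF-8 bytes.

    Hypothetical format for illustration only, not the ORC encoding.
    """
    out = bytearray()
    for value in values:
        data = value.encode("utf-8")
        out += struct.pack(">I", len(data))  # explicit byte length
        out += data                          # raw bytes, newlines included
    return bytes(out)


def decode_strings(buf):
    """Read values back using only the stored lengths, never scanning for newlines."""
    values = []
    pos = 0
    while pos < len(buf):
        (length,) = struct.unpack_from(">I", buf, pos)
        pos += 4
        values.append(buf[pos:pos + length].decode("utf-8"))
        pos += length
    return values


rows = ["line one\r\nline two", "plain", "Залеский"]
decoded = decode_strings(encode_strings(rows))
```

In this sketch the value containing \r\n comes back as a single string, so any row splitting would have to happen in a layer above the storage format.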


You can see the contents of an ORC file by using:

% hive --orcfiledump -d <path_to_file>

(see https://orc.apache.org/docs/hive-ddl.html). How did you load the data
into Hive?

... Owen

On Thu, Nov 2, 2017 at 5:29 AM, Залеский Александр Андреевич <
aazal...@mts.ru> wrote:

> My problem is reading data that contains a "newline" character from ORC via
> JDBC. The standard behavior when reading a string is to split the row at
> every newline symbol, and that seems like a bug. Why can't I store arbitrary
> symbols in my data? Why does JDBC read them as control symbols? I have opened
> an issue with Teradata (
> https://tays.teradata.com/home/?language=en_US&aidIncidentId=RECHDBRVV)
> and they advised me to write my own SerDe. Perhaps this is not a unique
> task and you have already written such a SerDe; may I ask for it?
>
