Try the CSV SerDe. It should correctly parse quoted field values that contain
newlines:
https://cwiki.apache.org/confluence/display/Hive/CSV+Serde

Hadoop should read the bz2 files automatically.
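
A rough sketch of what this could look like, not a tested setup. The table
names, columns, and HDFS path are hypothetical placeholders. The SerDe class
and its three properties are the ones described on the wiki page above.

```sql
-- Hypothetical names and path; adjust to the real data.
-- OpenCSVSerde exposes every column as STRING, so cast later if needed.
CREATE EXTERNAL TABLE raw_events_csv (
  event_id   STRING,
  event_time STRING,
  message    STRING   -- field that may contain quoted newlines
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
  "separatorChar" = ",",
  "quoteChar"     = "\"",
  "escapeChar"    = "\\"
)
STORED AS TEXTFILE
LOCATION '/data/incoming/events';   -- directory holding the .bz2 files

-- Target table in ORC format.
CREATE TABLE events_orc (
  event_id   STRING,
  event_time STRING,
  message    STRING
)
STORED AS ORC;

-- Load from the external CSV table into the ORC table.
INSERT INTO TABLE events_orc
SELECT event_id, event_time, message
FROM raw_events_csv;
```

It is worth running this against a small sample file that contains an embedded
newline before relying on it at ~1TB/day.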


On Tue, Jan 12, 2016 at 9:40 AM, Gerber, Bryan W <bryan.ger...@pnnl.gov>
wrote:

> We are attempting to load CSV text files (compressed to bz2) containing
> newlines in fields using EXTERNAL tables and INSERT/SELECT into ORC format
> tables.  Data volume is ~1TB/day, and we are really trying to avoid unpacking
> them to condition the data.
>
>
>
> A few days of research has us ready to implement custom input/output
> formats to handle the ingest.  Any other suggestions that may be less
> effort with low impact to load times?
>
>
>
> Thanks,
>
> Bryan G.
>
