We are trying to load bz2-compressed CSV text files whose fields contain 
newlines, using EXTERNAL tables and INSERT/SELECT into ORC-format tables.  
Data volume is ~1 TB/day, so we would really like to avoid unpacking the 
files to condition the data.

A few days of research has us ready to implement custom input/output formats 
to handle the ingest.  Are there any other suggestions that would take less 
effort and have little impact on load times?
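
For reference, the record-boundary logic such a custom input format would 
need can be sketched roughly as below.  This is only an illustration in 
Python, not Hive/Hadoop code.  It assumes that fields containing newlines are 
wrapped in double quotes with embedded quotes doubled (RFC 4180 style), which 
may not match the actual files, and the function names are hypothetical.

```python
import bz2
import io


def iter_logical_records(lines):
    """Yield complete CSV records from an iterable of physical lines.

    Assumption: a newline inside a field only occurs between double quotes,
    and a literal quote inside a field is written as two quotes ("").
    """
    buffer = []
    in_quotes = False
    for line in lines:
        # An odd number of quote characters on a line flips the quoted state;
        # doubled (escaped) quotes come in pairs and leave it unchanged.
        if line.count('"') % 2 == 1:
            in_quotes = not in_quotes
        buffer.append(line)
        if not in_quotes:
            yield "".join(buffer)
            buffer = []
    if buffer:
        # Unterminated quoted field at end of input: emit what was collected.
        yield "".join(buffer)


def read_bz2_records(path):
    """Stream logical records from a bz2 file without unpacking it to disk."""
    # newline="" keeps the original line endings inside quoted fields.
    with bz2.open(path, "rt", newline="") as handle:
        yield from iter_logical_records(handle)


# Small in-memory demonstration (hypothetical sample data).
sample = 'a,"line one\nline two",c\nd,e,f\n'
for record in iter_logical_records(io.StringIO(sample)):
    print(repr(record))
```

Hadoop can split bz2 files across tasks, so a real record reader would also 
have to cope with a split that begins inside a quoted field, which this 
sketch does not attempt.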

Thanks,
Bryan G.
