I wrote some custom Python parsing scripts using StingRay Reader ( 
http://stingrayreader.sourceforge.net/cobol.html ) that read in the copybooks 
and automatically generate the Hive table schema from the source copybook. 
The EBCDIC data is then extracted to tab-separated ASCII values for loading 
into Hive.
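
As a rough illustration of that flow (not the actual scripts, and not the StingRay API), here is a minimal sketch using only the standard library. It assumes the copybook has already been parsed into a simple list of fields; the FIELDS layout, the function names, and the cp037 code page are all assumptions made for the example, and only plain display (text) fields are handled, not packed or binary ones.

```python
# Hypothetical, already-parsed copybook layout:
# (column name, byte offset, byte length, Hive type).
FIELDS = [
    ("cust_id", 0, 6, "STRING"),
    ("cust_name", 6, 10, "STRING"),
    ("region", 16, 2, "STRING"),
]


def hive_ddl(table, fields):
    """Build a CREATE TABLE statement for a tab-delimited text table."""
    cols = ",\n".join(f"  `{name}` {hive_type}" for name, _, _, hive_type in fields)
    return (
        f"CREATE TABLE `{table}` (\n{cols}\n)\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t'\n"
        "STORED AS TEXTFILE"
    )


def record_to_tsv(record, fields, encoding="cp037"):
    """Decode one fixed-length EBCDIC record into a tab-separated line.

    cp037 is one common EBCDIC code page; the real data may use another.
    """
    values = []
    for _, offset, length, _ in fields:
        text = record[offset:offset + length].decode(encoding).strip()
        # Raw tabs or newlines inside a value would break the delimited format.
        values.append(text.replace("\t", " ").replace("\n", " "))
    return "\t".join(values)


if __name__ == "__main__":
    sample = "000042JOHN SMITHNY".encode("cp037")
    print(hive_ddl("customers", FIELDS))
    print(record_to_tsv(sample, FIELDS))
```
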
Some tables had very sparse column values, so in those cases I bundled 
the sparse data into a catch-all JSON field in the Hive table.
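
The catch-all idea can be sketched roughly like this. The column names, the bundle_sparse helper, and the choice to drop empty values are assumptions for the example rather than a description of the real scripts.

```python
import json


def bundle_sparse(row, dense_columns):
    """Return one tab-separated line: the dense columns, then a JSON catch-all.

    `row` is a dict of already-decoded field values. Every key that is not
    listed in `dense_columns` goes into the JSON object, but only if it
    actually has a value, so mostly-empty columns cost nothing per row.
    """
    dense = [str(row.get(col) or "") for col in dense_columns]
    sparse = {
        key: value
        for key, value in row.items()
        if key not in dense_columns and value not in ("", None)
    }
    # json.dumps escapes control characters, so the JSON text never contains
    # a raw tab that could be mistaken for a field delimiter.
    return "\t".join(dense + [json.dumps(sparse, sort_keys=True)])


if __name__ == "__main__":
    row = {"id": "1", "name": "A", "fax": "", "alt_phone": "555"}
    print(bundle_sparse(row, ["id", "name"]))
```

On the Hive side, the catch-all column can be declared as a plain STRING and individual keys pulled out at query time with the built-in get_json_object function.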

The parser handles both fixed-length records and variable-length VB-type 
records.
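
For anyone unfamiliar with VB files, here is a minimal standard-library sketch of how such records can be split. It assumes the file was transferred in binary with the 4-byte record descriptor words (RDWs) intact and without block descriptor words; the function name is made up for the example, and this is not how StingRay itself is called.

```python
import io
import struct


def iter_vb_records(stream):
    """Yield the payload of each variable-length record in a binary stream.

    Each record starts with a 4-byte RDW: a 2-byte big-endian length that
    includes the RDW itself, followed by 2 reserved bytes.
    """
    while True:
        rdw = stream.read(4)
        if not rdw:
            return  # clean end of file
        if len(rdw) < 4:
            raise ValueError("truncated record descriptor word")
        length, _reserved = struct.unpack(">HH", rdw)
        if length < 4:
            raise ValueError(f"invalid record length {length}")
        payload = stream.read(length - 4)
        if len(payload) != length - 4:
            raise ValueError("truncated record payload")
        yield payload


if __name__ == "__main__":
    # Two records with 3-byte and 1-byte payloads.
    data = struct.pack(">HH", 7, 0) + b"abc" + struct.pack(">HH", 5, 0) + b"z"
    for record in iter_vb_records(io.BytesIO(data)):
        print(len(record), record)
```

Each payload would still need to be decoded from EBCDIC and mapped to the right record type afterwards.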

Let me know if you have any questions regarding StingRay.

From: Nishanth S [mailto:nishanth.2...@gmail.com]
Sent: Friday, June 02, 2017 10:07 AM
To: user@hive.apache.org
Subject: Migrating Variable Length Files to Hive

Hello Hive users,

We are looking at migrating files (less than 5 MB of data in total) with 
variable record lengths from a mainframe system to Hive. You could think of 
this as metadata. Each of these records can have from 3 to n columns (meaning 
each record type has a different number of columns), depending on the record 
type. What would be the best strategy to migrate this to Hive? I was thinking 
of converting these files into one variable-length CSV file and then importing 
it into a Hive table. The Hive table will consist of 4 columns, with the 4th 
column holding a comma-separated list of the values from columns 4 to n. Are 
there alternative or better approaches for this? I would appreciate any 
feedback on this.

Thanks,
Nishanth
