RE: Migrating Variable Length Files to Hive

2017-06-02 Thread Ryan Harris
Thanks Ryan. In my case I have around 200 small files on the mainframe. Columns are the same within a file but vary in number across files. Now I need to get all this data into a single Hive table. The first three columns

Re: Migrating Variable Length Files to Hive

2017-06-02 Thread Nishanth S
Thanks Ryan. In my case I have around 200 small files on the mainframe. Columns are the same within a file but vary in number across files. Now I need to get all this data into a *single* Hive table. The first three columns are standard across all files. Any idea how the schema would look if I

Re: Migrating Variable Length Files to Hive

2017-06-02 Thread Nishanth S
Thanks Edward. I am leaning towards using an array. My nested data does not have a schema. It is a collection of strings, and the number of strings can vary. On Fri, Jun 2, 2017 at 10:41 AM, Edward Capriolo wrote: > On Fri, Jun 2, 2017 at 12:07 PM, Nishanth S wrote: >> Hello hive users
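To make the array idea concrete, here is a minimal sketch, not something posted in the thread. It assumes the three standard columns come first in every record and that the remaining strings are packed into a single ARRAY<STRING> column. The table name, column names, and the `to_hive_line` helper are hypothetical; the item separator is Ctrl-B, which matches the `'\002'` in the DDL.

```python
# Sketch only: table name, column names and helper name are assumptions.
FIELD_SEP = "\t"     # separates the four Hive columns
ITEM_SEP = "\x02"    # Ctrl-B, separates items inside the array column
N_STANDARD = 3       # the first three columns are common to every file

# Hypothetical DDL that matches the line layout produced below.
HIVE_DDL = r"""
CREATE TABLE mainframe_records (
  col1   STRING,
  col2   STRING,
  col3   STRING,
  extras ARRAY<STRING>
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'
  COLLECTION ITEMS TERMINATED BY '\002'
STORED AS TEXTFILE
"""


def to_hive_line(fields):
    """Turn one variable-width record into one delimited text line."""
    if len(fields) < N_STANDARD:
        raise ValueError("record has fewer than the three standard columns")
    standard = list(fields[:N_STANDARD])
    # Everything after the standard columns goes into the array column.
    extras = ITEM_SEP.join(fields[N_STANDARD:])
    return FIELD_SEP.join(standard + [extras])


if __name__ == "__main__":
    print(repr(to_hive_line(["id1", "2017-06-02", "fileA", "x", "y", "z"])))
```

With this layout every file lands in the same four-column table regardless of how many trailing strings it carries; how an empty trailing field is read back should be checked against the Hive version in use.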

RE: Migrating Variable Length Files to Hive

2017-06-02 Thread Ryan Harris
I wrote some custom Python parsing scripts using StingRay Reader ( http://stingrayreader.sourceforge.net/cobol.html ) that read in the copybooks and use the results to automatically generate a Hive table schema from the source copybook. The EBCDIC data is then extracted to tab-separated ASCII
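The extraction step can be illustrated with a small sketch. This is not Ryan's script and does not use the StingRay Reader API; it relies only on Python's built-in `cp037` codec. It assumes fixed-width text fields encoded in EBCDIC code page 037, with no packed-decimal (COMP-3) or binary fields, and the field widths and the function name are hypothetical stand-ins for what a copybook would supply.

```python
# Sketch only: code page, field widths and function name are assumptions.
def ebcdic_record_to_tsv(record, widths, codepage="cp037"):
    """Decode one fixed-width EBCDIC record into a tab-separated line."""
    fields = []
    offset = 0
    for width in widths:
        chunk = record[offset:offset + width]
        # Decode the EBCDIC bytes to text and drop the padding spaces.
        fields.append(chunk.decode(codepage).strip())
        offset += width
    return "\t".join(fields)


if __name__ == "__main__":
    # Build a sample record by encoding known text to EBCDIC first.
    sample = "AB  12345".encode("cp037")
    print(repr(ebcdic_record_to_tsv(sample, [4, 5])))
```

Real mainframe extracts often mix text with packed or binary numeric fields, which need copybook-driven handling rather than a plain codec decode.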

Re: Migrating Variable Length Files to Hive

2017-06-02 Thread Gopal Vijayaraghavan
> We are looking at migrating files (less than 5 Mb of data in total) with variable record lengths from a mainframe system to hive. https://issues.apache.org/jira/browse/HIVE-10856 + https://github.com/rbheemana/Cobol-to-Hive/ came up on this list a while back. > Are there other alternative

Re: Migrating Variable Length Files to Hive

2017-06-02 Thread Edward Capriolo
On Fri, Jun 2, 2017 at 12:07 PM, Nishanth S wrote: > Hello hive users, > We are looking at migrating files (less than 5 Mb of data in total) with variable record lengths from a mainframe system to hive. You could think of this as metadata. Each of these records can have columns ranging from