Thanks Edward. I am leaning towards using an array. My nested data does not have a schema; it is a collection of strings, and the number of strings can vary.
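For reference, a minimal sketch of the array approach, assuming '|' as the field delimiter and hypothetical table/column names:

CREATE TABLE mainframe_records (
  record_type STRING,
  col2        STRING,
  col3        STRING,
  extra_vals  ARRAY<STRING>   -- holds the variable tail, columns 4..n
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '|'
  COLLECTION ITEMS TERMINATED BY ',';

LOAD DATA LOCAL INPATH '/tmp/records.csv' INTO TABLE mainframe_records;

-- The array absorbs the varying column count; size() and indexing
-- work at query time:
SELECT record_type, size(extra_vals), extra_vals[0]
FROM mainframe_records;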
On Fri, Jun 2, 2017 at 10:41 AM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
>
> On Fri, Jun 2, 2017 at 12:07 PM, Nishanth S <nishanth.2...@gmail.com> wrote:
>
>> Hello hive users,
>>
>> We are looking at migrating files (less than 5 MB of data in total) with
>> variable record lengths from a mainframe system to Hive. You could think
>> of this as metadata. Each of these records can have columns ranging from
>> 3 to n (each record type has a different number of columns) based on
>> record type. What would be the best strategy to migrate this to Hive? I
>> was thinking of converting these files into one variable-length CSV file
>> and then importing them into a Hive table. The Hive table would consist
>> of 4 columns, with the 4th column holding a comma-separated list of the
>> values from column 4 to n. Are there other alternative or better
>> approaches for this solution? Appreciate any feedback on this.
>>
>> Thanks,
>> Nishanth
>
> Hive supports complex types like List, Map, and Struct, and they can be
> arbitrarily nested. If the nested data has a schema, that may be your
> best option, potentially using the thrift/avro/parquet/protobuf support.
>
> Otherwise you can store the data as JSON and at read time parse things
> out using JSON UDFs.
>
> Edward
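For the JSON alternative Edward mentions, a rough sketch using Hive's built-in JSON UDFs; the table and key names here are hypothetical:

-- Store each record as one raw JSON string.
CREATE TABLE raw_records (json STRING);

-- get_json_object extracts one JSONPath expression per call:
SELECT get_json_object(json, '$.record_type') AS record_type,
       get_json_object(json, '$.vals[0]')     AS first_val
FROM raw_records;

-- json_tuple pulls several top-level keys in a single pass:
SELECT jt.record_type, jt.vals
FROM raw_records r
LATERAL VIEW json_tuple(r.json, 'record_type', 'vals') jt AS record_type, vals;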