Marcin is correct: either split the gzip file into smaller files of at least one HDFS block each, or use bzip2, which supports splittable block compression. What is the original format of the table?
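For example, something like the following shell sketch (the 128 MB block size and the paths are assumptions; adjust them to your cluster):

# Option 1: recompress as bzip2, which Hadoop can split on compression blocks:
zcat my_table.dat.gz | bzip2 > my_table.dat.bz2

# Option 2: split the uncompressed stream into roughly block-sized pieces
# (split -C keeps lines intact; 128m assumes a 128 MB HDFS block size),
# then gzip each piece so every file gets its own mapper:
zcat my_table.dat.gz | split -C 128m - my_table_part_
gzip my_table_part_*

# Stage the pieces in one directory so Hive can read them in parallel:
hdfs dfs -mkdir -p /var/lib/txt/my_table
hdfs dfs -put my_table_part_*.gz /var/lib/txt/my_table/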
> On 22 Jun 2016, at 01:50, Marcin Tustin <mtus...@handybook.com> wrote:
>
> This is because a GZ file is not splittable at all. Basically, try creating
> this from an uncompressed file, or even better, split up the file and put the
> files in a directory in hdfs/s3/whatever.
>
>> On Tue, Jun 21, 2016 at 7:45 PM, @Sanjiv Singh <sanjiv.is...@gmail.com>
>> wrote:
>> Hi,
>>
>> I have a big compressed data file my_table.dat.gz (approx. size 100 GB).
>>
>> # load staging table STAGE_my_table from file my_table.dat.gz
>>
>> HIVE>> LOAD DATA INPATH '/var/lib/txt/my_table.dat.gz' OVERWRITE INTO TABLE
>> STAGE_my_table;
>>
>> # insert into ORC table "my_table"
>>
>> HIVE>> INSERT INTO TABLE my_table SELECT * FROM STAGE_my_table;
>> ....
>> INFO : Map 1: 0(+1)/1 Reducer 2: 0/1
>> ....
>>
>> The insertion into the ORC table has been going on for 5-6 hours. It seems
>> everything is running sequentially, with one mapper reading the complete
>> file.
>>
>> Please suggest how I can improve the ORC table load.
>>
>> Regards
>> Sanjiv Singh
>> Mob : +091 9990-447-339
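With the split files staged in a directory, one way to parallelize Sanjiv's ORC load is to point an external text table at that directory and run the same INSERT; each gzip file then becomes its own map task. A sketch, assuming tab-delimited data (id and name are placeholders for my_table's real columns):

hive -e "
CREATE EXTERNAL TABLE STAGE_my_table (id INT, name STRING)  -- placeholder columns
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'              -- assumed delimiter
LOCATION '/var/lib/txt/my_table/';

-- One map task per gzip file, so the ORC write runs in parallel
-- instead of through a single mapper.
INSERT INTO TABLE my_table SELECT * FROM STAGE_my_table;
"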