Hi

I’m not sure how this will solve the issue you mentioned, but just for the 
fun of it, here is the code.

Dudu


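-- Use NUL as the record delimiter so that each file is read as a single record
-- (assuming the files contain no NUL bytes)
set textinputformat.record.delimiter='\0';
-- Read files in nested subdirectories of the table location as well
set hive.mapred.supports.subdirectories=true;
set mapred.input.dir.recursive=true;

-- External table over the directory tree: one row per file, whole content in txt
create external table if not exists files_ext (txt string) stored as textfile location '/tmp/t';

-- Target sequence file table: key = file name, val = file content
create table if not exists files (key string, val string) stored as sequencefile;
insert into files select input__file__name, * from files_ext;

-- Show the file name, content length and first line of each file
select key, length(val), regexp_extract(val, '(.*)\n', 1) as val_first_line from files;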

key                                                  length(val)  val_first_line
hdfs://quickstart.cloudera:8020/tmp/t/t1/t3/t4/xx01  447          Ring-ding-ding-ding-dingeringeding!
hdfs://quickstart.cloudera:8020/tmp/t/t1/t3/t4/xx02  364          Big blue eyes, pointy nose, chasing mice, and digging holes.
hdfs://quickstart.cloudera:8020/tmp/t/t1/t3/t5/xx03  321          Jacha-chacha-chacha-chow!
hdfs://quickstart.cloudera:8020/tmp/t/t1/t3/xx00     256          Dog goes woof, cat goes meow.
hdfs://quickstart.cloudera:8020/tmp/t/t2/xx05        258          You're my guardian angel hiding in the woods.
hdfs://quickstart.cloudera:8020/tmp/t/xx04           171          The secret of the fox, ancient mystery.
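
A quick sanity check that could follow the load, offered only as a sketch: it assumes no input file contains a NUL byte, so that the '\0' delimiter yields exactly one record per file. The aliases row_cnt and file_cnt are made up for illustration. If the two numbers differ, either a file was split into several records or the insert was run more than once (insert into appends to the table).

```sql
-- row_cnt and file_cnt are illustrative aliases, not part of the original code.
-- With one record per file and a single load, both counts should be equal.
select count(*) as row_cnt, count(distinct key) as file_cnt from files;
```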


From: Arun Patel [mailto:arunp.bigd...@gmail.com]
Sent: Friday, September 23, 2016 7:04 PM
To: user@hive.apache.org
Subject: HDFS small files to Sequence file using Hive

I'm trying to resolve the small files issue using Hive.

Is there a way to create an external table on a directory, extract the file 
name as 'key' and the file content as 'value', and write them to a sequence 
file table?

Or is there a better option in Hive?

Thank you

Arun
