Re: Hadoop streaming with insert dynamic partition generate many small files

2014-02-03 Thread Chen Wang
…understand how the columns (A.column1, …) are converted to (key, value)? Also, can you point me to some documents to read more about it? Thanks, Chandra …
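For readers of the archive: in Hive's streaming syntax, the MAP clause serializes the listed columns to the script's stdin as one tab-delimited text line per row, and parses each stdout line back into the AS columns (the text before the first tab becomes the first column, the remainder the second). A minimal illustration; the table and column names are assumptions, not taken from the thread:

    FROM table1 A
    MAP A.column1, A.column2   -- each row reaches stdin as "val1<TAB>val2\n"
    USING '/bin/cat'           -- identity script: echoes its input unchanged
    AS key, value;             -- stdout is split on the first tab back into
                               -- (key, value)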

Re: Hadoop streaming with insert dynamic partition generate many small files

2014-02-02 Thread Chen Wang
It seems that hive.exec.reducers.bytes.per.reducer was still not big enough: I added another 0, and now I get only one file under each partition. On Sun, Feb 2, 2014 at 10:14 PM, Chen Wang wrote: Hi, I am using a Java reducer reading from a table, and then writing to another one: …
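The setting being tuned controls how much input Hive assigns to each reducer; raising it means fewer reducers, and fewer reducers writing into a dynamic partition means fewer output files. A sketch of the session settings involved (the 10 GB value is illustrative; the thread only says another zero was appended):

    -- Fewer reducers => fewer output files per dynamic partition.
    SET hive.exec.reducers.bytes.per.reducer=10000000000;

    -- Merging small output files after the job is another common fix:
    SET hive.merge.mapredfiles=true;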

Hadoop streaming with insert dynamic partition generate many small files

2014-02-02 Thread Chen Wang
Hi, I am using a Java reducer to read from one table and then write to another one: FROM ( FROM ( SELECT column1, ... FROM table1 WHERE (partition > 6 AND partition < 12) ) A MAP A.co…
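The preview cuts off mid-query; filling in the shape from the classic Hive MAP/REDUCE streaming pattern, a sketch of what the full statement likely looked like (the second column, the output table, the partition column, and both script commands are assumptions):

    SET hive.exec.dynamic.partition=true;
    SET hive.exec.dynamic.partition.mode=nonstrict;

    FROM (
      FROM (
        SELECT column1, column2
        FROM table1
        WHERE (partition > 6 AND partition < 12)
      ) A
      MAP A.column1, A.column2
      USING '/bin/cat'                   -- identity mapper; assumed
      AS key, value
      CLUSTER BY key                     -- routes each key to one reducer
    ) M
    INSERT OVERWRITE TABLE table2 PARTITION (epoch)       -- dynamic partition
    REDUCE M.key, M.value
    USING 'java -cp job.jar MyReducer'   -- the thread's Java reducer; command assumed
    AS (column1, column2, epoch);        -- last output column feeds the partition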

Flume data to hive

2014-01-14 Thread Chen Wang
Hey guys, I am using Flume to sink data directly into my Hive table. However, there seems to be some schema inconsistency, and I am not sure how to troubleshoot it. I created a Hive table 'targeting'; it uses SequenceFile storage with Snappy compression and is partitioned by 'epoch'. After the table is crea…
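A table matching that description would look roughly like this; only the name, storage format, compression codec, and partition key come from the thread, while the data columns are placeholders:

    -- Snappy compression for SequenceFile output is a session setting:
    SET hive.exec.compress.output=true;
    SET mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;

    CREATE TABLE targeting (
      user_id STRING,                -- placeholder columns
      payload STRING
    )
    PARTITIONED BY (epoch BIGINT)
    STORED AS SEQUENCEFILE;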

Re: Help on loading data stream to hive table.

2014-01-07 Thread Chen Wang
…k. Alan. On Jan 6, 2014, at 6:26 PM, Chen Wang wrote: Alan, the problem is that the data is partitioned by epoch, ten hourly, and I want all data belonging to that partition to be written into one file named after that partition. How can I share the f…

Re: Help on loading data stream to hive table.

2014-01-06 Thread Chen Wang
You can then close the files every 15 minutes (or whatever works for you) and have a separate job that creates a new partition in your Hive table with the files created by your bolts. Alan. On Jan 2, 2014, at 11:58 AM, Chen Wang wrote: Guys, …
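The "separate job" Alan describes usually reduces to registering the directory of closed files as a new partition. A minimal sketch; the table name, HDFS path, and epoch value are placeholders, not from the thread:

    -- Point a new partition at the HDFS directory the bolts wrote.
    ALTER TABLE targeting
      ADD IF NOT EXISTS PARTITION (epoch=1388692800)
      LOCATION '/flume/targeting/epoch=1388692800';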

Help on loading data stream to hive table.

2014-01-02 Thread Chen Wang
Guys, I am using Storm to read a data stream from our socket server, entry by entry, and then write the entries to files: one entry per file. At some point I need to import the data into my Hive table. There are several approaches I can think of: 1. directly write to the Hive HDFS file whenever I get the ent…
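For the import step, the standard bulk route is LOAD DATA, which moves already-staged HDFS files under the table's partition directory. A sketch with an assumed staging path and partition value:

    -- Move files staged by the Storm topology into a table partition.
    LOAD DATA INPATH '/staging/entries/batch-001'
      INTO TABLE targeting
      PARTITION (epoch=1388692800);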