RE: Small file problem and GenMRFileSink1

2011-06-30 Thread David Ginzburg
What if it isn't a Hive table, just an HDFS folder? Can I create a temporary folder and then merge, or somehow use the API that invokes the merge job? > From: ginz...@hotmail.com > To: dev@hive.apache.org > Subject: Small file problem and GenMRFileSink1 > Date: Wed, 29 Jun 20

Re: Small file problem and GenMRFileSink1

2011-06-30 Thread Ning Zhang
If you are using Hive trunk and your table is stored in RCFile format, you can run alter table src_rc_merge_test concatenate; On Jun 30, 2011, at 9:53 AM, David Ginzburg wrote: > > > Hi, > I'm not sure whether this belongs on hive-dev or hive-user. > I have a folder with many small file

FW: Small file problem and GenMRFileSink1

2011-06-30 Thread David Ginzburg
Hi, I'm not sure whether this belongs on hive-dev or hive-user. I have a folder with many small files. I would like to reduce the number of files the way Hive merges output. I tried to understand from the source of org.apache.hadoop.hive.ql.optimizer.GenMRFileSink1 how to leverage the

Small file problem and GenMRFileSink1

2011-06-29 Thread David Ginzburg
Hi, I'm not sure whether this belongs on hive-dev or hive-user. I have a folder with many small files. I would like to reduce the number of files the way Hive merges output. I tried to understand from the source of org.apache.hadoop.hive.ql.optimizer.GenMRFileSink1 how to leverage the API