What if it isn't a Hive table, just an HDFS folder? Can I create a temporary folder and then merge, or somehow use the API that invokes the merge job?
> From: ginz...@hotmail.com
> To: dev@hive.apache.org
> Subject: Small file problem and GenMRFileSink1
> Date: Wed, 29 Jun 2011 15:33:44 +0000
>
> Hi,
> I'm not sure whether this belongs on hive-dev or hive-user.
> I have a folder with many small files.
> I would like to reduce the number of files the way Hive merges output.
> I tried to understand from the source of
> org.apache.hadoop.hive.ql.optimizer.GenMRFileSink1 how to leverage the API to
> submit a job that merges output files.
> I think I was able to identify:
>   private void createMergeJob(FileSinkOperator fsOp, GenMRProcContext ctx,
>       String finalName) throws SemanticException
> as the entry point to the logic that performs the operation, but I did not
> find documentation on how to use it.
>
> Is there an example that simulates the use of this API call?