How is your data partitioned, by date?

On Monday, January 20, 2014, Chen Wang <chen.apache.s...@gmail.com> wrote:
> Guys,
> I have Flume set up to write partitioned data to HDFS; each partition has
> its own folder. Is there a way to specify that all the data under one
> partition goes into one file?
> I am currently using
> MyAgent.sinks.HDFS.hdfs.batchSize = 10000
> MyAgent.sinks.HDFS.hdfs.rollSize = 15000000
> MyAgent.sinks.HDFS.hdfs.rollCount = 10000
> MyAgent.sinks.HDFS.hdfs.rollInterval = 360
>
> to make the file roll at 15 MB of data or after 6 minutes.
>
> Is this the best way to achieve my goal?
> Thanks,
> Chen
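
For reference, here is a minimal sketch of how the roll settings quoted above might be changed if the aim is one file per partition. Note that the quoted rollCount = 10000 also rolls the file after 10,000 events, in addition to the size and time triggers. The sketch assumes that data for a partition arrives without long gaps, that the agent is not restarted mid-partition, and that hdfs.idleTimeout is available in the Flume version in use. The idle timeout value is illustrative and does not come from the original message.

```
# Setting a roll trigger to 0 disables it, so the sink never rolls
# on size, event count, or elapsed time.
MyAgent.sinks.HDFS.hdfs.rollSize = 0
MyAgent.sinks.HDFS.hdfs.rollCount = 0
MyAgent.sinks.HDFS.hdfs.rollInterval = 0

# Assumption: close a file once it has received no writes for 10 minutes,
# i.e. once events have moved on to the next partition's folder.
MyAgent.sinks.HDFS.hdfs.idleTimeout = 600

# Unchanged from the quoted config: events written per flush to HDFS.
MyAgent.sinks.HDFS.hdfs.batchSize = 10000
```

With this setup, a pause in the stream longer than the idle timeout would still produce a second file in the same partition, so the timeout needs to be longer than the largest expected gap.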