Thanks for letting me know about the StressSource. I'll give that a try.

On Nov 9, 2012, at 8:10 AM, Brock Noland <br...@cloudera.com> wrote:
> Hi,
>
> For performance testing I highly recommend
> org.apache.flume.source.StressSource
>
> Perhaps try that?
>
> Brock
>
> On Thu, Nov 8, 2012 at 7:43 PM, Pankaj Gupta <pan...@brightroll.com> wrote:
> > Hi,
> >
> > What throughput can I expect when writing to the HDFS Sink? Here is
> > the Flume config I'm using:
> >
> > # in this case called 'agent'
> >
> > # Define a memory channel called ch1 on agent1
> > agent1.channels.ch1.type = memory
> >
> > # Define an exec source called exec-source1 on agent1
> > # and connect it to channel ch1.
> > agent1.sources.exec-source1.channels = ch1
> > agent1.sources.exec-source1.type = exec
> > agent1.sources.exec-source1.restart = true
> > agent1.sources.exec-source1.batchSize = 100
> > agent1.sources.exec-source1.command = /home/ubuntu/flume/linesource.sh
> >
> > # Define an HDFS sink called hdfs-sink1
> > # and connect it to the other end of the same channel.
> > agent1.sinks.hdfs-sink1.channel = ch1
> > agent1.sinks.hdfs-sink1.type = hdfs
> > agent1.sinks.hdfs-sink1.hdfs.path = hdfs://ip-10-000-000-000.ec2.internal/user/ubuntu/event
> > agent1.sinks.hdfs-sink1.hdfs.filePrefix = event
> > agent1.sinks.hdfs-sink1.hdfs.writeFormat = Text
> > agent1.sinks.hdfs-sink1.hdfs.rollInterval = 60
> > agent1.sinks.hdfs-sink1.hdfs.rollCount = 0
> > agent1.sinks.hdfs-sink1.hdfs.rollSize = 0
> > agent1.sinks.hdfs-sink1.hdfs.fileType = DataStream
> > agent1.sinks.hdfs-sink1.hdfs.batchSize = 1000
> >
> > # Finally, now that we've defined all of our components, tell
> > # agent1 which ones we want to activate.
> > agent1.channels = ch1
> > agent1.sources = exec-source1
> > agent1.sinks = hdfs-sink1
> >
> > So far I only get about 20 Mb/min, or less than 1 Mb/sec. I am wondering
> > how far it can be improved. Is there any benchmark on HDFS Sink performance?
> >
> > Thanks in advance,
> > Pankaj
>
> --
> Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/
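
For reference, a minimal sketch of what swapping the exec source for StressSource might look like, reusing the agent1 and ch1 names from the config quoted above. The source name stress-source1 and the size and maxTotalEvents values are illustrative assumptions, not taken from this thread, and the property names should be checked against the Flume User Guide for the version in use.

```
# Hypothetical replacement for exec-source1; the name and values are placeholders.
agent1.sources = stress-source1
agent1.sources.stress-source1.type = org.apache.flume.source.StressSource
agent1.sources.stress-source1.channels = ch1
# Payload size of each generated event, in bytes (assumed value).
agent1.sources.stress-source1.size = 1024
# Stop after this many generated events (assumed value).
agent1.sources.stress-source1.maxTotalEvents = 1000000
```

Since StressSource generates its events inside the agent, a run like this should help show whether the exec source or the HDFS sink is the limiting factor.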