Yep. The disk space calls require an NFS call for each write, and that slows things down a lot.
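
A minimal sketch for measuring that cost on a given mount, assuming the free-space check behaves like java.io.File.getUsableSpace() (an assumption; the thread does not name the exact call). The class and method names here are hypothetical and not part of Flume. Pointing it at the NFS-backed directory and then at a local one should show the per-call difference.

```java
import java.io.File;

public class UsableSpaceTimer {

    /** Returns the average time in nanoseconds of one getUsableSpace() call. */
    static long averageNanos(File dir, int calls) {
        long sink = 0; // accumulate results so the calls are not treated as dead code
        long start = System.nanoTime();
        for (int i = 0; i < calls; i++) {
            // On a network file system this query is typically answered by the server.
            sink += dir.getUsableSpace();
        }
        long elapsed = System.nanoTime() - start;
        if (sink < 0) {
            System.err.println("unexpected negative total: " + sink);
        }
        return elapsed / calls;
    }

    public static void main(String[] args) {
        // First argument: directory to probe; defaults to the JVM temp directory.
        File dir = new File(args.length > 0 ? args[0] : System.getProperty("java.io.tmpdir"));
        int calls = 1000;
        System.out.printf("%s: %d ns per getUsableSpace() call%n",
                dir.getAbsolutePath(), averageNanos(dir, calls));
    }
}
```

If the number for the NFS directory is in the millisecond range, a check on every put or take would by itself account for a large slowdown at the batch sizes mentioned in the thread.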
--
Hari Shreedharan

On Tuesday, December 18, 2012 at 8:43 AM, Brock Noland wrote:

> We'd need those thread dumps to help confirm, but I bet that FLUME-1609
> results in an NFS call on each operation on the channel.
>
> If that is true, that would explain why it works well on local disk.
>
> Brock
>
> On Tue, Dec 18, 2012 at 10:17 AM, Brock Noland <br...@cloudera.com> wrote:
> > Hi,
> >
> > Hmm, yes in general performance is not going to be great over NFS, but
> > there haven't been any FC changes that stick out here.
> >
> > Could you take 10 thread dumps of the agent running the file channel
> > and 10 thread dumps of the agent sending data to the agent with the
> > file channel? (You can address them to me directly since the list
> > won't take attachments.)
> >
> > Are there any patterns, like it works for 40 seconds then times out
> > and then works for 39 seconds, etc?
> >
> > Brock
> >
> > On Tue, Dec 18, 2012 at 10:07 AM, Rakos, Rudolf
> > <rudolf.ra...@morganstanley.com> wrote:
> > > Hi,
> > >
> > > We’ve run into a strange problem regarding NFS and File Channel
> > > performance while evaluating the new version of Flume.
> > >
> > > We had no issues with the previous version (1.2.0).
> > >
> > > Our configuration looks like this:
> > >
> > > · Node1:
> > >   (Avro RPC Clients ->) Avro Source and Custom Sources -> File Channel
> > >   -> Avro Sink (-> Node 2)
> > >
> > > · Node2:
> > >   (Node1s ->) Avro Source -> File Channel -> Custom Sink
> > >
> > > Both the checkpoint and the data directories of the File Channels are
> > > on NFS shares. We use the same share for checkpoint and data
> > > directories, but different shares for each Node. Unfortunately it is
> > > not an option for us to use local directories.
> > >
> > > The events are about 1 KB each, and the batch sizes are the following:
> > >
> > > · Avro RPC Clients: 1000
> > > · Custom Sources: 2000
> > > · Avro Sink: 5000
> > > · Custom Sink: 10000
> > >
> > > We are experiencing very slow File Channel performance compared to the
> > > previous version, and a high number of timeouts (almost always) in the
> > > Avro RPC Clients and the Avro Sink.
> > >
> > > Something like this:
> > >
> > > · 2012-12-18 15:43:31,828
> > > [SinkRunner-PollingRunner-ExceptionCatchingSinkProcessor] WARN
> > > org.apache.flume.sink.AvroSink - Failed to send event batch
> > > org.apache.flume.EventDeliveryException: NettyAvroRpcClient { host: ***, port: *** }: Failed to send batch
> > >     at org.apache.flume.api.NettyAvroRpcClient.appendBatch(NettyAvroRpcClient.java:236) ~[flume-ng-sdk-1.3.0.jar:1.3.0]
> > >     ***
> > >     at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147) [flume-ng-core-1.3.0.jar:1.3.0]
> > >     at java.lang.Thread.run(Thread.java:662) [na:1.6.0_31]
> > > Caused by: org.apache.flume.EventDeliveryException: NettyAvroRpcClient { host: ***, port: *** }: Handshake timed out after 20000ms
> > >     at org.apache.flume.api.NettyAvroRpcClient.appendBatch(NettyAvroRpcClient.java:280) ~[flume-ng-sdk-1.3.0.jar:1.3.0]
> > >     at org.apache.flume.api.NettyAvroRpcClient.appendBatch(NettyAvroRpcClient.java:224) ~[flume-ng-sdk-1.3.0.jar:1.3.0]
> > >     ... 5 common frames omitted
> > > Caused by: java.util.concurrent.TimeoutException: null
> > >     at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:228) ~[na:1.6.0_31]
> > >     at java.util.concurrent.FutureTask.get(FutureTask.java:91) ~[na:1.6.0_31]
> > >     at org.apache.flume.api.NettyAvroRpcClient.appendBatch(NettyAvroRpcClient.java:278) ~[flume-ng-sdk-1.3.0.jar:1.3.0]
> > >     ... 6 common frames omitted
> > >
> > > (I had to remove some details, sorry for that.)
> > >
> > > We managed to narrow down the root cause of the issue to the File
> > > Channel, because:
> > >
> > > · Everything works fine if we switch to the Memory Channel or to the
> > >   Old File Channel (1.2.0).
> > >
> > > · Everything works fine if we use local directories.
> > >
> > > We’ve tested this on multiple different PCs (both Windows and Linux).
> > >
> > > I spent the day debugging and profiling, but I could not find anything
> > > worth mentioning (nothing with excessive CPU usage, no threads are
> > > waiting too much, etc…). The only problem is that File Channel takes
> > > and puts take way more time than with the previous version.
> > >
> > > Could someone please try the File Channel on an NFS share?
> > >
> > > Does anyone have similar issues?
> > >
> > > Thank you for your help.
> > >
> > > Regards,
> > >
> > > Rudolf
> > >
> > > Rudolf Rakos
> > > Morgan Stanley | ISG Technology
> > > Lechner Odon fasor 8 | Floor 06
> > > Budapest, 1095
> > > Phone: +36 1 881-4011
> > > rudolf.ra...@morganstanley.com
> >
> > --
> > Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/