Yep. The disk space calls require an NFS call for each write, and that slows
things down a lot.
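
As an illustration of the cost described above, here is a minimal sketch of caching the free-space reading so the filesystem is queried at most once per interval rather than on every write. This is not Flume's code: the class name `CachedUsableSpace`, its methods, and the refresh interval are all hypothetical. The only real API used is `java.io.File.getUsableSpace()`.

```java
import java.io.File;
import java.util.function.LongSupplier;

/**
 * Hypothetical helper (not part of Flume): remembers the last usable-space
 * reading and only re-queries the filesystem after the interval has elapsed.
 */
final class CachedUsableSpace {
    private final LongSupplier probe;   // the expensive call, e.g. dir::getUsableSpace
    private final long intervalNanos;   // minimum time between two real probes
    private long lastValue;
    private long lastProbeNanos;
    private boolean primed;             // false until the first probe has run

    CachedUsableSpace(LongSupplier probe, long intervalMillis) {
        this.probe = probe;
        this.intervalNanos = intervalMillis * 1_000_000L;
    }

    /** Convenience factory: on an NFS mount each probe may cost a network round trip. */
    static CachedUsableSpace forDirectory(File dir, long intervalMillis) {
        return new CachedUsableSpace(dir::getUsableSpace, intervalMillis);
    }

    /** Returns the cached value, refreshing it only when it is older than the interval. */
    synchronized long get() {
        long now = System.nanoTime();
        if (!primed || now - lastProbeNanos >= intervalNanos) {
            lastValue = probe.getAsLong();
            lastProbeNanos = now;
            primed = true;
        }
        return lastValue;
    }

    /** True if the (possibly cached) usable space is at least requiredBytes. */
    synchronized boolean hasAtLeast(long requiredBytes) {
        return get() >= requiredBytes;
    }
}
```

The trade-off is that the reading can be stale by up to one interval, so any minimum-space threshold would need enough headroom to cover what can be written in that time.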
--
Hari Shreedharan
On Tuesday, December 18, 2012 at 8:43 AM, Brock Noland wrote:
We'd need those thread dumps to help confirm, but I bet that FLUME-1609
results in an NFS call on each operation on the channel.
If that is true, that would explain why it works well on local disk.
Brock
On Tue, Dec 18, 2012 at 10:17 AM, Brock Noland <br...@cloudera.com> wrote:
Hi,
Hmm, yes in general performance is not going to be great over NFS, but
there haven't been any FC changes that stick out here.
Could you take 10 thread dumps of the agent running the file channel
and 10 thread dumps of the agent sending data to the agent with the
file channel? (You can address them to me directly since the list
won't take attachments.)
Are there any patterns, like it works for 40 seconds then times out
and then works for 39 seconds, etc?
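
One way to look for such a pattern is to measure the time between consecutive timeout entries in the agent log. The sketch below is only an illustration under assumptions: the class `TimeoutGaps` is hypothetical, it assumes each log entry sits on a single line that starts with a timestamp in the `yyyy-MM-dd HH:mm:ss,SSS` layout, and the marker text to search for is passed in by the caller.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: gaps between consecutive log lines that contain a marker. */
final class TimeoutGaps {
    // Assumed timestamp layout, e.g. "2012-12-18 15:43:31,828"
    private static final DateTimeFormatter TS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");
    private static final int TS_LENGTH = 23;

    /** Returns the gap in milliseconds between each matching line and the previous match. */
    static List<Long> gapsMillis(List<String> logLines, String marker) {
        List<Long> gaps = new ArrayList<>();
        LocalDateTime previous = null;
        for (String line : logLines) {
            if (line.length() < TS_LENGTH || !line.contains(marker)) {
                continue; // not a matching entry
            }
            LocalDateTime current;
            try {
                current = LocalDateTime.parse(line.substring(0, TS_LENGTH), TS);
            } catch (DateTimeParseException e) {
                continue; // line does not start with a timestamp (e.g. a stack frame)
            }
            if (previous != null) {
                gaps.add(Duration.between(previous, current).toMillis());
            }
            previous = current;
        }
        return gaps;
    }
}
```

If the returned gaps cluster around one value, that points to something periodic rather than random slowness.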
Brock
On Tue, Dec 18, 2012 at 10:07 AM, Rakos, Rudolf
<rudolf.ra...@morganstanley.com> wrote:
Hi,
We’ve run into a strange problem regarding NFS and File Channel performance
while evaluating the new version of Flume.
We had no issues with the previous version (1.2.0).
Our configuration looks like this:
· Node1: (Avro RPC Clients ->) Avro Source and Custom Sources -> File Channel -> Avro Sink (-> Node2)
· Node2: (Node1s ->) Avro Source -> File Channel -> Custom Sink
Both the checkpoint and the data directories of the File Channels are on NFS
shares. We use the same share for checkpoint and data directories, but
different shares for each Node. Unfortunately it is not an option for us to
use local directories.
The events are about 1 KB each, and the batch sizes are as follows:
· Avro RPC Clients: 1000
· Custom Sources: 2000
· Avro Sink: 5000
· Custom Sink: 10000
We are experiencing very slow File Channel performance compared to the
previous version, and a high number of timeouts (almost always) in the Avro
RPC Clients and the Avro Sink.
Something like this:
· 2012-12-18 15:43:31,828 [SinkRunner-PollingRunner-ExceptionCatchingSinkProcessor] WARN org.apache.flume.sink.AvroSink - Failed to send event batch
org.apache.flume.EventDeliveryException: NettyAvroRpcClient { host: ***, port: *** }: Failed to send batch
    at org.apache.flume.api.NettyAvroRpcClient.appendBatch(NettyAvroRpcClient.java:236) ~[flume-ng-sdk-1.3.0.jar:1.3.0]
    ***
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147) [flume-ng-core-1.3.0.jar:1.3.0]
    at java.lang.Thread.run(Thread.java:662) [na:1.6.0_31]
Caused by: org.apache.flume.EventDeliveryException: NettyAvroRpcClient { host: ***, port: *** }: Handshake timed out after 20000ms
    at org.apache.flume.api.NettyAvroRpcClient.appendBatch(NettyAvroRpcClient.java:280) ~[flume-ng-sdk-1.3.0.jar:1.3.0]
    at org.apache.flume.api.NettyAvroRpcClient.appendBatch(NettyAvroRpcClient.java:224) ~[flume-ng-sdk-1.3.0.jar:1.3.0]
    ... 5 common frames omitted
Caused by: java.util.concurrent.TimeoutException: null
    at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:228) ~[na:1.6.0_31]
    at java.util.concurrent.FutureTask.get(FutureTask.java:91) ~[na:1.6.0_31]
    at org.apache.flume.api.NettyAvroRpcClient.appendBatch(NettyAvroRpcClient.java:278) ~[flume-ng-sdk-1.3.0.jar:1.3.0]
    ... 6 common frames omitted
(I had to remove some details, sorry for that.)
We managed to narrow down the root cause of the issue to the File Channel,
because:
· Everything works fine if we switch to the Memory Channel or to the
Old File Channel (1.2.0).
· Everything works fine if we use local directories.
We’ve tested this on multiple different PCs (both Windows and Linux).
I spent the day debugging and profiling, but I could not find anything worth
mentioning (no excessive CPU usage, no threads waiting for long, etc.). The
only problem is that File Channel take and put operations take far longer
than with the previous version.
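
A rough way to compare a local directory with an NFS mount, independent of Flume, is to time a couple of basic filesystem operations against each. The sketch below is an assumption-laden illustration: `DirLatencyProbe` and its methods are hypothetical names, and it only measures free-space queries and small synced appends, not the File Channel itself. Note that `java.nio.channels.FileChannel` here is the JDK class, unrelated to Flume's File Channel.

```java
import java.io.File;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical probe: average latency of basic filesystem calls in a directory. */
final class DirLatencyProbe {

    /** Average nanoseconds per File.getUsableSpace() call on the given directory. */
    static long avgUsableSpaceNanos(File dir, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            dir.getUsableSpace(); // result ignored, only the call cost matters here
        }
        return (System.nanoTime() - start) / iterations;
    }

    /** Average nanoseconds per 1 KB append followed by force(false) in the directory. */
    static long avgWriteAndSyncNanos(Path dir, int iterations) throws IOException {
        Path file = Files.createTempFile(dir, "probe", ".dat");
        byte[] payload = new byte[1024]; // roughly the event size mentioned in the thread
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.WRITE)) {
            long start = System.nanoTime();
            for (int i = 0; i < iterations; i++) {
                ByteBuffer buffer = ByteBuffer.wrap(payload);
                while (buffer.hasRemaining()) {
                    channel.write(buffer);
                }
                channel.force(false); // flush file content, not metadata
            }
            return (System.nanoTime() - start) / iterations;
        } finally {
            Files.deleteIfExists(file); // the channel is already closed at this point
        }
    }
}
```

Running both methods once against a local directory and once against the NFS share, with the same iteration count, gives a per-call figure for each that can be compared directly.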
Could someone please try the File Channel on an NFS share?
Does anyone have similar issues?
Thank you for your help.
Regards,
Rudolf
Rudolf Rakos
Morgan Stanley | ISG Technology
Lechner Odon fasor 8 | Floor 06
Budapest, 1095
Phone: +36 1 881-4011
rudolf.ra...@morganstanley.com
--
Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/