Re: problem with qpid heartbeats when sending msgs with size over 1KB

Gordon Sim Wed, 07 Dec 2011 06:38:24 -0800

On 12/07/2011 01:55 AM, Tom M wrote:

Hello,


we are having a problem with our MRG (qpid) system:

* when sending messages with size of 1600bytes, a connection (used for
sending from client) does not detect the host connection is lost via
heartbeat timeout.

+ we are using C++ qpid client 0.7 and qpidd 0.7 (linux 2.6 x86_64 on both
client and broker hosts) and an Ethernet connection (TCP/IP) between hosts

     + for this connection we have: ConnectionSettings
connectionSettings.heartbeat = 8

     + simulating a system failure by pulling the ethernet cable to the
broker host

     + the connection close Exception is caught by the client after many
minutes (6 to 20mins), I'm guessing this is due to the TCP timeout and not
the missed heartbeats.

     + with the same exact application (for our client), if sending messages
of 200bytes, we do get the qpid exception indicating the Connection closed
(catch TransportFailure Exception: connection closed) within 16 seconds.
For this testing, there were no other changes between the 2 cases, other
than the size of the messages sent from the client (only expanded the size
of the string in the body of the message) (1 message sent per second in
both cases).

* is this a known problem with qpid 0.7?


No, I don't think this is a known issue.

* is there a patch to fix this for qpid 0.7?

* has this problem already been fixed in later releases?

NOTE: we have already deployed qpid 0.7 in our system, and we will not be
able to upgrade to a newer full release for many months.

I'm wondering if the problem is that the connection gets blocked with the
first TCP packet of a multiple packet message, such that the heartbeat
detection is disabled until the full message is sent. But, if the
multi-packet message can not complete (since socket is broken), the
heartbeat logic is held disabled until the multi-packet message can
complete (which in this case it can not).

There is nothing that directly (intentionally) does anything like this. However, it may be possible that there is some deadlock or liveness issue that prevents correct function in some cases.

Is the test always failing with the larger message size? There is actually no difference in the AMQP framing for a 200 byte vs. a 1600 byte message. It may just be that the different timing of the larger write somehow triggers the issue.

Can you get trace level logs and a thread dump from the client for a failed case?
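
For reference, the behaviour expected in the working case can be sketched as a small watchdog that looks only at inbound traffic. This is an illustration, not the qpid implementation: the class HeartbeatWatchdog and its methods are hypothetical names, and the timeout of twice the heartbeat interval is an assumption chosen to match the 16 seconds reported above for heartbeat = 8. The point is that expiry depends on what has been received, so a write that cannot complete should not by itself hold off detection.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical watchdog for illustration only; not part of the qpid API.
class HeartbeatWatchdog {
public:
    using Clock = std::chrono::steady_clock;

    // heartbeatSeconds corresponds to the heartbeat setting (8 in the report).
    // Assumption: the connection is considered dead after 2x that interval.
    HeartbeatWatchdog(int heartbeatSeconds, Clock::time_point now)
        : timeout_(std::chrono::seconds(2 * heartbeatSeconds)),
          lastReceived_(now) {
        assert(heartbeatSeconds > 0);
    }

    // Call whenever any frame (data or heartbeat) arrives from the peer.
    void onTrafficReceived(Clock::time_point now) { lastReceived_ = now; }

    // True once nothing has been received for the full timeout.
    // Only inbound traffic is consulted; pending outbound writes are ignored.
    bool expired(Clock::time_point now) const {
        return now - lastReceived_ >= timeout_;
    }

    std::chrono::seconds timeout() const { return timeout_; }

private:
    std::chrono::seconds timeout_;
    Clock::time_point lastReceived_;
};
```

If the client's real timer behaves like this, a pulled cable should be noticed after about 16 seconds regardless of message size, which is why the trace logs and thread dump are needed to see what is actually blocking in the failed case.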


---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:[email protected]
