Gordon,
thank you for your response.

Yes, the test with the larger message size has failed every time.
I will look into sending the trace logs and a thread dump from the client
on our system.
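
For reference, the timing we expect can be sketched as below. This is only
an illustration and does not use the qpid API: heartbeatExpired is a
hypothetical helper, and the rule that the peer is treated as lost after two
heartbeat intervals without traffic is an assumption on my part (it matches
the 16 seconds we see with heartbeat = 8 in the 200 byte case).

```cpp
#include <chrono>

// Hypothetical helper, not part of the qpid client library.
// Assumption: the peer is considered lost once no traffic has been
// seen for two full heartbeat intervals.
bool heartbeatExpired(std::chrono::seconds heartbeat,
                      std::chrono::seconds sinceLastTraffic) {
    return sinceLastTraffic >= 2 * heartbeat;
}
```

Under that assumption, with heartbeat = 8 the check would trip at 16 seconds,
far sooner than the 6 to 20 minutes we observe in the 1600 byte case.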

thanks,
Tom Maggio
Raytheon Co.
Dallas, TX
972/205-4377

On Wed, Dec 7, 2011 at 8:37 AM, Gordon Sim <[email protected]> wrote:

> On 12/07/2011 01:55 AM, Tom M wrote:
>
>> Hello,
>>
>> we are having a problem with our MRG (qpid) system:
>>
>> * when sending messages with a size of 1600 bytes, a connection (used for
>> sending from client) does not detect the host connection is lost via
>> heartbeat timeout.
>>
>> + we are using C++ qpid client 0.7 and qpidd 0.7 (linux 2.6 x86_64 on both
>> client and broker hosts)
>>
>> and Ethernet connection (TCP/IP) between hosts
>>
>>     + for this connection we have: ConnectionSettings
>> connectionSettings.heartbeat = 8
>>
>>     + simulating a system failure by pulling the ethernet cable to the
>> broker host
>>
>>     + the connection close Exception is caught by the client after many
>> minutes (6 to 20 minutes). I'm guessing this is due to the TCP timeout and
>> not the missed heartbeats.
>>
>>     + with the same exact application (for our client), if sending
>> messages of 200 bytes, we do get the qpid exception indicating the
>> Connection closed
>> (catch TransportFailure Exception: connection closed) within 16 seconds.
>> For this testing, there were no other changes between the 2 cases, other
>> than the size of the messages sent from the client (only expanded the size
>> of the string in the body of the message) (1 message sent per second in
>> both cases).
>>
>> * is this a known problem with qpid 0.7?
>>
>
> No, I don't think this is a known issue.
>
>
>> * is there a patch to fix this for qpid 0.7?
>>
>> * has this problem already been fixed in later releases?
>>
>> NOTE: we have already deployed qpid 0.7 in our system, and we will not be
>> able to upgrade to a newer full release for many months.
>>
>> I'm wondering if the problem is that the connection gets blocked with the
>> first TCP packet of a multiple packet message, such that the heartbeat
>> detection is disabled until the full message is sent. But, if the
>> multi-packet message can not complete (since socket is broken), the
>> heartbeat logic is held disabled until the multi-packet message can
>> complete (which in this case it can not).
>>
>
> There is nothing that directly (intentionally) does anything like this.
> However it may be possible that there is some deadlock or liveness issue
> that prevents correct function in some cases.
>
> Is the test always failing with the larger message size? There is actually
> no difference in the AMQP framing for a 200 byte vs. a 1600 byte message. It
> may just be that the different timing of the larger write somehow triggers
> the issue.
>
> Can you get trace level logs and a thread dump from the client for a
> failed case?
>
> ---------------------------------------------------------------------
> Apache Qpid - AMQP Messaging Implementation
> Project:      http://qpid.apache.org
> Use/Interact: mailto:users-subscribe@qpid.apache.org
>
>
