[ 
https://issues.apache.org/activemq/browse/SM-625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_40375
 ] 

Oleg Zhurakousky commented on SM-625:
-------------------------------------

This is another test that makes no assertions. However, it is pretty 
obvious what it is trying to do, and I think I can fix it. But before I do, 
here is a quick summary of what it does:

Starts two containers with two flows each (JCA and JMS), defines multiple 
Senders and Receivers and then attempts to send a batch of 100 messages using 
all possible combinations of Senders and Receivers.

This test is unstable because it only works once the Demand Forwarding Bridge 
between the two containers is established. I removed Thread.sleep(5000), added 
some logging to the class in question, and traced the resulting exception. 
Here is what I got from EndpointResolverSupport.resolveEndpoint(..):
22:57:30,484 | INFO  | main | EndpointResolverSupport  | 
resolver.EndpointResolverSupport   44 | ==> ServiceEndpoint[] size: 0

Here is the output when I put the Thread.sleep(5000) back and gave both 
containers enough time to recognize one another's existence:

22:59:08,000 | INFO  | main             | EndpointResolverSupport          | 
resolver.EndpointResolverSupport   44 | ==> ServiceEndpoint[] size: 1
22:59:08,000 | INFO  | main             | EndpointResolverSupport          | 
resolver.EndpointResolverSupport   46 | ==> Endpoint: 
ServiceEndpoint[service=remoteReceiver,endpoint=remoteReceiver]

Obviously there are several ways of fixing it, but I also suspect that it could 
be something bigger, since the same error will occur if Node 1 has Service A 
and Node 2 has Service B as the destination service of Service A. If Node 2 is 
not up, you are going to get an error. However, we have to realize that there 
are many reasons for Node 2 (in this scenario) not to be up. Node 2 might not 
be up because no one has turned it "on" yet, OR because it is still coming up 
and perhaps we should give it some extra time to boot. In other words, I see 
the need for retry logic, with an externally configurable timeout, around this 
line of code in EndpointResolverSupport:

ServiceEndpoint[] endpoints = resolveAvailableEndpoints(context, exchange);

This will fix this problem, but most importantly we can provide a more explicit 
message about what happened once we go past the timeout. The message is 
already explicit: Failed to resolve endpoint: 
org.apache.servicemix.jbi.NoServiceAvailableException: Cannot find an instance 
of the service: remoteReceiver. But we can also output a potential reason, 
which could help the end user quickly realize the problem.
What do you think?
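
As a rough illustration of the retry idea (not the actual ServiceMix code), a 
minimal sketch in Java might look like the following. The class name 
EndpointRetry, the method resolveWithRetry, and the timeoutMillis and 
pollMillis parameters are all hypothetical. The lookup argument stands in for 
the resolveAvailableEndpoints(context, exchange) call, and the sketch assumes 
Java 8 or later for java.util.function.Supplier.

```java
import java.util.function.Supplier;

// Hypothetical helper, not part of ServiceMix: polls a lookup until it
// returns at least one element or the timeout elapses.
final class EndpointRetry {

    private EndpointRetry() {
    }

    /**
     * Calls lookup repeatedly until it yields a non-empty array or
     * timeoutMillis has passed. Returns the last result, which may be
     * empty or null if nothing became available in time; the caller
     * decides how to report that.
     */
    static <T> T[] resolveWithRetry(Supplier<T[]> lookup,
                                    long timeoutMillis,
                                    long pollMillis) throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            T[] found = lookup.get();
            if (found != null && found.length > 0) {
                return found; // at least one endpoint is visible
            }
            long remainingMillis = (deadline - System.nanoTime()) / 1_000_000L;
            if (remainingMillis <= 0) {
                return found; // timed out; hand back the empty result
            }
            // Sleep for the poll interval, but never past the deadline.
            Thread.sleep(Math.max(1L, Math.min(pollMillis, remainingMillis)));
        }
    }
}
```

With something like this, the resolver could wrap the existing call in a 
lambda, and on an empty result after the timeout it could still throw the 
current NoServiceAvailableException, adding a hint that the remote container 
may not be up yet or that the bridge has not been established.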



> Failed unit test (servicemix-core) : 
> org.apache.servicemix.jbi.nmr.flow.MultipleFlowsTest
> -----------------------------------------------------------------------------------------
>
>                 Key: SM-625
>                 URL: https://issues.apache.org/activemq/browse/SM-625
>             Project: ServiceMix
>          Issue Type: Sub-task
>          Components: servicemix-core
>    Affects Versions: 3.0
>            Reporter: Fritz Oconer
>             Fix For: 3.2
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
