Hi, Dag and All

Yes, we are using active-active (mode 7) for the bond.
 
VM A1 ---> VR A (Isolated Network A) ----> VR B (Isolated Network B) ----> VM B1



After rounds of isolation testing, based on packet analysis, it seems to us that:
    - the traffic between VM A1 and VR A is normal;
    - however, between VR A and VM B1, VR A receives acknowledgements from 
VM B1 for packets that VR A believes have not yet been sent through it;
    - VR A then resets the session, causing the traffic to drop (see the 
capture sketch below).
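
If anyone wants to reproduce the capture, here is a minimal sketch; eth2 and 
<VM_B1_IP> are placeholders for the VR's guest-facing interface and VM B1's 
address:

    # on VR A: watch live for RST segments in the VM B1 conversation
    tcpdump -n -i eth2 'host <VM_B1_IP> and tcp[tcpflags] & tcp-rst != 0'
    # or record the whole conversation for offline analysis
    tcpdump -i eth2 -w /tmp/vr-a.pcap 'host <VM_B1_IP>'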


For testing purposes, we turned off TSO (TCP segmentation offload) on the 
XenServer network adapters with the command 'ethtool -K eth0 tso off', and the 
issue is simply gone; we can run iperf for a couple of hours without any drops.
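
In case it helps others, a minimal sketch for checking the offload state and 
making the change persistent; eth1 and the UUID are placeholders, and the 
last step assumes the standard XenServer PIF other-config ethtool hooks:

    # query the current offload state
    ethtool -k eth0 | grep tcp-segmentation-offload
    # disable TSO on each bond member for the running system
    for nic in eth0 eth1; do ethtool -K $nic tso off; done
    # ask XenServer to reapply the setting after reboots
    xe pif-param-set uuid=<pif-uuid> other-config:ethtool-tso="off"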


Does this make sense?  Is there any improvement that could be implemented on the ACS side?


Thanks!




On 2019-02-22 at 23:20, "Haijiao" <18602198...@163.com> wrote:


Thanks Dag, you are always helpful!


We will look into what you shared and come back.







On 2019-02-22 at 17:26, "Dag Sonstebo" <dag.sonst...@shapeblue.com> wrote:

Hi Haijiao,

We've come across similar things in the past. In short - what is your XenServer 
bond mode? Is it active-active (mode 7) or LACP (mode 4)? (see 
https://support.citrix.com/article/CTX137599)

In short, if your switches don't keep up with MAC address changes on the XS 
hosts then you will get traffic flapping with intermittent loss of connectivity 
(the root cause is that a MAC address moves to another uplink, but the switch 
only checks for changes every X seconds, so it takes a while for it to catch 
up). LACP mode 4 has a much more robust mechanism for this but obviously needs 
to be configured at both the XS and switch ends. Normal active-active (mode 7) 
seems to always cause problems.
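
To see what the bond is actually doing, a quick sketch (assuming the default 
vSwitch backend on XS 7.x):

    # bond mode and link state as XenServer sees it
    xe bond-list params=uuid,mode,links-up
    # per-member state and SLB MAC rebalancing on the vSwitch
    ovs-appctl bond/show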

My general advice would be to simplify and just go active-passive (mode 1) - 
unless you really need the bandwidth, this gives you a much more stable network 
backend.
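
If you go that route, a one-line sketch (the UUID is a placeholder - look it up 
with xe bond-list first, and do this in a maintenance window):

    xe bond-set-mode uuid=<bond-uuid> mode=active-backup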

Regards,
Dag Sonstebo
Cloud Architect
ShapeBlue


On 22/02/2019, 07:14, "Haijiao" <18602198...@163.com> wrote:

   Hi, Devs and Community Users
   
   
   To be more specific, our environment is built with
   * 2 Dell R740XD servers + Dell Compellent storage w/ iSCSI
   * Each server equipped with two Mellanox ConnectX-4 Lx 25GbE network 
adapters, configured in bond mode (active-active) in XenServer
   * CloudStack 4.11.2 LTS + XenServer 7.1 CU2 (LTS) Enterprise
   
   
   Everything works fine with a shared network, but the weird thing is that if 
we set up 2 isolated networks and use 'iperf', 'wget' or 'scp' to test the 
network performance between two VMs located in these 2 isolated networks, the 
traffic drops to zero in about 200-300 seconds, even though we are still able 
to ping or SSH VM B1 from A1 and vice versa.
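
   A minimal reproduction sketch (<VM_B1_IP> is a placeholder):

       # on VM B1
       iperf -s
       # on VM A1: long-running test with 10-second interval reports
       iperf -c <VM_B1_IP> -t 600 -i 10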
   
   
   VM A1 ---> VR A (Isolated Network A) ----> VR B (Isolated Network B) ----> VM B1
   
----------------------------------------------------------------------------------------------------------------------------------------
   We have checked the configuration on the switches and upgraded the Mellanox 
driver for XenServer, but no luck.
   Meanwhile, we cannot reproduce this issue in another environment (XenServer 
7.1 CU2 + ACS 4.11.2 + Intel GbE network).
   
   
   It seems it might be related to the Mellanox adapter, but we have no idea 
what we could possibly be missing in this case.
   
   
   Any advice would be highly appreciated!  Thank you!
   
   
   On 2019-02-22 at 13:09, "gu haven" <gumin...@hotmail.com> wrote:
   
   
   Hi all,
         When I try iperf, wget or scp, the connection breaks after about 200 
seconds. Is any optimization needed in the VR?
         
   environment information below:

   CloudStack 4.11.2

   XenServer 7.1 CU2 Enterprise

   NIC: Mellanox 25GbE 2P ConnectX-4 Lx

   bond mode in XenServer: active-active