Navdeep / List,

Can you help me understand what I am looking at here? I left LACP debug enabled until I finally saw the issue I noted before. Due to log rotation, part of the message is clipped. Here is a part of it; the full thing is on pastebin: https://pastebin.com/BGtbxcBf

2020-07-09T15:47:04.145885+00:00 ch1-c104-sdn02-mgmt kernel: sfxge0: lacp_sm_rx_timer: CURRENT -> EXPIRED
2020-07-09T15:47:04.145895+00:00 ch1-c104-sdn02-mgmt kernel: sfxge0: Interface stopped DISTRIBUTING, possible flapping
2020-07-09T15:47:04.145895+00:00 ch1-c104-sdn02-mgmt kernel: sfxge0: collecting enabled
2020-07-09T15:47:04.145896+00:00 ch1-c104-sdn02-mgmt kernel: sfxge0: disable distributing on aggregator [(8000,00-0F-53-69-7C-20,00F2,0000,0000),(8000,46-4C-A8-68-13-47,006A,0000,0000)], nports 2 -> 1
2020-07-09T15:47:04.145911+00:00 ch1-c104-sdn02-mgmt kernel: sfxge1: lacp_select_tx_port: waiting transit
2020-07-09T15:47:04.145912+00:00 ch1-c104-sdn02-mgmt kernel: marker transmit, port=6, sys=00:0f:53:69:7c:20, id=487
2020-07-09T15:47:04.145912+00:00 ch1-c104-sdn02-mgmt kernel: sfxge0: sfxge1: lacp_select_tx_port: waiting transit
2020-07-09T15:47:04.145913+00:00 ch1-c104-sdn02-mgmt kernel: marker transmit, port=5, sys=00:0f:53:69:7c:20, id=487
2020-07-09T15:47:04.145914+00:00 ch1-c104-sdn02-mgmt kernel: marker response, port=6, sys=00:0f:53:69:7c:20, id=487
2020-07-09T15:47:04.145914+00:00 ch1-c104-sdn02-mgmt kernel: [(8000,00-0F-53-69-7C-20,00F2,0000,0000),(8000,46-4C-A8-68-13-47,006A,0000,0000)], speed=10000000000, nports=1
2020-07-09T15:47:04.145922+00:00 ch1-c104-sdn02-mgmt kernel: sfxge0: lacp_select_tx_port: waiting transit
2020-07-09T15:47:04.145923+00:00 ch1-c104-sdn02-mgmt kernel: lacp_select_tx_port: waiting transit
2020-07-09T15:47:04.145923+00:00 ch1-c104-sdn02-mgmt kernel: active aggregator not changed
2020-07-09T15:47:04.145924+00:00 ch1-c104-sdn02-mgmt kernel: lacp_select_tx_port: waiting transit
2020-07-09T15:47:04.145925+00:00 ch1-c104-sdn02-mgmt kernel: marker response, port=5, sys=00:0f:53:69:7c:20, id=487
2020-07-09T15:47:04.145925+00:00 ch1-c104-sdn02-mgmt kernel: lacp_select_tx_port: waiting transit
2020-07-09T15:47:04.145926+00:00 ch1-c104-sdn02-mgmt kernel: new [(8000,00-0F-53-69-7C-20,00F2,0000,0000),(8000,46-4C-A8-68-13-47,006A,0000,0000)]
2020-07-09T15:47:04.145926+00:00 ch1-c104-sdn02-mgmt kernel: Set table 1 with 1 ports
2020-07-09T15:47:04.145927+00:00 ch1-c104-sdn02-mgmt kernel: sfxge0: mux_state 4 -> 3
2020-07-09T15:47:04.145928+00:00 ch1-c104-sdn02-mgmt kernel: sfxge0: collecting disabled
2020-07-09T15:47:04.145930+00:00 ch1-c104-sdn02-mgmt kernel: sfxge0: mux_state 3 -> 2
2020-07-09T15:47:04.145931+00:00 ch1-c104-sdn02-mgmt kernel: sfxge0: lacpdu transmit
2020-07-09T15:47:04.145931+00:00 ch1-c104-sdn02-mgmt kernel: actor=(8000,00-0F-53-69-7C-20,00F2,8000,0005)
2020-07-09T15:47:04.145932+00:00 ch1-c104-sdn02-mgmt kernel: actor.state=8d<ACTIVITY,AGGREGATION,SYNC,EXPIRED>
2020-07-09T15:47:04.145932+00:00 ch1-c104-sdn02-mgmt kernel: partner=(8000,46-4C-A8-68-13-47,006A,8000,0006)
2020-07-09T15:47:04.145933+00:00 ch1-c104-sdn02-mgmt kernel: partner.state=37<ACTIVITY,TIMEOUT,AGGREGATION,COLLECTING,DISTRIBUTING>
2020-07-09T15:47:04.145935+00:00 ch1-c104-sdn02-mgmt kernel: maxdelay=0
2020-07-09T15:47:04.145937+00:00 ch1-c104-sdn02-mgmt kernel: queue flush complete

---
Mark Saad
mark.s...@lucera.com

________________________________________
From: owner-freebsd-...@freebsd.org <owner-freebsd-...@freebsd.org> on behalf of Saad, Mark <mark.s...@lucera.com>
Sent: Tuesday, June 16, 2020 8:31 PM
To: Navdeep Parhar
Cc: Foster, Greg; freebsd-net@freebsd.org
Subject: Re: How to Increase TX Queue Priority for LACP Packets

Navdeep,

Thanks for getting back; I’ll do some digging. Back to the question about running with LACP debug on: does this put the NICs into promiscuous mode?

---
Mark Saad | mark.s...@lucera.com

> On Jun 16, 2020, at 8:13 PM, Navdeep Parhar <n...@freebsd.org> wrote:
>
> We could have a global knob that tells all NIC drivers to use a reserved
> queue for non-RSS traffic, but that would be advisory at best because
> the tx queue selection takes place inside the driver's (or iflib's)
> transmit routine. The meat of the change is going to be in iflib and
> all non-iflib drivers' if_transmit.
>
> Regards,
> Navdeep
>
>> On Tue, Jun 16, 2020 at 09:48:19PM +0000, Saad, Mark wrote:
>> All
>> Is there any way to make this change on other NICs like Intel ix and
>> Solarflare sfxge? I have seen similar issues on both with 12.1,
>> mainly with Solarflare NICs.
>>
>> ---
>> Mark Saad
>> mark.s...@lucera.com
>>
>>
>> ________________________________________
>> From: owner-freebsd-...@freebsd.org <owner-freebsd-...@freebsd.org> on
>> behalf of Foster, Greg <gfos...@panasas.com>
>> Sent: Tuesday, June 16, 2020 3:56 PM
>> To: Navdeep Parhar
>> Cc: freebsd-net@freebsd.org
>> Subject: RE: How to Increase TX Queue Priority for LACP Packets
>>
>> Hi Navdeep,
>>
>> Thanks for the information! I've integrated the changes and will be
>> testing more today.
>>
>> We have seen the LACP port flapping under different scenarios, most we
>> believe are traffic/load based.
>>
>> I did see the flapping unexpectedly when I just enabled LACP debug
>> (e.g., sysctl net.link.lagg.lacp.debug=1). Is this a known
>> problem?
>>
>> Thanks
>> Greg
>>
>> -----Original Message-----
>> From: Navdeep Parhar <npar...@gmail.com> On Behalf Of Navdeep Parhar
>> Sent: Friday, June 12, 2020 7:51 PM
>> To: Foster, Greg <gfos...@panasas.com>
>> Cc: freebsd-net@freebsd.org
>> Subject: Re: How to Increase TX Queue Priority for LACP Packets
>>
>>> On Fri, Jun 12, 2020 at 11:47:41PM +0000, Foster, Greg wrote:
>>> FreeBSD Networkers,
>>>
>>> We are seeing LACP port flapping on our FreeBSD 10.4/12.1 systems
>>> under different conditions.
>>>
>>> Can someone explain or point me to the information on how to queue
>>> the LACP packets to a higher priority queue?
>>>
>>> We are using the Chelsio T580-LP-CR adapter/cxgbe driver. The
>>> Chelsio NICs have 8 TX/RX queues each, but I don't know how to
>>> explicitly put the LACP packets in the higher priority TX queue.
>>>
>>> I've read about PF/ALTQ and think this may be overkill for our needs,
>>> and was wondering if there was a simpler method.
>>
>> This is cxgbe specific but that's what you're using so it'll do.
>>
>> Add "hw.cxgbe.rsrv_noflowq=1" to your /boot/loader.conf. That
>> reserves one tx queue for non-RSS traffic (like ARP, LACP). You might
>> also want to increase the number of tx queues to compensate for the
>> one that's now reserved. Use "hw.cxgbe.ntxq=9" for that. The ntxq
>> knob might be different on 10.4 but the man page matching the driver
>> should have its exact name.
>>
>> Regards,
>> Navdeep
>>
>> _______________________________________________
>> freebsd-net@freebsd.org mailing list
>> https://lists.freebsd.org/mailman/listinfo/freebsd-net
>> To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
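
For anyone reading the actor.state and partner.state values in the debug output at the top of this thread: the hex number is the LACP state octet, and the names in angle brackets are the bits that are set. Below is a rough sketch of that decoding in Python, assuming the standard IEEE 802.1AX bit layout. The helper name decode_lacp_state is made up for illustration and is not taken from the lagg(4) code.

```python
# Bit masks of the LACP state octet, assuming the IEEE 802.1AX layout.
# Names are spelled the way the debug output prints them.
LACP_STATE_BITS = [
    (0x01, "ACTIVITY"),
    (0x02, "TIMEOUT"),
    (0x04, "AGGREGATION"),
    (0x08, "SYNC"),
    (0x10, "COLLECTING"),
    (0x20, "DISTRIBUTING"),
    (0x40, "DEFAULTED"),
    (0x80, "EXPIRED"),
]


def decode_lacp_state(value):
    """Return the names of the flags set in a LACP state octet.

    Hypothetical helper for illustration only.
    """
    return [name for mask, name in LACP_STATE_BITS if value & mask]


if __name__ == "__main__":
    # Values copied from the log excerpt in this thread.
    for label, value in (("actor", 0x8D), ("partner", 0x37)):
        print(f"{label}.state={value:02x}<{','.join(decode_lacp_state(value))}>")
```

With the two values from the log, this reproduces the flag lists the kernel printed: the actor has EXPIRED set and no longer has COLLECTING or DISTRIBUTING, and the recorded partner state has TIMEOUT set but not SYNC.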