https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211990
--- Comment #4 from Ben RUBSON <ben.rub...@gmail.com> ---
One strange thing I noticed. (I am including everything from my troubleshooting that could be of interest.)

As soon as I bring the network interface down, I get the following message on the target side, one per target:

17:01:00 srv2 kernel: WARNING: 192.168.2.1 (iqn.1994-09.org.freebsd:srv1): no ping reply (NOP-Out) after 5 seconds; dropping connection

Then, on the initiator side, I get these messages for each target:

Aug 19 17:01:07 srv1 kernel: iscsi_maintenance_thread_reconnect: 192.168.2.2 (iqn.2012-06.srv2:hm4): connection failed, destroying devices
Aug 19 17:01:07 srv1 kernel: iscsi_session_cleanup: 192.168.2.2 (iqn.2012-06.srv2:hm4): freezing
Aug 19 17:01:07 srv1 kernel: iscsi_session_cleanup: 192.168.2.2 (iqn.2012-06.srv2:hm4): deregistering SIM

At this moment, on the initiator side, one iscsid process per target appears.

10 seconds later, on the initiator side, I get these messages for each target:

Aug 19 17:01:18 srv1 kernel: WARNING: 192.168.2.2 (iqn.2012-06.srv2:hm4): login timed out after 11 seconds; reconnecting
Aug 19 17:01:18 srv1 kernel: iscsi_maintenance_thread_reconnect: 192.168.2.2 (iqn.2012-06.srv2:hm4): connection failed, destroying devices

At the same time, a second iscsid process per target appears, so that I get 2 iscsid processes per target:

# ps auxxw | grep iscsid:
root 866 0.0 0.0 16632 2144 - I 4:58pm 0:00.00 iscsid: 192.168.2.2 (iqn.2012-06.srv2:hm4) (iscsid)
root 881 0.0 0.0 16632 2144 - I 4:58pm 0:00.00 iscsid: 192.168.2.2 (iqn.2012-06.srv2:hm4) (iscsid)
(...)

However, there seems to be a limit of 30 processes: for 17 targets I would have expected 34 processes, but I only get 30.

If I bring the NIC up before the second process is created, I only get one reconnection message per target in the target logs. If I bring the NIC up after the second process is created, I get many more reconnection messages in the target logs, between 40 and 50 for 17 targets.

Do we expect these additional processes?
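To make the per-target process count easier to check, the ps output can be tallied by target. The following is only a minimal sketch: the function name count_iscsid_per_target and the regular expression are hypothetical helpers written for this illustration, and they assume the process title format shown in the ps listing ("iscsid: <portal> (<target IQN>)").

```python
import re
from collections import Counter

# Assumed process title format, taken from the ps listing in this report:
#   iscsid: 192.168.2.2 (iqn.2012-06.srv2:hm4) (iscsid)
TITLE_RE = re.compile(r"iscsid: (\S+) \((\S+?)\)")


def count_iscsid_per_target(ps_output: str) -> Counter:
    """Count iscsid processes per (portal, target) pair in ps output text."""
    counts = Counter()
    for line in ps_output.splitlines():
        match = TITLE_RE.search(line)
        if match:
            portal, target = match.groups()
            counts[(portal, target)] += 1
    return counts


# The two lines quoted in the report, used here as sample input.
SAMPLE = """\
root 866 0.0 0.0 16632 2144 - I 4:58pm 0:00.00 iscsid: 192.168.2.2 (iqn.2012-06.srv2:hm4) (iscsid)
root 881 0.0 0.0 16632 2144 - I 4:58pm 0:00.00 iscsid: 192.168.2.2 (iqn.2012-06.srv2:hm4) (iscsid)
"""

if __name__ == "__main__":
    counts = count_iscsid_per_target(SAMPLE)
    for (portal, target), n in sorted(counts.items()):
        note = "  (more than one process)" if n > 1 else ""
        print(f"{portal} {target}: {n}{note}")
    print("total iscsid processes:", sum(counts.values()))
```

Fed the full output of the ps command, the total would show whether the count stops at 30, and the per-target lines would show which targets have a second process.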
I think we would only expect one process and one reconnection message per target? It seems strange to have all these "duplicated" connection retries.

Another question, related to the 30 processes found: is there a limit of 30 targets? I found a maxproc option in ctl.conf (default 30), but I don't know exactly what it means (I tested values from 1 to 50 without seeing any change). I found no such option on the initiator side, however.

I noticed that the bug is easier to reproduce when we "stress" the devices: disconnect the network as soon as the targets are reconnected, and reconnect it as soon as they are disconnected.

In addition, I had 8 kernel crashes, on initiator or target, each time with the same address / pointer:

kernel: Fatal trap 12: page fault while in kernel mode
kernel: fault virtual address = 0x1e8
kernel: instruction pointer = 0x20:0xffffffff80936933

I also got a stack trace, but did not get its pointer address.
http://img4.hostingpics.net/pics/707217211990.png

I'm also trying to get a full dump. However, I'm not sure this kernel crash issue is related to the reconnection issue; perhaps there are 2 issues.

# uname -v
FreeBSD 10.3-RELEASE-p7 #0: Thu Aug 11 18:38:15 UTC 2016 r...@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC

A lot of info! I hope we will be able to correct these issues.

Many thanks,

Ben

--
freebsd-bugs@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-bugs