Re: Route messages

2008-06-27 Thread mike
On Sun, 15 Jun 2008 11:16:17 +0100, in sentex.lists.freebsd.net you
wrote:

>Paul wrote:
>> Get these with GRE tunnel on
>> FreeBSD 7.0-STABLE FreeBSD 7.0-STABLE #5: Sun May 11 19:00:57 EDT 
>> 2008 :/usr/obj/usr/src/sys/ROUTER  amd64
>> But do not get them with 7.0-RELEASE
>>
>> Any ideas what changed? :)  Wish there was some sort of changelog..
>> # of messages per second seems consistent with packets per second on 
>> GRE interface..
>> No impact in routing, but definitely impact in cpu usage for all 
>> processes monitoring the route messages.
>
>RTM_MISS is actually fairly common when you don't have a default route.
>

Hi,
I am seeing this issue as well on a pair of  recently deployed
boxes, one  running MPD and one acting as an area router in front of
it. The MPD box has a default route and only has 400 routes or so.

A steady stream of those messages, upwards of 500 per second. 

got message of size 96 on Fri Jun 27 22:25:42 2008
RTM_MISS: Lookup failed on this address: len 96, pid: 0, seq 0, errno
0, flags:
locks:  inits: 
sockaddrs: 
 default

got message of size 96 on Fri Jun 27 22:25:42 2008
RTM_MISS: Lookup failed on this address: len 96, pid: 0, seq 0, errno
0, flags:
locks:  inits: 
sockaddrs: 
 default

Is there a way to track down what is generating those messages? It's
eating up a fair bit of CPU with quagga (the zebra process
specifically).

---Mike
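
To put a number on the "upwards of 500 per second" figure, the log can be bucketed by timestamp. The sketch below is only an illustration: it assumes the text was captured from route(8)'s monitor output in exactly the format quoted above, and the function name messages_per_second is made up for this example. It measures the rate of RTM_MISS messages; it does not identify what generates them.

```python
import re
from collections import Counter
from datetime import datetime

# Header line printed before each routing message, as in the log above.
HEADER = re.compile(r"^got message of size \d+ on (.+)$")


def messages_per_second(lines, msg_type="RTM_MISS"):
    """Return a Counter mapping each one-second timestamp to the number
    of messages of msg_type logged in that second."""
    counts = Counter()
    stamp = None
    for raw in lines:
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            # Timestamp format assumed from the quoted log.
            stamp = datetime.strptime(header.group(1), "%a %b %d %H:%M:%S %Y")
        elif stamp is not None and line.startswith(msg_type + ":"):
            counts[stamp] += 1
            stamp = None  # count each message only once
    return counts


SAMPLE = """\
got message of size 96 on Fri Jun 27 22:25:42 2008
RTM_MISS: Lookup failed on this address: len 96, pid: 0, seq 0, errno
0, flags:
locks:  inits:
sockaddrs:
 default

got message of size 96 on Fri Jun 27 22:25:42 2008
RTM_MISS: Lookup failed on this address: len 96, pid: 0, seq 0, errno
0, flags:
locks:  inits:
sockaddrs:
 default
"""

if __name__ == "__main__":
    for stamp, n in sorted(messages_per_second(SAMPLE.splitlines()).items()):
        print(stamp.isoformat(), n)
```

Feeding it a longer capture gives one count per second, which can then be compared against the packet rate on the tunnel interface.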
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: Troubles with em on FreeBSD 7

2008-05-04 Thread mike
On Sat, 03 May 2008 18:28:55 +0300, in sentex.lists.freebsd.net you
wrote:

>Hi!
>
>I'm running a SMP FreeBSD box with mpd5 on it.
>
># uname -a
>FreeBSD xxx.x.xxx 7.0-STABLE FreeBSD 7.0-STABLE #0: Sat May  3 
>12:40:02 EEST 2008 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/ 
>  amd64
>
># mpd5 -v
>Version 5.1 ([EMAIL PROTECTED] 09:53  1-May-2008)
>
>Somehow em0 begins to eat all CPU time of one core.
>

A new version of the em drivers went into the tree Friday.

>dev.em.0.%desc: Intel(R) PRO/1000 Network Connection Version - 6.7.3

dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 6.9.0
dev.em.0.%driver: em
dev.em.0.%location: slot=0 function=0
dev.em.0.%pnpinfo: vendor=0x8086 device=0x108c subvendor=0x15d9
subdevice=0x108c class=0x02
dev.em.0.%parent: pci13
dev.em.0.debug: -1
dev.em.0.stats: -1
dev.em.0.rx_int_delay: 0
dev.em.0.tx_int_delay: 66
dev.em.0.rx_abs_int_delay: 66
dev.em.0.tx_abs_int_delay: 66
dev.em.0.rx_processing_limit: 100

Also, post some of the stats.  Run
sysctl -w dev.em.1.stats=1
for each of your em NICs (adjusting the unit number).

em1: Excessive collisions = 0
em1: Sequence errors = 0
em1: Defer count = 0
em1: Missed Packets = 0
em1: Receive No Buffers = 0
em1: Receive Length Errors = 0
em1: Receive errors = 0
em1: Crc errors = 0
em1: Alignment errors = 0
em1: Collision/Carrier extension errors = 0
em1: RX overruns = 0
em1: watchdog timeouts = 0
em1: RX MSIX IRQ = 0 TX MSIX IRQ = 0 LINK MSIX IRQ = 0
em1: XON Rcvd = 0
em1: XON Xmtd = 0
em1: XOFF Rcvd = 0
em1: XOFF Xmtd = 0
em1: Good Packets Rcvd = 71949
em1: Good Packets Xmtd = 2507
em1: TSO Contexts Xmtd = 369
em1: TSO Contexts Failed = 0

Also, are you using gigabit or Fast Ethernet? If Fast Ethernet, try
disabling TSO, as some people have reported problems with it at 100 Mb/s.
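
When several NICs dump their stats at once, it helps to filter for the counters that are not zero. The following is a rough sketch that assumes the "emN: name = value" layout shown above; the function name and the list of counters treated as harmless are assumptions made for this example, and the sample values are invented.

```python
import re

# One statistic per line, e.g. "em1: Missed Packets = 0".
STAT = re.compile(r"^(em\d+): (.+?) = (\d+)$")

# Counters expected to be nonzero on a healthy link (an assumption).
HARMLESS_PREFIXES = ("Good Packets", "TSO Contexts Xmtd")


def nonzero_error_counters(text):
    """Return (nic, counter, value) for every nonzero counter that is
    not in the harmless list."""
    hits = []
    for raw in text.splitlines():
        m = STAT.match(raw.strip())
        if not m:
            continue
        nic, name, value = m.group(1), m.group(2), int(m.group(3))
        if value and not name.startswith(HARMLESS_PREFIXES):
            hits.append((nic, name, value))
    return hits


# Invented sample values, for illustration only.
SAMPLE = """\
em1: Missed Packets = 12
em1: RX overruns = 0
em1: Good Packets Rcvd = 71949
em1: TSO Contexts Xmtd = 369
"""

if __name__ == "__main__":
    for nic, name, value in nonzero_error_counters(SAMPLE):
        print(f"{nic}: {name} = {value}")
```

In the dump quoted above every error counter is zero, so this filter would print nothing for em1.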

---Mike


translate from iptables

2001-03-15 Thread Mike

 Hi all,

 I'm trying to move everything from my RH Linux box to FreeBSD. So far
everything is great on BSD and I'm very glad I switched. The only thing I
have left to get functioning is Bnetd, the starcraft/diablo server. I have
it "working" but there is a problem with people playing behind the firewall
with people on the outside. To make a long story short, the following
snippet is from the iptables firewall on my Linux box. It is what solved the
problem and got everything working great:
**
iptables -t nat -A PREROUTING -p udp -d 216.238.130.251 --dport 63010 -j
DNAT --to-destination 192.168.0.2:6112
iptables -t nat -A PREROUTING -p udp -d 216.238.130.251 --dport 63011 -j
DNAT --to-destination 192.168.0.3:6112
iptables -t nat -A PREROUTING -p udp -d 216.238.130.251 --dport 63012 -j
DNAT --to-destination 192.168.0.4:6112

iptables -t nat -A POSTROUTING -p udp -s 192.168.0.2 --sport 6112 -j
SNAT --to-source 216.238.130.251:63010
iptables -t nat -A POSTROUTING -p udp -s 192.168.0.3 --sport 6112 -j
SNAT --to-source 216.238.130.251:63011
iptables -t nat -A POSTROUTING -p udp -s 192.168.0.4 --sport 6112 -j
SNAT --to-source 216.238.130.251:63012
***
  What I'm hoping is that someone will be able to tell me a way to do this
same thing using natd or ipfwd or something like that. Any hints or help
would be much appreciated:)

Mike


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-net" in the body of the message



Re: translate from iptables

2001-03-16 Thread Mike

Thanks, I'll see what I can find out from there:)

Mike


> On Fri, Mar 16, 2001 at 01:53:25AM -0500, Mike wrote:
> >   What I'm hoping is that someone will be able to tell me a way to do this
> > same thing using natd or ipfwd or something like that. Any hints or help
> > would be much appreciated:)
>
> ipfwd isn't what you want, natd is. read the man page on natd (there
> is an installation guide in the page) and read the parts of the
> configuration for "-redirect_port".
>
> --
> Bill Fumerola - security yahoo / Yahoo! inc.
>   - [EMAIL PROTECTED] / [EMAIL PROTECTED]
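
As a starting point for the natd route suggested above, the inbound DNAT rules map fairly directly onto natd's redirect_port option, which takes the protocol, the internal target address and port, and then the public (alias) address and port. The sketch below rewrites the three PREROUTING rules from the original question into natd.conf-style lines. The helper name is hypothetical, and whether redirect_port alone also reproduces the POSTROUTING SNAT behaviour is an assumption that should be checked against natd(8).

```python
import shlex


def dnat_to_redirect_port(rule):
    """Turn one iptables DNAT rule into a natd.conf redirect_port line."""
    tokens = shlex.split(rule)

    def option(flag):
        # Value that follows a given iptables flag.
        return tokens[tokens.index(flag) + 1]

    if option("-j") != "DNAT":
        raise ValueError("not a DNAT rule: " + rule)
    proto = option("-p")
    public = f'{option("-d")}:{option("--dport")}'
    target = option("--to-destination")
    # natd order: protocol, internal target, then public address:port.
    return f"redirect_port {proto} {target} {public}"


RULES = [
    "iptables -t nat -A PREROUTING -p udp -d 216.238.130.251 --dport 63010 "
    "-j DNAT --to-destination 192.168.0.2:6112",
    "iptables -t nat -A PREROUTING -p udp -d 216.238.130.251 --dport 63011 "
    "-j DNAT --to-destination 192.168.0.3:6112",
    "iptables -t nat -A PREROUTING -p udp -d 216.238.130.251 --dport 63012 "
    "-j DNAT --to-destination 192.168.0.4:6112",
]

if __name__ == "__main__":
    for rule in RULES:
        print(dnat_to_redirect_port(rule))
```

Each printed line could be placed in the natd configuration file or passed on the command line with a leading dash.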






[Differential] [Changed Subscribers] D1965: Add extended media types to if_media.h and ifconfig

2015-03-01 Thread mike-karels.net (Mike Karels)
mike-karels.net added a subscriber: mike-karels.net.

BRANCH
  /head

REVISION DETAIL
  https://reviews.freebsd.org/D1965

To: erj, adrian, jfvogel, gnn
Cc: mike-karels.net, glebius, freebsd-net


[Differential] [Commented On] D1965: Add extended media types to if_media.h and ifconfig

2015-03-01 Thread mike-karels.net (Mike Karels)
mike-karels.net added a comment.

>>! In D1965#11, @gnn wrote:
> BTW Mike Karels was in favor of this in an email thread.  He's not yet on 
> phabricator but I'll ask him here as well.

Agreed, with minor exceptions. Given the ixl changes, I don't think the vtnet 
example is worth keeping, and I think that the VFAST/V.fast entry should be 
removed and other subtypes moved down.

BRANCH
  /head

REVISION DETAIL
  https://reviews.freebsd.org/D1965

To: erj, adrian, jfvogel, gnn
Cc: mike-karels.net, glebius, freebsd-net


[Differential] [Commented On] D5872: tcp: Don't prematurely drop receiving-only connections

2016-04-15 Thread mike-karels.net (Mike Karels)
mike-karels.net added a comment.


  I believe that the original code is wrong, and the change is not sufficient
  to fix it.  The retransmit timer should be running if and only if we have
  sent data into the receive window and are awaiting an ACK.  The persist
  timer should be running if and only if the retransmit timer is not running,
  and we have data to send that do not reasonably fit in the window.  If we
  get an ENOBUFS when sending data, we will already be running the retransmit
  timer.  If we drop an ACK on ENOBUFS, either we will receive more data and
  attempt another ACK, or the sender will time out and resend data.  Either
  will get the connection started again.  I believe lines 1552-1554 should
  simply be deleted.
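
  The timer rule stated in this comment can be written down as a small
  decision function. This is only a sketch of the invariant as described here,
  not the FreeBSD tcp_output code: the SendState fields are hypothetical, and
  "do not reasonably fit in the window" is simplified to a plain size
  comparison.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SendState:
    unacked_bytes: int  # sent into the window, ACK still awaited
    pending_bytes: int  # queued by the application, not yet sent
    window_bytes: int   # usable send window advertised by the peer


def expected_timer(state: SendState) -> Optional[str]:
    """Return which timer should be running under the stated invariant."""
    # Retransmit timer: if and only if sent data awaits an ACK.
    if state.unacked_bytes > 0:
        return "retransmit"
    # Persist timer: retransmit timer not running, and there is data
    # to send that does not fit in the window (simplified comparison).
    if state.pending_bytes > 0 and state.pending_bytes > state.window_bytes:
        return "persist"
    # Nothing outstanding and nothing blocked: no timer at all.
    return None


if __name__ == "__main__":
    print(expected_timer(SendState(1460, 0, 65535)))  # data in flight
    print(expected_timer(SendState(0, 4096, 0)))      # blocked by zero window
    print(expected_timer(SendState(0, 0, 65535)))     # receiving-only
```

  Under this rule a receiving-only connection, which sends nothing but ACKs,
  runs neither timer, which matches the argument that a dropped ACK does not
  need a local timer.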

REVISION DETAIL
  https://reviews.freebsd.org/D5872

EMAIL PREFERENCES
  https://reviews.freebsd.org/settings/panel/emailpreferences/

To: sepherosa_gmail.com, network, glebius, lstewart, adrian, delphij, 
decui_microsoft.com, honzhan_microsoft.com, howard0su_gmail.com, 
freebsd-net-list, transport, jtl, hiren
Cc: mike-karels.net, jtl


[Differential] [Commented On] D5872: tcp: Don't prematurely drop receiving-only connections

2016-04-16 Thread mike-karels.net (Mike Karels)
mike-karels.net added a comment.


  Setting a retransmission timer on an ACK makes no sense; I don't think 
tcp_output will send an ACK on a retransmission timeout.
  
  Setting timers in the ENOBUFS case is at best a partial fix.  If the ACK is 
lost locally, we know; if it is lost elsewhere, we don't. We need timers in any 
case.
  
  Setting a local timer for the ACK case is no better for latency than having 
the other end run a retransmit timer.
  
  If there is a problem with setting the retransmit timer for a FIN, let's fix 
that.  Otherwise, I stand by my recommendation of deleting the code to set a 
timer.

REVISION DETAIL
  https://reviews.freebsd.org/D5872


To: sepherosa_gmail.com, network, glebius, lstewart, adrian, delphij, 
decui_microsoft.com, honzhan_microsoft.com, howard0su_gmail.com, 
freebsd-net-list, transport, jtl, hiren
Cc: mike-karels.net, jtl


[Differential] D5872: tcp: Don't prematurely drop receiving-only connections

2016-04-20 Thread mike-karels.net (Mike Karels)
mike-karels.net added a comment.


  btw, I think the line to set the snd_cwnd should remain for now, until 
something replaces it.  ENOBUFS signals local congestion.

REVISION DETAIL
  https://reviews.freebsd.org/D5872


To: sepherosa_gmail.com, network, glebius, adrian, delphij, 
decui_microsoft.com, honzhan_microsoft.com, howard0su_gmail.com, 
freebsd-net-list, transport, jtl, hiren, lstewart
Cc: gnn, mike-karels.net, jtl


[Differential] D5872: tcp: Don't prematurely drop receiving-only connections

2016-04-27 Thread mike-karels.net (Mike Karels)
mike-karels.net added a comment.


  I disagree; congestion is congestion, not "congestion for everyone but me".  
I'd prefer to leave the cwnd change until it is replaced by something more 
modern.

REVISION DETAIL
  https://reviews.freebsd.org/D5872


To: sepherosa_gmail.com, network, glebius, adrian, delphij, 
decui_microsoft.com, honzhan_microsoft.com, howard0su_gmail.com, 
freebsd-net-list, lstewart, hiren, jtl, transport
Cc: gnn, mike-karels.net, jtl


Poor performance with stable/13 and Mellanox ConnectX-6 (mlx5)

2022-06-13 Thread Mike Jakubik
tes

[  5]   2.00-3.00   sec   880 MBytes  7.38 Gbits/sec    0    785 KBytes
[  5]   3.00-4.00   sec   734 MBytes  6.16 Gbits/sec    0    804 KBytes
[  5]   4.00-5.00   sec   777 MBytes  6.52 Gbits/sec    0    824 KBytes
[  5]   5.00-6.00   sec   719 MBytes  6.03 Gbits/sec    0    841 KBytes
[  5]   6.00-7.00   sec   865 MBytes  7.26 Gbits/sec    0    862 KBytes
[  5]   7.00-8.00   sec   880 MBytes  7.38 Gbits/sec    0    882 KBytes
[  5]   8.00-9.00   sec   906 MBytes  7.60 Gbits/sec    0    904 KBytes
[  5]   9.00-10.00  sec   749 MBytes  6.29 Gbits/sec    0    921 KBytes
[  5]  10.00-11.00  sec   798 MBytes  6.69 Gbits/sec    0    938 KBytes
[  5]  11.00-12.00  sec   746 MBytes  6.26 Gbits/sec  209    772 KBytes
[  5]  12.00-13.00  sec   768 MBytes  6.44 Gbits/sec   35    644 KBytes
[  5]  13.00-14.00  sec   948 MBytes  7.95 Gbits/sec    0    673 KBytes
[  5]  14.00-15.00  sec  1.23 GBytes  10.5 Gbits/sec    0    711 KBytes
[  5]  15.00-16.00  sec  1.32 GBytes  11.4 Gbits/sec    0    748 KBytes
[  5]  16.00-17.00  sec  1.31 GBytes  11.2 Gbits/sec    0    785 KBytes
[  5]  17.00-18.00  sec  1.29 GBytes  11.1 Gbits/sec    0    819 KBytes
[  5]  18.00-19.00  sec  1.30 GBytes  11.2 Gbits/sec    0    852 KBytes
[  5]  19.00-20.00  sec  1.34 GBytes  11.5 Gbits/sec    0    883 KBytes
[  5]  20.00-21.00  sec  1.29 GBytes  11.1 Gbits/sec    0    914 KBytes
[  5]  21.00-22.00  sec  1.36 GBytes  11.7 Gbits/sec    0    944 KBytes
[  5]  22.00-23.00  sec  1.33 GBytes  11.4 Gbits/sec    0    974 KBytes
[  5]  23.00-24.00  sec  1.31 GBytes  11.2 Gbits/sec    0   1003 KBytes
[  5]  24.00-25.00  sec  1.30 GBytes  11.2 Gbits/sec    0   1.00 MBytes
[  5]  25.00-26.00  sec  1.34 GBytes  11.5 Gbits/sec    0   1.03 MBytes
[  5]  26.00-27.00  sec  1.32 GBytes  11.3 Gbits/sec    0   1.06 MBytes
[  5]  27.00-28.00  sec   957 MBytes  8.03 Gbits/sec    0   1.07 MBytes
[  5]  28.00-29.00  sec   837 MBytes  7.02 Gbits/sec    0   1.09 MBytes
[  5]  29.00-30.00  sec   729 MBytes  6.11 Gbits/sec    0   1.10 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-30.00  sec  30.6 GBytes  8.77 Gbits/sec  244             sender
[  5]   0.00-30.00  sec  30.6 GBytes  8.77 Gbits/sec                  receiver







More data can be found @ 
https://forums.freebsd.org/threads/poor-performance-with-stable-13-and-mellanox-connectx-6-mlx5.85460/






Mike Jakubik

https://www.swiftsmsgateway.com/



Disclaimer: This e-mail and any attachments are intended only for the use of 
the addressee(s) and may contain information that is privileged or 
confidential. If you are not the intended recipient, or responsible for 
delivering the information to the intended recipient, you are hereby notified 
that any dissemination, distribution, printing or copying of this e-mail and 
any attachments is strictly prohibited. If this e-mail and any attachments were 
received in error, please notify the sender by reply e-mail and delete the 
original message.

Re: Poor performance with stable/13 and Mellanox ConnectX-6 (mlx5)

2022-06-13 Thread Mike Jakubik
Hi,



No, I do not see any retransmissions in Linux (see the forum URL for 
screenshots), so I do not think this is a hardware issue. I don't think these 
cards have flow control on them. I also do not see any errors, drops, or 
collisions in netstat -i. It's like the network stack doesn't know what to do 
initially; it sometimes seems to even out after a few seconds, see below. In 
Linux I get 14.6Gb instantly and it stays that way, with zero retries.



[root@db-02 ~]# iperf3 -i 1 -t 30 -c db-01
Connecting to host db-01, port 5201
[  5] local 192.168.10.31 port 42022 connected to 192.168.10.30 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   623 MBytes  5.23 Gbits/sec  171    640 KBytes
[  5]   1.00-2.00   sec   613 MBytes  5.14 Gbits/sec  135    543 KBytes
[  5]   2.00-3.00   sec   662 MBytes  5.55 Gbits/sec  107    471 KBytes
[  5]   3.00-4.00   sec   718 MBytes  6.02 Gbits/sec   32    350 KBytes
[  5]   4.00-5.00   sec   709 MBytes  5.95 Gbits/sec   28    685 KBytes
[  5]   5.00-6.00   sec   713 MBytes  5.98 Gbits/sec   39    603 KBytes
[  5]   6.00-7.00   sec   704 MBytes  5.91 Gbits/sec   95    540 KBytes
[  5]   7.00-8.00   sec   716 MBytes  6.01 Gbits/sec   49    466 KBytes
[  5]   8.00-9.00   sec   722 MBytes  6.06 Gbits/sec  132    752 KBytes
[  5]   9.00-10.00  sec   720 MBytes  6.04 Gbits/sec   19    649 KBytes
[  5]  10.00-11.00  sec   720 MBytes  6.04 Gbits/sec  267    474 KBytes
[  5]  11.00-12.00  sec   675 MBytes  5.65 Gbits/sec  138   1.16 MBytes
[  5]  12.00-13.00  sec  1.04 GBytes  8.96 Gbits/sec  118   1.22 MBytes
[  5]  13.00-14.00  sec  1.29 GBytes  11.1 Gbits/sec    0   1.29 MBytes
[  5]  14.00-15.00  sec  1.29 GBytes  11.1 Gbits/sec    0   1.31 MBytes
[  5]  15.00-16.00  sec  1.29 GBytes  11.1 Gbits/sec    0   1.34 MBytes
[  5]  16.00-17.00  sec  1.29 GBytes  11.1 Gbits/sec    0   1.34 MBytes
[  5]  17.00-18.00  sec  1.29 GBytes  11.1 Gbits/sec    0   1.36 MBytes
[  5]  18.00-19.00  sec  1.29 GBytes  11.1 Gbits/sec    0   1.36 MBytes
[  5]  19.00-20.00  sec  1.29 GBytes  11.1 Gbits/sec    0   1.37 MBytes
[  5]  20.00-21.00  sec  1.29 GBytes  11.1 Gbits/sec    0   1.39 MBytes
[  5]  21.00-22.00  sec  1.29 GBytes  11.1 Gbits/sec    0   1.40 MBytes
[  5]  22.00-23.00  sec  1.29 GBytes  11.1 Gbits/sec    0   1.41 MBytes
[  5]  23.00-24.00  sec  1.29 GBytes  11.1 Gbits/sec    0   1.41 MBytes
[  5]  24.00-25.00  sec  1.29 GBytes  11.1 Gbits/sec    0   1.42 MBytes
[  5]  25.00-26.00  sec  1.29 GBytes  11.1 Gbits/sec    0   1.44 MBytes
[  5]  26.00-27.00  sec  1.29 GBytes  11.1 Gbits/sec    0   1.44 MBytes
[  5]  27.00-28.00  sec  1.29 GBytes  11.1 Gbits/sec    0   1.44 MBytes
[  5]  28.00-29.00  sec  1.29 GBytes  11.1 Gbits/sec    0   1.45 MBytes
[  5]  29.00-30.00  sec  1.29 GBytes  11.1 Gbits/sec    0   1.46 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-30.00  sec  31.1 GBytes  8.91 Gbits/sec  1330             sender
[  5]   0.00-30.00  sec  31.1 GBytes  8.91 Gbits/sec                  receiver
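
Comparing runs like the one above by eye is tedious, so a small parser can reduce each run to a few numbers. The sketch below assumes the per-interval row layout shown in this thread (interval, transfer, bitrate, Retr, Cwnd); the function name summarize is made up for this example. Recent iperf3 versions can also emit JSON, which would be more robust than scraping text.

```python
import re

# Per-interval rows only: they end with a Retr count and a Cwnd value.
# The final sender/receiver summary rows do not match this pattern.
ROW = re.compile(
    r"\[\s*\d+\]\s+\d+\.\d+-\d+\.\d+\s+sec\s+"
    r"[\d.]+\s+[KMG]Bytes\s+([\d.]+)\s+([KMG])bits/sec\s+(\d+)\s+"
    r"[\d.]+\s+[KMG]Bytes"
)
TO_GBITS = {"K": 1e-6, "M": 1e-3, "G": 1.0}


def summarize(text):
    """Return interval count, min/max/mean bitrate in Gbit/s, and the
    total number of retransmits across all per-interval rows."""
    rates, retransmits = [], 0
    for m in ROW.finditer(text):
        rates.append(float(m.group(1)) * TO_GBITS[m.group(2)])
        retransmits += int(m.group(3))
    if not rates:
        raise ValueError("no per-interval rows found")
    return {
        "intervals": len(rates),
        "min": min(rates),
        "max": max(rates),
        "mean": sum(rates) / len(rates),
        "retr": retransmits,
    }


# First three rows of the run quoted above.
SAMPLE = """\
[  5]   0.00-1.00   sec   623 MBytes  5.23 Gbits/sec  171    640 KBytes
[  5]   1.00-2.00   sec   613 MBytes  5.14 Gbits/sec  135    543 KBytes
[  5]   2.00-3.00   sec   662 MBytes  5.55 Gbits/sec  107    471 KBytes
"""

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

The spread between min and max, together with the retransmit total, makes the difference between a "good" and a "bad" run easier to state in a bug report.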





Thanks.







 On Mon, 13 Jun 2022 14:41:05 -0400 Santiago Martinez 
<mailto:s...@codenetworks.net> wrote ----








Mike Jakubik

https://www.swiftsmsgateway.com/











Hi there, there are a lot of retransmissions there... do you see
  the same with Linux? 

Are you seeing any drops or error counters increasing on the
  switch side? 

Have you checked the sysctls for the card? I have never used Mellanox,
  but I'm pretty sure people here can help you.

You can also try disabling flow control.

Hope it helps.

Santi



On 6/13/22 20:25, Mike Jakubik wrote:




Hello,



I have two new servers with a Mellanox ConnectX-6 card
  linked at 25Gb/s, however, I am unable to get much more
  than 6Gb/s when testing with iperf3.



The servers are Lenovo SR665 (2 x AMD EPYC 7443 24-Core
  Processor, 256 GB RAM, Mellanox ConnectX-6 Lx 10/25GbE
  SFP28 2-port OCP Ethernet Adapter)



They are connected to a Dell N3224PX-ON switch. Both
  servers are idle and not in use, with a fresh install
  of stable/13-ebea872f8, nothing running on them except
  ssh, sendmail, etc.



When I test with iperf3 I am unable to get a higher avg
  than about 6Gb/s. I have tried 

Re: Poor performance with stable/13 and Mellanox ConnectX-6 (mlx5)

2022-06-14 Thread Mike Jakubik
Yes, it is the default of 1500. If I set it to 9000 I get some bizarre network 
behavior.





 On Tue, 14 Jun 2022 09:45:10 -0400 Andrey V. Elsukov 
<mailto:bu7c...@yandex.ru> wrote 






Hi, 
 
Do you have the same MTU size on the Linux machine? 
 






Mike Jakubik

https://www.swiftsmsgateway.com/




Re: Poor performance with stable/13 and Mellanox ConnectX-6 (mlx5)

2022-06-14 Thread Mike Jakubik
  14.7 Gbits/sec   97   1.01 MBytes

[  5]  28.00-29.00  sec   719 MBytes  6.03 Gbits/sec    0   1.01 MBytes
[  5]  29.00-30.00  sec   727 MBytes  6.10 Gbits/sec    0   1.01 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-30.00  sec  49.3 GBytes  14.1 Gbits/sec  164             sender
[  5]   0.00-30.00  sec  49.3 GBytes  14.1 Gbits/sec                  receiver







The CPU usage is rather low. The NIC is in the OCP port, so I'm sure that's 
designed accordingly, and the NIC is bound to numa0.



CPU:  0.0% user,  0.0% nice,  0.5% system,  0.7% interrupt, 98.8% idle



  PID USERNAME    THR PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND 

2195 root  1  52    0    17M  6884K select  83   0:14  27.99% iperf3



# vmstat -i -w1|grep mlx5

irq671: mlx5_core0 49969  47008



(this drops to about 14k with HW LRO enabled)



The dump is rather large; I don't think I can attach it to the mailing list, but 
if you wish to see it I can upload it somewhere.



Thank You.








 On Mon, 13 Jun 2022 16:42:59 -0400 Hans Petter Selasky  
wrote 




Some ideas: 
 
Try to disable "rxpause,txpause" when setting the media. 
 
Keep HW LRO off for now, it doesn't work for a large number of connections. 
 
What is the CPU usage during test? Is iperf3 running on a CPU-core which 
has direct access to the NIC's numa domain? 
 
Is the NIC installed in the "correct" PCI high-performance slot? 
 
There are some sysctl knobs which may tell where the problem is, if it's 
PCI backpressure or something else. 
 
sysctl -a | grep diag_pci_enable 
sysctl -a | grep diag_general_enable 
 
Set these two to 1, then run some traffic and dump all mce sysctls: 
 
sysctl -a | grep mce > dump.txt 
 
--HPS 
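
A full sysctl dump is large, so one way to make it readable is to take a dump before and after the traffic run and keep only the counters that moved. The sketch below assumes the usual "name: value" layout of sysctl -a output; the two-dump workflow, the function names, and the dev.example.* names in the sample are placeholders for illustration and are not real mce sysctl names.

```python
def parse_counters(dump):
    """Map sysctl names to integer values, skipping non-numeric entries."""
    counters = {}
    for line in dump.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue
        try:
            counters[name.strip()] = int(value.strip())
        except ValueError:
            continue  # strings such as device descriptions
    return counters


def changed_counters(before_dump, after_dump):
    """Return {name: delta} for counters whose value differs between dumps."""
    before = parse_counters(before_dump)
    after = parse_counters(after_dump)
    return {
        name: after[name] - before[name]
        for name in after
        if name in before and after[name] != before[name]
    }


# Placeholder names and values, not taken from a real mlx5 system.
BEFORE = """\
dev.example.0.rx_drops: 10
dev.example.0.tx_ok: 5
dev.example.0.desc: some adapter
"""
AFTER = """\
dev.example.0.rx_drops: 25
dev.example.0.tx_ok: 5
dev.example.0.desc: some adapter
"""

if __name__ == "__main__":
    for name, delta in sorted(changed_counters(BEFORE, AFTER).items()):
        print(f"{name}: {delta:+d}")
```

The resulting list is usually short enough to paste into a mail instead of attaching the whole dump.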
 





Mike Jakubik

https://www.swiftsmsgateway.com/




Re: Poor performance with stable/13 and Mellanox ConnectX-6 (mlx5)

2022-06-14 Thread Mike Jakubik
]  26.00-27.00  sec  2.14 GBytes  18.3 Gbits/sec  365    593 KBytes

[  5]  27.00-28.00  sec  1.56 GBytes  13.4 Gbits/sec   33   1.23 MBytes
[  5]  28.00-29.00  sec  1.67 GBytes  14.4 Gbits/sec    0   1.61 MBytes
[  5]  29.00-30.00  sec  1.65 GBytes  14.2 Gbits/sec   35   1.14 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-30.00  sec  29.4 GBytes  8.42 Gbits/sec  3863             sender
[  5]   0.00-30.00  sec  29.4 GBytes  8.42 Gbits/sec                  receiver











 On Tue, 14 Jun 2022 10:21:51 -0400 Mike Jakubik 
 wrote 



Disabling rx/tx pause seems to produce higher peaks.



[root@db-02 ~]# iperf3 -i 1 -t 30 -c db-01
Connecting to host db-01, port 5201
[  5] local 192.168.10.31 port 10146 connected to 192.168.10.30 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  1.89 GBytes  16.2 Gbits/sec    0   1.10 MBytes
[  5]   1.00-2.00   sec  1.86 GBytes  15.9 Gbits/sec    0   1.10 MBytes
[  5]   2.00-3.00   sec  2.05 GBytes  17.6 Gbits/sec    0   1.11 MBytes
[  5]   3.00-4.00   sec   859 MBytes  7.20 Gbits/sec   21    938 KBytes
[  5]   4.00-5.00   sec   652 MBytes  5.47 Gbits/sec    0   1.01 MBytes
[  5]   5.00-6.00   sec   659 MBytes  5.53 Gbits/sec    0   1.03 MBytes
[  5]   6.00-7.00   sec   666 MBytes  5.59 Gbits/sec    0   1.05 MBytes
[  5]   7.00-8.00   sec   657 MBytes  5.51 Gbits/sec   98    989 KBytes
[  5]   8.00-9.00   sec   665 MBytes  5.58 Gbits/sec  139    712 KBytes
[  5]   9.00-10.00  sec   647 MBytes  5.43 Gbits/sec    0   1.02 MBytes
[  5]  10.00-11.00  sec   650 MBytes  5.45 Gbits/sec    4    606 KBytes
[  5]  11.00-12.00  sec  1.53 GBytes  13.1 Gbits/sec  358   1.07 MBytes
[  5]  12.00-13.00  sec  2.10 GBytes  18.1 Gbits/sec  162    837 KBytes
[  5]  13.00-14.00  sec  2.09 GBytes  18.0 Gbits/sec  332    838 KBytes
[  5]  14.00-15.00  sec  2.43 GBytes  20.9 Gbits/sec  639    747 KBytes
[  5]  15.00-16.00  sec  2.38 GBytes  20.4 Gbits/sec  612   1.02 MBytes
[  5]  16.00-17.00  sec  2.25 GBytes  19.3 Gbits/sec  535   1.24 MBytes
[  5]  17.00-18.00  sec  2.52 GBytes  21.6 Gbits/sec  818    423 KBytes
[  5]  18.00-19.00  sec  2.29 GBytes  19.7 Gbits/sec  218    444 KBytes
[  5]  19.00-20.00  sec  2.29 GBytes  19.7 Gbits/sec  114    859 KBytes
[  5]  20.00-21.00  sec  1.65 GBytes  14.1 Gbits/sec  100    541 KBytes
[  5]  21.00-22.00  sec  1.01 GBytes  8.67 Gbits/sec    0    639 KBytes
[  5]  22.00-23.00  sec   625 MBytes  5.24 Gbits/sec    0    648 KBytes
[  5]  23.00-24.00  sec   630 MBytes  5.28 Gbits/sec    0    648 KBytes
[  5]  24.00-25.00  sec  1.56 GBytes  13.4 Gbits/sec    0    702 KBytes
[  5]  25.00-26.00  sec  1.78 GBytes  15.3 Gbits/sec  118    406 KBytes
[  5]  26.00-27.00  sec  1.37 GBytes  11.8 Gbits/sec  105    890 KBytes
[  5]  27.00-28.00  sec  1.82 GBytes  15.6 Gbits/sec  104    963 KBytes
[  5]  28.00-29.00  sec  1.68 GBytes  14.4 Gbits/sec    0   1.20 MBytes
[  5]  29.00-30.00  sec  1.67 GBytes  14.4 Gbits/sec    0   1.38 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-30.00  sec  44.8 GBytes  12.8 Gbits/sec  4477             sender
[  5]   0.00-30.01  sec  44.8 GBytes  12.8 Gbits/sec                  receiver



After a few runs:



[root@db-02 ~]# iperf3 -i 1 -t 30 -c db-01 

Connecting to host db-01, port 5201

[  5] local 192.168.10.31 port 52152 connected to 192.168.10.30 port 5201

[ ID] Interval   Transfer Bitrate Retr  Cwnd

[  5]   0.00-1.00   sec  1.91 GBytes  16.4 Gbits/sec   67    606 KBytes

[  5]   1.00-2.00   sec  1.78 GBytes  15.3 Gbits/sec    0   1.07 MBytes

[  5]   2.00-3.00   sec  1.60 GBytes  13.7 Gbits/sec    0   1.54 MBytes

[  5]   3.00-4.00   sec  1.61 GBytes  13.8 Gbits/sec    0   1.61 MBytes

[  5]   4.00-5.00   sec  1.66 GBytes  14.3 Gbits/sec    0   1.61 MBytes

[  5]   5.00-6.00   sec  1.67 GBytes  14.3 Gbits/sec    0   1.61 MBytes

[  5]   6.00-7.00   sec  1.65 GBytes  14.1 Gbits/sec    0   1.61 MBytes

[  5]   7.00-8.00   sec  1.70 GBytes  14.6 Gbits/sec    0   1.61 MBytes

[  5]   8.00-9.00   sec  1.72 GBytes  14.8 Gbits/sec    0   1.61 MBytes

[  5]   9.00-10.00  sec  1.85 GBytes  15.9 Gbits/sec    0   1.61 MBytes

[  5]  10.00-11.00  sec  1.81 GBytes  15.5 Gbits/sec    0   1.61 MBytes

[  5]  11.00-12.00  sec  1.67 GBytes  14.3 Gbits/sec    0   1.61 MBytes

[  5]  12.00-13.00  sec  1.66 GBytes  14.3 Gbits/sec    0   1.61 MBytes

[  5]  13.00-14.00  sec  1.83 GBytes  15.7 Gbits/sec    0   1.61 MBytes

[  5]  14.00-15.00  sec  1.18 GBytes  10.1 Gbits/sec    0    794 KBytes

[  5]  15.00-16.00  sec  1.67 GBytes  14.4 Gbits/sec    0   1.60 MBytes

[  5]  16.00-17.00  sec  1.73 GBytes  14.8 Gbits/sec    0   1.60 MBytes

[  5]  17.00-18.00  sec  1.73 GBytes  14.9 Gbits/sec    0   1.60 MBytes

[  5]  18.00-19.00  sec  1.83 GBytes  15.7 Gbits/sec    0   1.61

Re: Poor performance with stable/13 and Mellanox ConnectX-6 (mlx5)

2022-06-16 Thread Mike Jakubik
After multiple tests and tweaks I believe the issue is not with the HW or NUMA 
related (Infinity Fabric should do around 32GB) but rather with the FreeBSD TCP/IP 
stack. It's like it can't figure itself out properly for the speed that the HW 
can do; I keep getting widely varying results when testing. Below is an example 
of two tests, with about a 15 second break in between the two.



[root@db-02 ~]# iperf3 -i 1 -t 30 -c db-01 

Connecting to host db-01, port 5201

[  5] local 192.168.10.31 port 49155 connected to 192.168.10.30 port 5201

[ ID] Interval   Transfer Bitrate Retr  Cwnd

[  5]   0.00-1.00   sec   991 MBytes  8.32 Gbits/sec  268    579 KBytes

[  5]   1.00-2.00   sec   945 MBytes  7.93 Gbits/sec  369    777 KBytes

[  5]   2.00-3.00   sec   793 MBytes  6.65 Gbits/sec   60   1.03 MBytes

[  5]   3.00-4.00   sec   666 MBytes  5.59 Gbits/sec  203    976 KBytes

[  5]   4.00-5.01   sec   575 MBytes  4.78 Gbits/sec  202   1.13 MBytes

[  5]   5.01-6.00   sec   169 MBytes  1.43 Gbits/sec  134    699 KBytes

[  5]   6.00-7.00   sec  1.21 GBytes  10.4 Gbits/sec  383   1.08 MBytes

[  5]   7.00-8.00   sec  1.21 GBytes  10.4 Gbits/sec    0   1.16 MBytes

[  5]   8.00-9.00   sec  1.32 GBytes  11.3 Gbits/sec  124    780 KBytes

[  5]   9.00-10.00  sec   690 MBytes  5.79 Gbits/sec  316    605 KBytes

[  5]  10.00-11.00  sec   685 MBytes  5.75 Gbits/sec   97    854 KBytes

[  5]  11.00-12.00  sec  1.08 GBytes  9.30 Gbits/sec  383    538 KBytes

[  5]  12.00-13.00  sec   682 MBytes  5.72 Gbits/sec   88    870 KBytes

[  5]  13.00-14.00  sec   678 MBytes  5.69 Gbits/sec  123    964 KBytes

[  5]  14.00-15.00  sec   670 MBytes  5.62 Gbits/sec  290    763 KBytes

[  5]  15.00-16.00  sec  1.01 GBytes  8.71 Gbits/sec  228   1.08 MBytes

[  5]  16.00-17.00  sec   886 MBytes  7.44 Gbits/sec  118    615 KBytes

[  5]  17.00-18.00  sec   734 MBytes  6.16 Gbits/sec  291    902 KBytes

[  5]  18.00-19.00  sec  1.04 GBytes  8.96 Gbits/sec  212    323 KBytes

[  5]  19.00-20.00  sec   710 MBytes  5.96 Gbits/sec  193    547 KBytes

[  5]  20.00-21.00  sec   693 MBytes  5.82 Gbits/sec  370    942 KBytes

[  5]  21.00-22.00  sec   704 MBytes  5.91 Gbits/sec   80   1022 KBytes

[  5]  22.00-23.00  sec  1.26 GBytes  10.8 Gbits/sec  262    965 KBytes

[  5]  23.00-24.00  sec   828 MBytes  6.94 Gbits/sec  202    763 KBytes

[  5]  24.00-25.00  sec   774 MBytes  6.49 Gbits/sec  227    581 KBytes

[  5]  25.00-26.00  sec   734 MBytes  6.15 Gbits/sec  256    664 KBytes

[  5]  26.00-27.00  sec   753 MBytes  6.32 Gbits/sec  331    540 KBytes

[  5]  27.00-28.00  sec   764 MBytes  6.41 Gbits/sec  298    823 KBytes

[  5]  28.00-29.00  sec   757 MBytes  6.35 Gbits/sec  123    850 KBytes

[  5]  29.00-30.00  sec   754 MBytes  6.32 Gbits/sec   74    970 KBytes

- - - - - - - - - - - - - - - - - - - - - - - - -

[ ID] Interval   Transfer Bitrate Retr

[  5]   0.00-30.00  sec  24.4 GBytes  6.98 Gbits/sec  6305 sender

[  5]   0.00-30.00  sec  24.4 GBytes  6.98 Gbits/sec  receiver



iperf Done.



[root@db-02 ~]# iperf3 -i 1 -t 30 -c db-01

Connecting to host db-01, port 5201

[  5] local 192.168.10.31 port 25061 connected to 192.168.10.30 port 5201

[ ID] Interval   Transfer Bitrate Retr  Cwnd

[  5]   0.00-1.00   sec  1.81 GBytes  15.5 Gbits/sec    0   1.11 MBytes

[  5]   1.00-2.00   sec  1.83 GBytes  15.7 Gbits/sec    0   1.11 MBytes

[  5]   2.00-3.00   sec  1.98 GBytes  17.0 Gbits/sec    0   1.11 MBytes

[  5]   3.00-4.00   sec  2.11 GBytes  18.1 Gbits/sec    0   1.11 MBytes

[  5]   4.00-5.00   sec  2.12 GBytes  18.2 Gbits/sec    0   1.11 MBytes

[  5]   5.00-6.00   sec  2.16 GBytes  18.5 Gbits/sec    0   1.11 MBytes

[  5]   6.00-7.00   sec  1.90 GBytes  16.3 Gbits/sec    0   1.12 MBytes

[  5]   7.00-8.02   sec  1.28 GBytes  10.8 Gbits/sec    0   1.12 MBytes

[  5]   8.02-9.00   sec  1.83 GBytes  16.0 Gbits/sec    0   1.17 MBytes

[  5]   9.00-10.00  sec  1.91 GBytes  16.4 Gbits/sec    0   1.20 MBytes

[  5]  10.00-11.00  sec  1.79 GBytes  15.3 Gbits/sec    0   1.60 MBytes

[  5]  11.00-12.00  sec  1.77 GBytes  15.2 Gbits/sec    0   1.60 MBytes

[  5]  12.00-13.00  sec  1.69 GBytes  14.5 Gbits/sec    0   1.61 MBytes

[  5]  13.00-14.00  sec  1.57 GBytes  13.5 Gbits/sec    0   1.61 MBytes

[  5]  14.00-15.00  sec  1.60 GBytes  13.8 Gbits/sec    0   1.61 MBytes

[  5]  15.00-16.00  sec  1.89 GBytes  16.2 Gbits/sec    0   1.61 MBytes

[  5]  16.00-17.00  sec  1.76 GBytes  15.1 Gbits/sec    0   1.61 MBytes

[  5]  17.00-18.00  sec  1.93 GBytes  16.6 Gbits/sec    0   1.61 MBytes

[  5]  18.00-19.00  sec  1.77 GBytes  15.2 Gbits/sec    0   1.61 MBytes

[  5]  19.00-20.00  sec  1.68 GBytes  14.5 Gbits/sec    0   1.61 MBytes

[  5]  20.00-21.00  sec  1.66 GBytes  14.3 Gbits/sec    0   1.61 MBytes

[  5]  21.00-22.00  sec  1.75 GBytes  15.1 Gbits/sec    0   1.61 MBytes

[  5]  22.00-23.00  sec  1.76 GBytes  15.1 Gbits/sec    0   1.61 MBytes

[  5]  23.00

Re: Poor performance with stable/13 and Mellanox ConnectX-6 (mlx5)

2022-06-16 Thread Mike Jakubik
 KBytes

[  5]  20.00-21.00  sec  2.35 GBytes  20.2 Gbits/sec  360    439 KBytes

[  5]  21.00-22.00  sec  2.41 GBytes  20.7 Gbits/sec  525    580 KBytes

[  5]  22.00-23.00  sec  2.43 GBytes  20.9 Gbits/sec  510    397 KBytes

[  5]  23.00-24.00  sec  2.38 GBytes  20.4 Gbits/sec  532    533 KBytes

[  5]  24.00-25.00  sec  2.37 GBytes  20.4 Gbits/sec  344    547 KBytes

[  5]  25.00-26.00  sec  2.36 GBytes  20.2 Gbits/sec  354    389 KBytes

[  5]  26.00-27.00  sec  2.30 GBytes  19.8 Gbits/sec  165    592 KBytes

[  5]  27.00-28.00  sec  2.30 GBytes  19.8 Gbits/sec  173    584 KBytes

[  5]  28.00-29.00  sec  2.27 GBytes  19.5 Gbits/sec    0    701 KBytes

[  5]  29.00-30.00  sec  2.29 GBytes  19.7 Gbits/sec    0    790 KBytes

- - - - - - - - - - - - - - - - - - - - - - - - -

[ ID] Interval   Transfer Bitrate Retr

[  5]   0.00-30.00  sec  67.4 GBytes  19.3 Gbits/sec  6557 sender

[  5]   0.00-30.00  sec  67.4 GBytes  19.3 Gbits/sec  receiver



iperf Done.
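
To put a number on how much the per-second results vary between runs, the interval lines can be parsed and summarized. This is a minimal sketch, not part of iperf3: the regular expression assumes the column layout shown above with rates reported in Gbits/sec, and the function names are made up for illustration.

```python
import re
import statistics

# Matches iperf3 per-interval lines such as:
# [  5]  20.00-21.00  sec  2.35 GBytes  20.2 Gbits/sec  360    439 KBytes
# Summary lines (ending in "sender"/"receiver") have no Cwnd column and are skipped.
INTERVAL = re.compile(
    r"\[\s*\d+\]\s+(\d+\.\d+)-(\d+\.\d+)\s+sec\s+"
    r"[\d.]+\s+[KMG]Bytes\s+([\d.]+)\s+Gbits/sec\s+(\d+)\s+[\d.]+\s+[KM]Bytes"
)

def parse_intervals(text):
    """Return a list of (start, end, gbits_per_sec, retransmits) tuples."""
    rows = []
    for line in text.splitlines():
        m = INTERVAL.search(line)
        if m:
            rows.append((float(m.group(1)), float(m.group(2)),
                         float(m.group(3)), int(m.group(4))))
    return rows

def summarize(rows):
    """Mean, spread and total retransmits over the parsed intervals."""
    rates = [r[2] for r in rows]
    return {
        "mean_gbps": statistics.mean(rates),
        "stdev_gbps": statistics.stdev(rates) if len(rates) > 1 else 0.0,
        "min_gbps": min(rates),
        "max_gbps": max(rates),
        "retransmits": sum(r[3] for r in rows),
    }

# Last four intervals of the run above.
SAMPLE = """\
[  5]  26.00-27.00  sec  2.30 GBytes  19.8 Gbits/sec  165    592 KBytes
[  5]  27.00-28.00  sec  2.30 GBytes  19.8 Gbits/sec  173    584 KBytes
[  5]  28.00-29.00  sec  2.27 GBytes  19.5 Gbits/sec    0    701 KBytes
[  5]  29.00-30.00  sec  2.29 GBytes  19.7 Gbits/sec    0    790 KBytes
"""

print(summarize(parse_intervals(SAMPLE)))
```

Comparing the standard deviation of a pinned run against an unpinned one gives a more objective view than eyeballing the columns.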





Thank You!






On Thu, 16 Jun 2022 17:00:25 -0400 Alexander V. Chernikov wrote:





> On 16 Jun 2022, at 21:48, Mike Jakubik 
> <mailto:mike.jaku...@swiftsmsgateway.com> wrote: 
> 
> After multiple tests and tweaks I believe the issue is not with the HW or 
> NUMA related (Infinity Fabric should do around 32GB) but rather with the FreeBSD 
> TCP/IP stack. It's like it can't figure itself out properly for the speed that 
> the HW can do; I keep getting widely varying results when testing. Below is 
> an example of two tests, with about a 15 second break in between the two. 
Does pinning iperf to a specific CPU core (or range) address the variability? 
e.g. cpuset -l 1 iperf … 
The output you shared above shows CPU#83 as the core iperf is running on. Just 
wondering whether the scheduler migrates iperf too often, thrashing the caches.
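
For repeated test runs it can help to build the pinned command line programmatically. The sketch below only assembles the argument list for FreeBSD's cpuset(1) with its -l (CPU list) option; the helper name is hypothetical, and the iperf3 arguments are the ones used earlier in this thread.

```python
import shlex

def pinned_command(cpu_list, argv):
    """Prefix argv with cpuset(1) so the command is restricted to cpu_list."""
    if not argv:
        raise ValueError("argv must name a command to run")
    return ["cpuset", "-l", cpu_list] + list(argv)

# CPU 1 is an arbitrary choice; pick a core on the NUMA domain closest to the NIC.
cmd = pinned_command("1", ["iperf3", "-i", "1", "-t", "30", "-c", "db-01"])
print(shlex.join(cmd))
```

On the FreeBSD host the resulting list can be handed to subprocess.run(cmd).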

Re: missing SYN/ACK for inbound TCP solved by altering broadcast address - why?

2022-06-27 Thread Mike Karels

Responding to parts of two emails:

On 27 Jun 2022, at 7:41, Marek Zarychta wrote:


On 27.06.2022 at 13:44, Dave Cottlehuber wrote:

I've found a workaround for this issue, but don't understand why this
occurs. Reading RFC1122 has left me none the wiser. What am I 
missing?

Is this a Linuxism or simply a standardisation loophole?


It was standardized in RFC3021 over twenty years ago. FreeBSD 
ifconfig(8) has supported /31 netmasks for a long time, and the broadcast 
address is correctly assigned in this case (255.255.255.255). Either 
dhcp-options(5) "option broadcast-address" is missing on the DHCP 
server, or our dhclient(8) is misbehaving, or maybe the Linux client is 
better at figuring out the right broadcast address.


Looks like RFC3021 only says it applies to point-to-point interfaces,
and doesn’t consider other types.  I don’t remember any code in ifconfig
to special-case this, although the kernel will default the broadcast
correctly in this case (to the all-1’s address; I guess that’s an extension).
That means something is giving the broadcast address to ifconfig explicitly.



## Problem

- on 13.1-R, dhclient-set config works for all UDP, & outbound TCP
- but inbound TCP connections send no SYN/ACK at all back
- on Linux Ubuntu 22.04 & others, the DHCP supplied IP config
   works as expected


Do you know if this worked with 13.0?  (I made a change in 13.1,
but don’t quite see how it would cause this situation to change.)


failing FreeBSD config from dhclient:
   inet 147.75.93.61 netmask 0xfffffffe broadcast 147.75.93.60


This is odd.  I don’t know why the broadcast address would be host 0
on that network, but note that it is the same as the router address.
That is probably the root of the problem.  I don’t see a broadcast
address in the lease below, so maybe dhclient is confused.  The
default broadcast would be host -1, but of course that is the host
itself.
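
The arithmetic behind that observation can be checked with Python's ipaddress module. This is only an illustration of the /31 address math using the values from the lease quoted in this thread; it does not reproduce what dhclient or the kernel actually do.

```python
import ipaddress

iface = ipaddress.ip_interface("147.75.93.61/31")  # fixed-address + subnet-mask from the lease
router = ipaddress.ip_address("147.75.93.60")       # "option routers" from the lease

net = iface.network
host_zero = net.network_address    # all host bits clear ("host 0")
host_ones = net.broadcast_address  # all host bits set ("host -1")

print("network      :", net)
print("host 0       :", host_zero, "(router)" if host_zero == router else "")
print("host all-ones:", host_ones, "(this host)" if host_ones == iface.ip else "")
print("limited bcast:", ipaddress.ip_address("255.255.255.255"))
```

With only two addresses in the network, neither can double as a broadcast address, which is why the all-ones limited broadcast (255.255.255.255) is the sensible value here.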


working Linux config (note broadcast)
   inet 147.75.93.61 netmask 0xfffffffe broadcast 255.255.255.254


That’s an odd choice of broadcast, but it doesn’t really matter here.



- full details below (dhcp lease, ifconfigs etc)

I worked around this by forcing broadcast-address in dhclient.conf:

## /etc/dhclient.conf
interface "ice0" {
   supersede broadcast-address 255.255.255.255;
}
# repeat for other ifaces as required

Which is ~ok~ for the moment, but I'd like to understand why this
occurs, and fix it properly. Either at DHCPD end, or FreeBSD
config.



# Further details

- Ubuntu 22.04 from vendor
- FreeBSD 13.1-RELEASE amd64 vanilla install
- 4x ice(4) NICs (Intel E810) and 2x (unused) ix (ixgbe)
- 2x of the ice(4) are bonded link aggregation
- dhclient only used to attach to 1 NIC, ignoring FreeBSD side of bonding



## Linux ip addr

# ip addr
8: bond0: mtu 1500 qdisc noqueue state UP group default qlen 1000
 link/ether b4:96:91:d9:99:20 brd ff:ff:ff:ff:ff:ff
 inet 147.75.92.187/31 brd 255.255.255.255 scope global bond0
...

## FreeBSD ifconfig

# ifconfig ice0
ice0: flags=8863 metric 0 mtu 1500
 options=4e10438
 ether b4:96:91:d9:9b:48
 inet 147.75.93.61 netmask 0xfffffffe broadcast 147.75.93.60
 media: Ethernet autoselect (25G-AUI )
 status: active
 nd6 options=29
...
root@metalBSD:~ # netstat -4rn
Routing tables

Internet:
Destination        Gateway            Flags     Netif Expire
default            147.75.93.60       UGS        ice0
127.0.0.1          link#7             UH          lo0
147.75.93.60/31    link#3             U          ice0
147.75.93.61       link#3             UHS         lo0

root@metalBSD:~ # cat /var/db/dhclient.leases.ice0

- note no broadcast-address provided
- Linux & FreeBSD evidently derive it differently

lease {
   interface "ice0";
   fixed-address 147.75.93.61;
   option subnet-mask 255.255.255.254;
   option routers 147.75.93.60;
   option domain-name-servers 147.75.207.207,147.75.207.208;
   option host-name "intransigent09";
   option dhcp-lease-time 172800;
   option dhcp-message-type 5;
   option dhcp-server-identifier 139.178.78.140;
   renew 1 2022/6/27 18:40:06;
   rebind 2 2022/6/28 12:40:06;
   expire 2 2022/6/28 18:40:06;
}

A+
Dave




--
Marek Zarychta




Re: Poor performance with stable/13 and Mellanox ConnectX-6 (mlx5)

2022-06-28 Thread Mike Jakubik
Hi,



So basically this is my conclusion: if I cpuset iperf on at least the receiving 
end, I get great performance. Anything outside of that is random. I've tried 
just about every network tuning knob in FreeBSD as well as what Mellanox 
recommends in their driver manual; none of these make any significant impact, 
and sometimes they even reduce performance. So I'm hoping that when this goes into 
production the scheduler will be sane enough to do the same, but we shall see.



Thanks.






On Fri, 17 Jun 2022 11:03:04 -0400 Dave Cottlehuber wrote:



On Fri, 17 Jun 2022, at 02:38, Mike Jakubik wrote: 
> Hi, 
> 
> I believe you hit the nail on the head! I am now getting consistent 
> high speeds, even higher than on Linux! Is this a problem with the 
> scheduler? Should someone in that area of expertise be made aware of 
> this? More importantly, I guess, would this affect real world 
> performance? These servers will be running RabbitMQ (it uses quite a 
> bit of bandwidth) and PostgreSQL w/ replication. 
 
Pinning cores for unimpeded access is very common for high performance systems. 
Do this both for the NICs and also your apps. Be mindful of the NUMA topology. 
 
You should look into both the Erlang scheduler flags for core pinning, and 
also ensuring that your Erlang processes have unimpeded access to their own 
cores too. A reasonable approach is to make a simple Cowboy or Phoenix app and 
hammer it with wrk or a similar load tool to get a feel for things, and then 
profile and tune your own app based on those specific results. 
 
For RabbitMQ there is an excellent load testing tool from the Pivotal team if you 
don’t have suitable load generators yourselves. 
 
Tsung is an excellent tool if you put in the work to craft something specific 
for your use case. 
 
Please post back to the list with your specific findings and NIC/TCP tunables; 
these are very helpful for the next person! 
 
Dave 
 





Mike Jakubik

https://www.swiftsmsgateway.com/




Re: Netstat -i 5-character interface name length?

2022-06-29 Thread mike tancsa

On 6/29/2022 10:56 AM, Chris Ross wrote:
Hello folks.  I just noticed something that I’m sure has been true 
forever, but I checked and it’s still true on my 12.3-STABLE system.


One of the first local mods I do is alias netstat to netstat -W for this 
reason. e.g.

alias netstat   netstat -W

in /etc/csh.cshrc

    ---Mike



Re: Netstat -i 5-character interface name length?

2022-07-02 Thread Mike Karels

On 1 Jul 2022, at 4:11, Ronald Klop wrote:


From: George Michaelson 
Date: Friday, 1 July 2022 00:50
To: "Rodney W. Grimes" 
CC: mike tancsa , Chris Ross , freebsd-net@freebsd.org
Subject: Re: Netstat -i 5-character interface name length?


Is there a reason (avoid bikeshedding) the field width can't be
increased to allow the bgeXhexIsVeryLong0 names to work?



I agree. I hope POLA leans more towards "why does netstat not 
print the interface name correctly?" than "my 15 year old script 
parsing the output of netstat doesn't understand strings longer than 5 
chars".

$ netstat -i | grep Link
Name    Mtu Network Address              Ipkts Ierrs Idrop Opkts Oerrs Coll
genet  1500         dc:a6:32:da:f4:3b 62095311     0     0 105591894 0 0
lo0   16384         lo0                      1     0     0 1 0 0
bridg  1500         58:9c:fc:00:3e:aa 18616989     0     0 18652615 8 0
vlan3  1500         dc:a6:32:da:f4:3b  9673278     0     0 5695824 8 0
epair  1500         02:c8:49:24:bd:0a  3041667     0     0 446700617 0
epair  1500         02:d5:f0:fe:9e:0a  1529717     0     0 193217017 0
epair  1500         02:96:17:58:ce:0a  2384154     0     0 474068317 0
epair  1500         02:b2:7f:d6:da:0a     8746     0     0 2212522 0
epair  1500         02:81:38:75:d1:0a    87264     0     0 17853521 0
epair  1500         02:ad:f2:49:60:0a    78055     0     0 16025221 0
epair  1500         02:0d:07:81:b2:0a  1814108     0     0 145588916 0


So all "default" interface names do not fit. I don't like the solution 
of "rename all your interfaces" as I think the out-of-the-box 
experience can be made better.
I'll vote for enabling -W by default and adding an option for backwards 
compatibility.


-W makes the address field unnecessarily wide for most users, at least
if IPv6 is enabled.  However, netstat already has code to figure out
the required field width to avoid truncating names, which it uses only
for -W.  I’d suggest that this should be done unconditionally, as the
output makes no sense if names are ambiguous.  This is a trivial change
(I just tested it).  Any comments or objections?  I’ll put it in
review.
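
The idea of sizing the Name column from the data can be sketched as follows. This is an illustration in Python, not the netstat code or the change under review; the interface names and unit numbers are assumptions chosen to resemble the truncated "genet", "bridg" and "epair" entries above.

```python
def name_width(names, minimum=5):
    """Width of the Name column: at least `minimum`, but wide enough for every name."""
    return max([minimum] + [len(n) for n in names])

def format_rows(rows, minimum=5):
    """rows is a list of (name, mtu) pairs; returns aligned text lines."""
    width = name_width([name for name, _ in rows], minimum)
    lines = [f"{'Name':<{width}} {'Mtu':>5}"]
    for name, mtu in rows:
        lines.append(f"{name:<{width}} {mtu:>5}")
    return lines

# Hypothetical interface list.
rows = [("genet0", 1500), ("lo0", 16384), ("bridge0", 1500),
        ("epair0a", 1500), ("epair1a", 1500)]

# A fixed 5-character field makes the two epair interfaces indistinguishable.
truncated = [name[:5] for name, _ in rows]
print(truncated)
print("\n".join(format_rows(rows)))
```

Short names such as lo0 keep the familiar 5-character column, so output only grows when a longer name is actually present.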

Mike


Regards,
Ronald.




I'm not saying "you can alias around this" is bad, but I sense we're
walking into a world which is where Linux is, with every physical
device called eth0/1/2 and then "which" device is eth0 becomes a
question..

On Fri, Jul 1, 2022 at 1:17 AM Rodney W. Grimes wrote:
>
> > On 6/29/2022 10:56 AM, Chris Ross wrote:
> > > Hello folks.  I just noticed something that I'm sure has been true
> > > forever, but I checked and it's still true on my 12.3-STABLE system.
> > >
> > One of the first local mods I do is alias netstat to netstat -W for this
> > reason. e.g.
> > alias netstat   netstat -W
> >
> > in /etc/csh.cshrc
>
> That only fixes it for your interactive csh processes; the
> original poster had specifically mentioned output from
> periodic scripts, aka daily iirc.
>
> One thing that can be done to mitigate the long vlan
> dev name (imho the vlan driver should have just named
> itself something much shorter, like "vl", as most network devices
> have 2-letter names anyway) is to use the "name" option
> of ifconfig to give them a better name than the default.
>
> ifconfig vlan2 create vlandev em0 vlan 2 name v2
>
> --
> Rod Grimes rgri...@freebsd.org







Re: Netstat -i 5-character interface name length?

2022-07-03 Thread Mike Karels

On 2 Jul 2022, at 10:11, Mike Karels wrote:


-W makes the address field unnecessarily wide for most users, at least
if IPv6 is enabled.  However, netstat already has code to figure out
the required field width to avoid truncating names, which it uses only
for -W.  I’d suggest that this should be done unconditionally, as the
output makes no sense if names are ambiguous.  This is a trivial change
(I just tested it).  Any comments or objections?  I’ll put it in
review.


It’s https://reviews.freebsd.org/D35703.

Mike








experimental support for IPv4 unicast extensions

2022-07-06 Thread Mike Karels
I have been corresponding with the authors of Internet-Drafts that relax
restrictions on parts of the IPv4 address space to allow normal unicast
use, and I have FreeBSD changes to allow experimentation with these
updates.  This message summarizes my changes, and solicits input.

The changes are all controlled by sysctl, and default to "off".
The parts of the address space in question and the relevant changes:

0/8 (network 0) [1]: Restrictions on network 0 are lifted if the sysctl
net.inet.ip.allow_zeronet is set to 1.  This applies to packet forwarding
and ICMP echo.

240/4 (Experimental/"Class E") [2]: Restrictions on the Experimental
address class are lifted if the sysctl net.inet.ip.allow_experimental
is set to 1.  This applies to packet forwarding and ICMP echo.

127/8 (loopback net) [3]: The size of the reservation for the loopback
network can be reduced from 127/8 to 127.0/16 using the sysctl
net.inet.ip.loopback_mask.  My current sysctl sets the mask, but that
is a little cumbersome; I should probably change the sysctl to allow
a mask length to be set.  This change is limited to the kernel; the
IN_LOOPBACK macro uses the current mask in the kernel, but the default
mask at user level.  Also, some user programs use IN_LOOPBACKNET along
with a Class A shift to crack this by hand.  The kernel change affects
IP packet input and output as well as forwarding.
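
To illustrate what shrinking the loopback reservation means for an individual address, here is a small sketch of the mask test. It is written in Python rather than as the kernel's IN_LOOPBACK macro, and it takes a prefix length as its parameter, which is an assumption: the sysctl described above currently takes a mask.

```python
import ipaddress

def in_loopback(addr, prefix_len=8):
    """True if addr falls inside 127.0.0.0/prefix_len."""
    mask = (0xFFFFFFFF << (32 - prefix_len)) & 0xFFFFFFFF
    loopback_net = int(ipaddress.IPv4Address("127.0.0.0"))
    return (int(ipaddress.IPv4Address(addr)) & mask) == (loopback_net & mask)

for addr in ("127.0.0.1", "127.0.5.6", "127.1.2.3", "128.0.0.1"):
    print(addr, in_loopback(addr, 8), in_loopback(addr, 16))
```

Under the traditional /8 reservation 127.1.2.3 is a loopback address; with the reservation reduced to 127.0/16 it becomes ordinary unicast, while 127.0.0.1 is loopback either way.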

The changes described above are all included in a single review for now,
although I would probably separate them before pushing them.  (They
necessarily collide though.)  The review is intended for comments only,
and is https://reviews.freebsd.org/D35741.  I think it makes sense to
put these changes in -current in order to enable experimentation, but
I wanted to open the subject for discussion first.

Changes are also being made in Linux, although I don't know their state.

Note that there is a related proposal and change to allow use of the
lowest host on a network/subnet [4].  This change was essentially a bug
fix for FreeBSD, and is already in -current and 13.1-RELEASE.

Mike

[1] https://datatracker.ietf.org/doc/draft-schoen-intarea-unicast-0/01/

[2] https://datatracker.ietf.org/doc/draft-schoen-intarea-unicast-240/

[3] https://datatracker.ietf.org/doc/draft-schoen-intarea-unicast-127/

[4] https://datatracker.ietf.org/doc/draft-schoen-intarea-unicast-lowest-address/



getaddrinfo error for existing host without requested address family

2022-09-27 Thread Mike Karels
I recently noticed the following behavior:

% ping6 redrock
ping6: Name does not resolve
% host redrock
redrock.karels.net has address 10.0.2.2
redrock.karels.net mail is handled by 10 mail.karels.net.
% ping6 nonexistenthost
ping6: Name does not resolve

The first error message is misleading, because the name *does* resolve,
but has no AAAA record, and it is the same error message as for a name
that truly does not exist.  The problem comes from the set of error
codes that getaddrinfo() returns in these two cases.  The problem did
not exist with gethostbyname(), which has separate error codes for the
two (although gethostbyname did not have provision for IPv6, it handled
cases like domain names and mail domains without IPv4 addresses).

getaddrinfo() uses a richer set of error codes than gethostbyname(), but
still misses this case.  However, looking at <netdb.h>, I see

#if 0
/* Obsoleted on RFC 2553bis-02 */
#define EAI_ADDRFAMILY   1  /* address family for hostname not supported */
#endif
...
#if 0
/* Obsoleted on RFC 2553bis-02 */
#define EAI_NODATA   7  /* no address associated with hostname */
#endif

I don't know why these two were omitted from the update to RFC 2553, but
the first seems to me to be the correct error for an existing name without
an address for the requested address family.  Also, that is the error
message produced by Linux (Ubuntu 22.04.1).

NetBSD and OpenBSD produce the second of these two errors for a host
without the requested address.  But they also produce the same error
when a name does not exist.

RFC 2553bis-02 has timed out, and is replaced by RFC 3493, which is also
missing EAI_ADDRFAMILY.  These are informational RFCs, not specifying an
Internet standard.

I propose re-enabling EAI_ADDRFAMILY and using it for the situation
where a name exists but does not have an address in the requested family.
This would make the error in the example less misleading, and would behave
the same as Linux in this regard.  The change to netdb.h is trivial, but
getaddrinfo() needs a little more work because it uses the NS_* errors
from <nsswitch.h> internally and then translates.  But it will benefit
from greater accuracy in other cases as well (e.g.  "out of memory"
rather than "Name does not resolve").
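
From the caller's side, the distinction shows up in the error code that getaddrinfo() hands back. The sketch below uses Python's socket module, which wraps the C library's getaddrinfo(); the helper name is hypothetical, and which EAI_* constants exist and which code is returned for a given failure depend on the platform's libc.

```python
import socket

def lookup(host, family, flags=0):
    """Return ("ok", addresses) or (symbolic_error_name, message)."""
    try:
        infos = socket.getaddrinfo(host, None, family, socket.SOCK_STREAM, 0, flags)
    except socket.gaierror as exc:
        # Map the numeric code back to a name, using only constants this platform defines.
        names = {getattr(socket, n): n
                 for n in ("EAI_NONAME", "EAI_NODATA", "EAI_ADDRFAMILY",
                           "EAI_AGAIN", "EAI_FAIL", "EAI_MEMORY")
                 if hasattr(socket, n)}
        return names.get(exc.errno, "gaierror %d" % exc.errno), exc.strerror
    return "ok", sorted({info[4][0] for info in infos})

# Numeric lookups need no DNS: an IPv4 literal succeeds for AF_INET and
# fails for AF_INET6 with whatever code the local libc chooses.
print(lookup("127.0.0.1", socket.AF_INET, socket.AI_NUMERICHOST))
print(lookup("127.0.0.1", socket.AF_INET6, socket.AI_NUMERICHOST))
```

A program such as ping6 can only print a more helpful message if the library returns distinct codes for "no such name" and "name exists but has no address in this family".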

Comments?  I have a change in progress, but wanted to float the idea
before I finish it and put it into review.

Mike



Re: getaddrinfo error for existing host without requested address family

2022-09-27 Thread Mike Karels
On 27 Sep 2022, at 17:41, Viktor Dukhovni wrote:

> On Tue, Sep 27, 2022 at 03:53:12PM -0500, Mike Karels wrote:
>
>> The first error message is misleading, because the name *does* resolve,
>> but has no AAAA record, and it is the same error message as for a name
>> that truly does not exist.
>
> FWIW, the distinction between NODATA and NXDOMAIN is these days not
> infrequently violated at the authoritative nameserver:
>
>   https://datatracker.ietf.org/doc/html/draft-valsorda-dnsop-black-lies-00
>
> So whether or not a name actually exists or just fails to have the
> requested record type is at times not easily determined. :-(

All getaddrinfo() can do is translate what the resolver receives:
- If there is an NXDOMAIN error, we should report that the name does not
resolve;
- If there is no error but no record of the requested type in the answer,
we should report that there is no address of the requested type.

If the server always uses NXDOMAIN, we’ll report as indicated for that
domain.  In my test case, I control the server :).
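
The translation described above amounts to a small decision table. A minimal sketch, assuming the proposal in this thread (EAI_ADDRFAMILY for a name that exists without the requested record type); the function name is hypothetical and the error names are returned as strings so the example does not depend on which constants a platform defines.

```python
NOERROR, NXDOMAIN = 0, 3  # DNS RCODE values from RFC 1035

def eai_for_answer(rcode, records_of_requested_type):
    """Map a resolver result to the getaddrinfo() outcome discussed in this thread."""
    if rcode == NXDOMAIN:
        return "EAI_NONAME"        # the name does not resolve
    if rcode == NOERROR and records_of_requested_type == 0:
        return "EAI_ADDRFAMILY"    # name exists, no address of the requested type
    if rcode == NOERROR:
        return "OK"
    return "other"                 # SERVFAIL etc. are not covered by this sketch

print(eai_for_answer(NXDOMAIN, 0))
print(eai_for_answer(NOERROR, 0))
print(eai_for_answer(NOERROR, 1))
```

A server that answers NXDOMAIN for everything would always land in the first branch, which matches the behaviour described above.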

Mike
> -- 
> Viktor.



Re: getaddrinfo error for existing host without requested address family

2022-10-17 Thread Mike Karels
On Wed, 28 Sep 2022, Konstantin Belousov wrote:

> Perhaps look there
> https://www.openwall.com/lists/libc-coord/2022/09/27/1

> You might want to participate in the thread, instead of me.

I participated in a short discussion on that list.  The TL;DR:

- Linux/glibc (Ubuntu at least) uses EAI_NODATA ("No address associated
with hostname") when a name is valid but does not have the requested
address family.  This is better than FreeBSD currently, as it is
distinguished from EAI_NONAME ("Name or service not known").  But it
implies that there is no address in any family.  (I showed an example
from ping6 above, but it turns out to be atypical.)

- The author of the musl C library for Linux plans to use EAI_NODATA as
well, but with a different error message.

- Linux also uses EAI_ADDRFAMILY, but only when a numeric address is in the
wrong family, e.g. telnet -6 127.0.0.1.

- POSIX, like the latest RFC, does not define EAI_NODATA or EAI_ADDRFAMILY.

- There were no other opinions expressed.

I see two choices for FreeBSD when there is no address in the requested
family.  One is to use EAI_NODATA, probably using a modified error message.
This has the main disadvantage that we have several NLS translations.  Also,
it is different than Linux.

The other choice is to use EAI_ADDRFAMILY ("Address family for hostname
not supported") as originally proposed.  The existing error message seems
reasonable for this case.

Any comments or votes?  I am inclined to use EAI_ADDRFAMILY as originally
proposed.

Mike



Re: getaddrinfo error for existing host without requested address family

2022-10-26 Thread Mike Karels
On Oct 17, I wrote:

> I participated in a short discussion on that list.  The TL;DR:

> - Linux/glibc (Ubuntu at least) uses EAI_NODATA ("No address associated
> with hostname") when a name is valid but does not have the requested
> address family.  This is better than FreeBSD currently, as it is
> distinguished from EAI_NONAME ("Name or service not known").  But it
> implies that there is no address in any family.  (I showed an example
> from ping6 above, but it turns out to be atypical.)

> - The author of the musl C library for Linux plans to use EAI_NODATA as
> well, but with a different error message.

> - Linux also uses EAI_ADDRFAMILY, but only when a numeric address is in the
> wrong family, e.g. telnet -6 127.0.0.1.

> - POSIX, like the latest RFC, does not define EAI_NODATA or EAI_ADDRFAMILY.

> - There were no other opinions expressed.

> I see two choices for FreeBSD when there is no address in the requested
> family.  One is to use EAI_NODATA, probably using a modified error message.
> This has the main disadvantage that we have several NLS translations.  Also,
> it is different than Linux.

> The other choice is to use EAI_ADDRFAMILY ("Address family for hostname
> not supported") as originally proposed.  The existing error message seems
> reasonable for this case.

> Any comments or votes?  I am inclined to use EAI_ADDRFAMILY as originally
> proposed.

I put up a review, https://reviews.freebsd.org/D37139, with these changes.
The changes should be submitted as several commits, as indicated in the
review.

Mike



Re: trpt(8) to be decomissioned

2022-11-03 Thread Mike Karels
On 3 Nov 2022, at 22:48, Gleb Smirnoff wrote:

>   Hi,
>
> trpt(8) is utility to pull TCP debugging data from the kernel
> in 4.2BSD. We still have it in the base, with corresponding
> TCPDEBUG option in the kernel and SO_DEBUG socket option.
>
> At the same time we have much more powerful debugging facilities
> for TCP, e.g. the Dtrace probing, the TCP black box logging and
> siftr.  These are the tools that modern developers use.
>
> Already touched this topic with rscheff@, tuexen@, rrs@ and jtl@.
> None of them knew what trpt(8) is :) Looks like a good justification
> to me.

I have used trpt, but not for many years.  It was done before tcpdump
as well.  Its time has long since gone.

Mike
> -- 
> Gleb Smirnoff



Re: sshd doesn't disconnect for 30+ minutes after the TCP connection is closed ungracefully

2023-03-01 Thread Mike Karels
On 1 Mar 2023, at 5:36, Michael Gmelin wrote:

>> On 1. Mar 2023, at 11:35, Yuri  wrote:
>>
>> Windows system connects to FreeBSD through ssh and then this connection 
>> dies because of WiFi or VPN issues.
>>
>> FreeBSD still has the sshd process alive for this connection for 30+ minutes.
>>
>> TCP keepalive is enabled on the FreeBSD host:
>>
>> $ sysctl net.inet.tcp.always_keepalive
>> net.inet.tcp.always_keepalive: 1
>>
>> Shouldn't TCP keepalive kill this sshd process after 3-4 minutes because 
>> this connection isn't alive?
>>
>
> Keepalives start after net.inet.tcp.keepidle milliseconds (2h by default).

When this happens to me, I generally log into the server again and use write(1)
to send a message to that tty (a newline will do).  That probes the connection
and causes a reset, and the session gets cleaned up.  I use a longer keepidle
value for other reasons.
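
An application that wants a dead peer detected sooner on its own sockets,
without changing the system-wide sysctls, can set the per-socket keepalive
timers.  A minimal sketch, assuming the TCP_KEEPIDLE, TCP_KEEPINTVL and
TCP_KEEPCNT socket options from tcp(4); note that these take seconds, not
milliseconds.  enable_fast_keepalive() is a hypothetical helper name, and
nothing here claims that sshd does this.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <assert.h>

/*
 * Enable keepalive probing on one TCP socket and shorten its timers.
 *   idle_s  - seconds of idle time before the first probe
 *   intvl_s - seconds between unanswered probes
 *   count   - unanswered probes before the connection is dropped
 * Returns 0 on success, -1 on failure with errno set by setsockopt().
 */
static int
enable_fast_keepalive(int fd, int idle_s, int intvl_s, int count)
{
    int on = 1;

    assert(idle_s > 0 && intvl_s > 0 && count > 0);

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) == -1)
        return (-1);
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s,
        sizeof(idle_s)) == -1)
        return (-1);
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl_s,
        sizeof(intvl_s)) == -1)
        return (-1);
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &count,
        sizeof(count)) == -1)
        return (-1);
    return (0);
}
```

With, say, enable_fast_keepalive(fd, 60, 10, 3), an unreachable peer would
be noticed after roughly 60 + 3 * 10 seconds of silence on that socket.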

Mike



IPv6 LOR in main

2023-08-15 Thread Mike Karels
I have a machine running a recent main system, 765ad5b28d3f, just after
ALPHA1.  It hosts a VM on bhyve.  About the time I installed and configured
ALPHA1 on the guest, the host got this LOR from IPv6 (nd6_llinfo_timer):

lock order reversal:
 1st 0xf802ea7dce90 lle (lle, rw) @ netinet6/in6.c:2442
 2nd 0xfe0119c9a0b0 nd6 list (nd6 list, rw) @ netinet6/nd6_rtr.c:864
lock order nd6 list -> lle established at:
#0 0x80bc0eba at witness_checkorder+0x30a
#1 0x80b47895 at _rw_wlock_cookie+0x65
#2 0x80d9023e at nd6_llinfo_timer+0x9e
#3 0x80b6c0ce at softclock_call_cc+0x14e
#4 0x80b6d836 at softclock_thread+0xc6
#5 0x80b03792 at fork_exit+0x82
#6 0x8101e1ce at fork_trampoline+0xe
lock order lle -> nd6 list attempted at:
#0 0x80bc176e at witness_checkorder+0xbbe
#1 0x80b47895 at _rw_wlock_cookie+0x65
#2 0x80d969e1 at defrouter_remove+0x41
#3 0x80d9363b at nd6_na_input+0x9bb
#4 0x80d68157 at icmp6_input+0x9a7
#5 0x80d80867 at ip6_input+0xc97
#6 0x80ca3e3d at netisr_dispatch_src+0xad
#7 0x80c8689a at ether_demux+0x17a
#8 0x80c87f0f at ether_nh_input+0x39f
#9 0x80ca3e3d at netisr_dispatch_src+0xad
#10 0x80c86cf9 at ether_input+0xd9
#11 0x80c8d01d at tunwrite+0x51d
#12 0x809d73b3 at devfs_write_f+0xf3
#13 0x80bc6dc2 at dofilewrite+0x82
#14 0x80bc6cdc at sys_writev+0x6c
#15 0x8104b3a8 at amd64_syscall+0x138
#16 0x8101da7b at fast_syscall_common+0xf8

    Mike



Re: Very slow scp performance comparing to Linux

2023-08-28 Thread mike tancsa

On 8/28/2023 3:32 AM, Wei Hu wrote:

Hi,

When I was testing a new NIC, I found the single stream scp performance was 
almost 8 times slower than Linux on the RX side. Initially I thought it might be 
something with the NIC. But when I switched to sending the file on localhost, 
the numbers stay the same.


Just curious, how does iperf3 perform in comparison ?

    ---Mike




Re: Regression with pf or IPv6 on FreeBSD 14 with IPsec gif(4) tunnel

2023-09-15 Thread mike tancsa

On 9/15/2023 1:38 AM, Xin Li wrote:

On 2023-09-14 3:28 AM, Kristof Provost wrote:

On 14 Sep 2023, at 4:54, Xin Li wrote:

Hi!
And as a shot in the dark, I tried again with IPsec (racoon) 
disabled, and the issue is gone.  My IPsec configuration is fairly 
common: 


I'm still comparing the code and reading the history of changes 
between stable/13 and stable/14 to see if there is something obvious, 
but more insights from others would be appreciated :)



[/me takes a shot in the dark]

maybe something like net.inet.ipsec.filtertunnel=1 is needed now ?

    ---Mike




Re: reviewers for if_smsc change?

2023-11-09 Thread Mike Karels
On 9 Nov 2023, at 4:30, Ronald Klop wrote:

> On 11/4/23 15:39, Ronald Klop wrote:
>> Hi,
>>
>> For issue 274092 [1] I'm looking for reviewers.
>>
>> A user on the ML had an issue that the MAC address was not assigned on some 
>> Raspberry PI compute modules.
>> I tried and succeeded in using the MAC address passed on from the firmware 
>> to the kernel.
>>
>> The review is in: https://reviews.freebsd.org/D42463
>>
>> I have a ports commit bit; if possible I would like to commit this myself 
>> as a first step toward more in-depth FreeBSD development.
>>
>> Regards,
>> Ronald.
>>
>>
>> [1] https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=274092 (if_smsc.c 
>> needs to use ether_gen_addr instead of read_random for more stable MAC 
>> address)
>>
>
>
> Hi,
>
> I got some comments on the phabricator review and updated the patch with it. 
> I think it is pretty clean now and it is tested and works.
> How should I go from here?
> Can somebody help me commit this (I am a ports committer) or commit this for 
> me?
> Do I need more approval and if yes, how can I get that?
>
> Regards,
> Ronald.

The change looks good to me now.  I think it would be good if someone who
works on USB looked at it too.  I am willing to approve the change; I think
you can commit it with approval.

Mike



Re: How to tell if a network interface was renamed (and from what)

2023-11-19 Thread Mike Karels
On 19 Nov 2023, at 8:34, Mina Galić wrote:

>>> FreeBSD currently does not preserve the old ( original ) name of
>>> interfaces if it is renamed ( either physical or cloned ones ).
>>> While there's an attempt https://reviews.freebsd.org/D28247
>>> to get the device name (physical
>>> ones) but it is not perfect and not completed.
>>>
>>> So may I ask why you need to know if a network interface was renamed ?
>>
>>
>> Just last week I found this quite a pain as well; once an interface has
>> been renamed, if it's not a pseudo-interface with an obvious group
>> there's no clear way, AFAICT, to determine which driver created it
>
> I think the main reason that we need to know if and from what an interface 
> has been renamed is if we need to know what driver we're working with.
>
> But given that a rename doesn't change — or even just *alias*
> the sysctl dev hierarchy, where a %driver is recorded, we can't
> track it back.
>
> (but again, that's just for physical devices, then again virtual devices 
> record what type of device they are in their group which
> is essentially the same thing)
>
> As soon as we have more than one interface with different drivers
> it's impossible to parse out what we're dealing with without
> parsing rc.conf, logs, or worse things I can't think of right now.

The kernel has a driver name for each interface, which looks like it
doesn't change currently in most cases.  There is a kernel accessor
function, but I don't think it is exported to user space now.  It could
be, though.  Would this be sufficient for your purposes?  There is also
a unit number, which could also be exported.

Mike



Re: How to tell if a network interface was renamed (and from what)

2023-11-19 Thread Mike Karels
On 19 Nov 2023, at 13:13, Mina Galić wrote:

> Hi Mike,
>
>> The kernel has a driver name for each interface, which looks like it
>> doesn't change currently in most cases. There is a kernel accessor
>> function, but I don't think it is exported to user space now. It could
>> be, though. Would this be sufficient for your purposes? There is also
>> a unit number, which could also be exported.
>
> As mentioned in my initial post, I'm happy to drop to C where alternatives 
> are infeasible, slow, or otherwise cumbersome, or just plain don't exist.
>
> Here's the code we use to determine boottime: 
> https://github.com/canonical/cloud-init/blob/5496745b394f9b7b9eaf57fd619330d484ce2da8/cloudinit/util.py#L2073-L2105

If I were designing it right now, I'd add code to ifconfig to exercise the
new feature, and do something like this:

# ifconfig my-interfacename drivername
igb
#

Unit could be done similarly if needed, or ifconfig could have an
operand that caused both driver and unit to be printed, maybe as
two words (hopefully no spaces in driver names!).

Mike

>
> Mina



Re: How to tell if a network interface was renamed (and from what)

2023-11-20 Thread Mike Karels
On 19 Nov 2023, at 15:35, Mina Galić wrote:

> Hi Zhenlei,
>
>
>> Since it is just for physical devices, may I propose to have the driver name 
>> in their groups ?
>>
>> So an if_ure interface ue0 will look like:
>>
>> ```
>> ue0: flags=1008843 metric 0 
>> mtu 1500
>>
>> options=60009b
>>
>> ether 00:e0:4c:xx:xx:xx
>> media: Ethernet autoselect (1000baseT )
>>
>> status: active
>> +++ groups: ure
>> nd6 options=23
>>
>> ```
>>
>> That does not include the unit number. But could be useful to quickly get 
>> the driver name of physical devices.
>>
>
> Given that currently on FreeBSD the easiest way to tell if something
> is a physical device is by checking the *absence* of groups, this
> would only really be acceptable if we add an "egress" group like
> OpenBSD does, in addition to the driver name.
>
> If we can't do that, then I think Mike's solution with having the
> driver (and unit) as a separate category would be preferable.

I have a proof of concept that makes the presumed original name
(driver name + unit number) available to ifconfig, which prints
the string with everything else in the standard output format.
I don't think that is the right solution, but the other details
should be easy.  I'm tempted to print the driver name and unit
number separately, although possibly as two words using the same
option.  It should probably be an option rather than a keyword,
so something like this:

# ifconfig -N interface-name
igb 1
#

Or the unit number could be on a separate option.

Comments?

Mike

> Unrelatedly, I don't see anything in ure(4) mentioning that if_ure
> devices will be named "ue".
> Don't we usually  document such deviation from the norm?
>
>
> Kind regards,
>
> Mina



Re: How to tell if a network interface was renamed (and from what)

2023-11-20 Thread Mike Karels
On 20 Nov 2023, at 14:56, Kristof Provost wrote:

> On 20 Nov 2023, at 21:29, Mike Karels wrote:
>> On 19 Nov 2023, at 15:35, Mina Galić wrote:
>>> Hi Zhenlei,
>>>
>>>
>>>> Since it is just for physical devices, may I propose to have the driver 
>>>> name in their groups ?
>>>>
>>>> So an if_ure interface ue0 will look like:
>>>>
>>>> ```
>>>> ue0: flags=1008843 metric 
>>>> 0 mtu 1500
>>>>
>>>> options=60009b
>>>>
>>>> ether 00:e0:4c:xx:xx:xx
>>>> media: Ethernet autoselect (1000baseT )
>>>>
>>>> status: active
>>>> +++ groups: ure
>>>> nd6 options=23
>>>>
>>>> ```
>>>>
>>>> That does not include the unit number. But could be useful to quickly get 
>>>> the driver name of physical devices.
>>>>
>>>
>>> Given that currently on FreeBSD the easiest way to tell if something
>>> is a physical device is by checking the *absence* of groups, this
>>> would only really be acceptable if we add an "egress" group like
>>> OpenBSD does, in addition to the driver name.
>>>
>>> If we can't do that, then I think Mike's solution with having the
>>> driver (and unit) as a separate category would be preferable.
>>
>> I have a proof of concept that makes the presumed original name
>> (driver name + unit number) available to ifconfig, which prints
>> the string with everything else in the standard output format.
>> I don't think that is the right solution,
>
> I believe a similar solution has been proposed before, and it failed to cope 
> with things like epair interfaces.

Hmm, epair certainly breaks the rules (well, conventions).  My current
code sees "epair0" for both halves, which may not be what is desired.
epair does some of this by hand, so could have more custom code.

> I’d look in the direction of just adding a field to struct ifnet with the 
> original interface name (likely easily done in if_attach()), along with a new 
> ioctl to retrieve that field.

That may be as good as we can do, although I'm working with netlink
rather than ioctls.

But see the next message...

Mike

> Best regards,
> Kristof



Re: How to tell if a network interface was renamed (and from what)

2023-11-20 Thread Mike Karels
On 20 Nov 2023, at 15:16, Franco Fichtner wrote:

>> On 20. Nov 2023, at 21:56, Kristof Provost  wrote:
>>
>> I’d look in the direction of just adding a field to struct ifnet with the 
>> original interface name (likely easily done in if_attach()), along with a 
>> new ioctl to retrieve that field.
>
> ifconfig_get_orig_name() already exists, but apart from wlandebug
> nothing is using it.

Thanks for pointing that out!  I hadn't noticed it.  I also hadn't thought
of that way to fetch the driver name and unit.

> The internally used IFDATA_DRIVERNAME also appears in ifinfo
> (not installed in base) and bsnmpd but that's it.
>
> if_dname is the target and it exists in ifnet struct along with
> a man page entry in ifnet(9).
>
> All that is really missing is a way to print it via ifconfig command.

That is trivial to add; I just tested it.  It also has problems with
epair.  Maybe that isn't an issue for this purpose.  I hate to invent
something new when there is something already existing that solves
most of the problem.

Mike

> Cheers,
> Franco



Re: How to tell if a network interface was renamed (and from what)

2023-11-21 Thread Mike Karels
On 21 Nov 2023, at 0:43, Franco Fichtner wrote:

>> On 20. Nov 2023, at 23:06, Mike Karels  wrote:
>>
>> On 20 Nov 2023, at 15:16, Franco Fichtner wrote:
>>
>>> All that is really missing is a way to print it via ifconfig command.
>>
>> That is trivial to add; I just tested it.  It also has problems with
>> epair.  Maybe that isn't an issue for this purpose.
>
> Two things to consider:
>
> Does epair do it the "right way"?  And does it even matter given that this
> behaviour hasn't had any exposure and is unlikely ever going to be used
> as input for another tool?

epair arguably does it wrong; it doesn't follow the convention of using the
driver name followed by an integer.  However, what it does is practical, and
clearly we don't want to change it now.  Using the driver name as returned
by ifconfig_get_orig_name displays "epair0" for two interfaces.  If they have
been renamed, the new names will hopefully be suggestive of which is which.

Mina, do you care about epair, or is the behavior I described sufficient
for your purposes?

Mike

> I mean it still tracks the origin in the driver.  This way you can even find
> the epair belonging together.  It looks like it should given the design
> choices.
>
>
> Cheers,
> Franco



Re: How to tell if a network interface was renamed (and from what)

2023-11-21 Thread Mike Karels
On 21 Nov 2023, at 12:16, Mina Galić wrote:

> Hi Mike,
>
>> Mina, do you care about epair, or is the behavior I described sufficient
>> for your purposes?
>
> I do deeply care about epair, but for me, ifinfo does the
> right thing for me:
>
> root@irc:~ # ifinfo | grep Interface
> Interface vnet0 (epair30):
> Interface lo0 (lo0):

That's not the whole story for epair.  For example, I get this:

Interface bhyve (vtnet0):
Interface lo0 (lo0):
Interface lo1 (lo1):
Interface foo0a (epair0):
Interface foo0b (epair0):

But if you only need the driver name, it will do.

A problem with ifinfo is that it is not normally installed.
I think it is worth adding a similar feature for the driver name
to ifconfig.

> for one. For the other. For the main purpose, figuring out
> in cloud-init what the driver of an interface is / if the
> interface has been renamed, this is more than sufficient.

Currently I have this for ifconfig -D:

bhyve: flags=1008843 metric 0 
mtu 1500
options=80028
ether 58:9c:fc:0b:0c:10
inet 10.0.3.1 netmask 0xff00 broadcast 10.0.3.255
inet6 fe80::5a9c:fcff:fe0b:c10%bhyve prefixlen 64 scopeid 0x1
inet6 2001:470:c202:3::1 prefixlen 64
media: Ethernet autoselect (10Gbase-T )
status: active
nd6 options=21
drivername: vtnet0

It's a little more work to parse, but I decided that it was useful
for humans, and works with -a.
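
For a script or tool that wants the driver and unit separately, the extra
parsing is small.  A minimal sketch, assuming the "drivername: vtnet0" line
format shown above (it comes from my work-in-progress code and could still
change); split_drivername() is a hypothetical helper, not part of ifconfig
or libifconfig.  It treats the trailing digits as the unit number, so a
driver whose own name ends in a digit would be split incorrectly.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Find "drivername: <name><unit>" in one line of output and split it.
 * On success stores the driver name in driver[] (NUL-terminated), the
 * unit number in *unit, and returns 0.  Returns -1 if the line has no
 * drivername field, no unit digits, or the name does not fit.
 */
static int
split_drivername(const char *line, char *driver, size_t driverlen, int *unit)
{
    static const char key[] = "drivername:";
    const char *p, *end, *digits;
    size_t n;

    assert(line != NULL && driver != NULL && unit != NULL);

    p = strstr(line, key);
    if (p == NULL)
        return (-1);
    p += sizeof(key) - 1;
    while (*p == ' ' || *p == '\t')
        p++;

    /* The value ends at the first whitespace or at end of string. */
    end = p;
    while (*end != '\0' && !isspace((unsigned char)*end))
        end++;

    /* Walk back over the trailing digits that form the unit number. */
    digits = end;
    while (digits > p && isdigit((unsigned char)digits[-1]))
        digits--;
    if (digits == p || digits == end)
        return (-1);    /* no driver part, or no unit part */

    n = (size_t)(digits - p);
    if (n >= driverlen)
        return (-1);
    memcpy(driver, p, n);
    driver[n] = '\0';
    *unit = atoi(digits);
    return (0);
}
```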

Mike
>
> Thank you very much,
>
> Mina



Re: How to tell if a network interface was renamed (and from what)

2023-11-21 Thread Mike Karels
On 21 Nov 2023, at 19:35, Jamie Landeg-Jones wrote:

> Mike Karels  wrote:
>
>> I have a proof of concept that makes the presumed original name
>> (driver name + unit number) available to ifconfig, which prints
>> the string with everything else in the standard output format.
>> I don't think that is the right solution, but the other details
>> should be easy.  I'm tempted to print the driver name and unit
>> number separately, although possibly as two words using the same
>> option.  It should probably be an option rather than a keyword,
>> so something like this:
>>
>> # ifconfig -N interface-name
>> igb 1
>> #
>>
>> Or the unit number could be on a separate option.
>>
>> Comments?
>
> I prefer that idea to the use of "groups" (which can be modified anyway),
> though in your example, that should be documented as "driver name" not
> "interface name", seeing that the returned value is not actually now
> the interface name!

Here, interface-name was a placeholder for the current name of the
interface.  My newer code prints "drivername: igb1" at the end of
the list of values for "ifconfig $interface-name" or "ifconfig -a"
if the -D option is given.  I decided that it was better to be able
to get all the translations at once (for humans, anyway).

Mike

> (Similarly, if a keyword is decided upon, it should be "driver-name"
> not "interface-name")
>
> I realise that writing "interface-name" was probably just muscle-memory,
> but just wanted clarification.
>
> No opinion on where to display the unit number - whatever works out
> better for you.
>
> Cheers, Jamie



Re: How to tell if a network interface was renamed (and from what)

2023-11-22 Thread Mike Karels
On 21 Nov 2023, at 19:57, Mike Karels wrote:

> ...  My newer code prints "drivername: igb1" at the end of
> the list of values for "ifconfig $interface-name" or "ifconfig -a"
> if the -D option is given.  I decided that it was better to be able
> to get all the translations at once (for humans, anyway).
>

See https://reviews.freebsd.org/D42721 for the ifconfig change.

Mike



Re: Request for Testing: TCP RACK

2024-03-18 Thread Mike Karels
On 18 Mar 2024, at 7:04, tue...@freebsd.org wrote:

>> On 18. Mar 2024, at 12:42, Nuno Teixeira  wrote:
>>
>> Hello all!
>>
>> It works just fine!
>> System performance is OK.
>> Using patch on main-n268841-b0aaf8beb126(-dirty).
>>
>> ---
>> net.inet.tcp.functions_available:
>> Stack   D AliasPCB count
>> freebsd   freebsd  0
>> rack* rack 38
>> ---
>>
>> It would be so nice that we can have a sysctl tunable for this patch
>> so we could do more tests without recompiling kernel.
> Thanks for testing!
>
> @gallatin: can you come up with a patch that is acceptable for Netflix
> and mitigates the performance regression?

Ideally, tcphpts could enable this automatically when it starts to be
used (enough?), but a sysctl could select auto/on/off.

Mike

> Best regards
> Michael
>>
>> Thanks all!
>> Really happy here :)
>>
>> Cheers,
>>
>> Nuno Teixeira wrote (Sunday, 17/03/2024 at 20:26):
>>>
>>> Hello,
>>>
>>>> I don't have the full context, but it seems like the complaint is a 
>>>> performance regression in bonnie++ and perhaps other things when tcp_hpts 
>>>> is loaded, even when it is not used.  Is that correct?
>>>>
>>>> If so, I suspect it's because we drive the tcp_hpts_softclock() routine 
>>>> from userret(), in order to avoid tons of timer interrupts and context 
>>>> switches.  To test this theory,  you could apply a patch like:
>>>
>>> It's affecting overall system performance, bonnie was just a way to
>>> get some numbers to compare.
>>>
>>> Tomorrow I will test patch.
>>>
>>> Thanks!
>>>
>>> --
>>> Nuno Teixeira
>>> FreeBSD Committer (ports)
>>
>>
>>
>> -- 
>> Nuno Teixeira
>> FreeBSD Committer (ports)



Re: Source IPv4 address selection vs BGP IX connection

2024-04-24 Thread mike tancsa

On 4/23/2024 10:12 PM, Gregory Shapiro wrote:

Short version:

Using FreeBSD as a BGP router has network issues caused by suboptimal
default IPv4 source address selection when connected to Internet
Exchanges (which are required to use IPs that aren't routable on the
Internet).  I was hoping to find more elegant workarounds or encourage
FreeBSD to add source IPv4 selection akin to the existing IPv6 source
address selection (no_prefer_iface and prefer_source).

I assume that there is a group of BGP enthusiasts using FreeBSD lurking
on freebsd-net.  What have you done to solve this problem?

For DNS in such situations I start unbound locally and bind it to an 
internal interface or an IP on lo0 and then tell unbound to just use 
that IP only  (outgoing-interface IIRC) that is advertised out as a work 
around.  It's not a proper solution, but will get your resolver working 
at least. I run into this problem in layered networks where the next hop 
is often RFC 1918 addrs. I bind applications to internal NICs that have 
addresses that have routing to/from.


    ---Mike


Re: Source IPv4 address selection vs BGP IX connection

2024-04-26 Thread Mike Karels
On 25 Apr 2024, at 15:56, Gregory Shapiro wrote:

>> of course, gethostid(3) is now deprecated in favour of sysctl(3), and the
>> hostid(8) command is gone, and there's now more than one flavour of
>> Internet-capable UNIX in the world, and there's more than one Internet
>> address family now. so what i did in 1990 is a guide only inasmuch as some
>> way should exist to change the default local address of a socket so that it
>> isn't the address of the interface used for the destination. if that happens
>> i hope we coordinate with Linux and with the other BSD's.
>
> Linux already has a model to give a hint for source address selection via
> route table "hints".  When adding routes (either manually via `ip route'
> or via things like bird2 BGP daemon), Linux supports setting a source IP
> for when that route is used.
>
> Interestingly, JunOS (which I believe is based on FreeBSD) also supports
> a way to specify a default IPv4 source address, preferring the primary address
> on lo0 that is not 127.0.0.1.  It is a common practice for BGP systems to
> attach their announced IPs to the loopback interface.
>
> https://www.juniper.net/documentation/us/en/software/junos/cli-reference/topics/ref/statement/default-address-selection-edit-system.html
>
> For the Linux and bird (BGP) documentation:
>
> Linux
> -
> http://linux-ip.net/html/tools-ip-route.html#ex-tools-ip-route-add-src
>
> "The src option provides a hint to the kernel for source address selection. 
> When you are working with multiple routing tables and different classes of 
> traffic, you can ease your administrative burden, by hosting several 
> different IPs on your linux machine and setting the source address 
> differently, depending on the type of traffic.
>
> In the example below, let's assume that our masquerading host also runs a DNS 
> resolver for the internal network and we have selected all of the outbound 
> DNS packets to be routed according to table 7 [53]. Now, any packet which 
> originates on this box (or is masqueraded through this table) will have its 
> source IP set to 205.254.211.198.
>
> Example D.19. Using src in a routing command with route add
>
> [root@masq-gw]# ip route add default via 205.254.211.254 src 205.254.211.198 
> table 7
> "
>
> man ip-route
>
> "src ADDRESS
>   the source address to prefer when sending to the
>   destinations covered by the route prefix."

When you first asked this question, my first thought was that this should
be in the routing table.  It seems to me that choosing the source address
is more a function of the destination than of the process (vnet, jail,
etc).  In fact, this problem seemed familiar, so I went looking.  It turns
out that this feature has been available since 4.4BSD.

route(8) has a keyword to do just this, -ifa (interface address).  It only
seems to work when the alias is on the same interface.  It also seems to
be broken in -current and 14.0, but I got it to work with 13.3 and 12.4.
While experimenting, I tried to use -ifp as well, but it seems to be ignored;
route add -ifp foobar ... does not fail.  (12.4 got the interface wrong
when the alias was on the loopback.)

Anyone know why -ifa is ineffective in 14.0 and -current?  It could
be fallout from netlink.

The documentation is weak at best; route(8) says only "the -ifp or -ifa
modifiers may be used to determine the interface or interface address".
"route get" does not display the ifa; I think it did at one time.

I'll also note that binding the desired source address manually works;
ping -S uses this.
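
The manual approach is just a bind() to the wanted local address before the
socket is used.  A minimal sketch for an IPv4 UDP socket; udp_socket_from()
is a hypothetical helper name, and the address passed in must already be
configured on some local interface (for example an alias on lo0).

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <assert.h>
#include <string.h>
#include <unistd.h>

/*
 * Create a UDP socket whose source address is fixed to src_ip instead
 * of being chosen from the outgoing interface.  Returns the descriptor,
 * or -1 if the string is not an IPv4 address or bind() fails.
 */
static int
udp_socket_from(const char *src_ip)
{
    struct sockaddr_in sin;
    int fd;

    assert(src_ip != NULL);

    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_port = htons(0);    /* let the kernel pick the port */
    if (inet_pton(AF_INET, src_ip, &sin.sin_addr) != 1)
        return (-1);

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd == -1)
        return (-1);
    if (bind(fd, (struct sockaddr *)&sin, sizeof(sin)) == -1) {
        close(fd);
        return (-1);
    }
    return (fd);
}
```

Packets sent on the returned socket carry src_ip as their source no matter
which interface the route points at; the cost is that every application has
to be told the address, which is why a routing-table hint is more attractive.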

Mike

>
> Bird (BGP Daemon)
> 
> "The Kernel protocol defines several attributes. These attributes are 
> translated to appropriate system (and OS-specific) route attributes. We 
> support these attributes:
> ..
> ip krt_prefsrc
> (Linux) The preferred source address. Used in source address selection for 
> outgoing packets. Has to be one of the IP addresses of the router."



Re: Question about netinet6/in6.h

2024-04-26 Thread Mike Karels
On 26 Apr 2024, at 15:01, Warner Losh wrote:

> This has to be a FAQ
>
> I'm porting a program from Linux, I often see an error like:
> ./test/mock-ifaddrs.c:95:19: error: no member named 's6_addr32' in 'struct
> in6_addr'
>95 | ipv6->sin6_addr.s6_addr32[3] = 0;
>   | ~~~ ^
> but yet, we kinda define them, but only for the kernel and boot loader:
> /*
>  * IPv6 address
>  */
> struct in6_addr {
>     union {
>         uint8_t  __u6_addr8[16];
>         uint16_t __u6_addr16[8];
>         uint32_t __u6_addr32[4];
>     } __u6_addr;    /* 128-bit IP6 address */
> };
>
> #define s6_addr   __u6_addr.__u6_addr8
> #if defined(_KERNEL) || defined(_STANDALONE) /* XXX nonstandard */
> #define s6_addr8  __u6_addr.__u6_addr8
> #define s6_addr16 __u6_addr.__u6_addr16
> #define s6_addr32 __u6_addr.__u6_addr32
> #endif
>
> I'm wondering if anybody knows why it's like that? git blame suggests we imported
> that from kame, with
> only tweaks by people that are now deceased*.*
>
> Why not just expose them?

Looks like only s6_addr is specified in the RFCs (2553 and 3493).  Oddly,
though, the RFCs give an example implementation using that union with
different element names (like _S6_u8), and show the one #define.
Similarly, POSIX specifies only s6_addr, but it allows other members
of the structure, so I don't see a problem with exposing them all even
in a POSIX environment.

I would have no objection to exposing all four definitions, especially
if Linux apps use them.
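
In the meantime, code being ported can avoid s6_addr32 by going through the
one member that is specified, s6_addr.  A minimal sketch; in6_set_word32()
and in6_get_word32() are hypothetical helper names, and memcpy() is used so
there are no alignment or aliasing concerns.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Store a 32-bit value (already in network byte order) into word idx
 * (0..3) of an IPv6 address, using only the standard s6_addr bytes.
 */
static void
in6_set_word32(struct in6_addr *addr, int idx, uint32_t value_net)
{
    assert(addr != NULL && idx >= 0 && idx < 4);
    memcpy(&addr->s6_addr[idx * 4], &value_net, sizeof(value_net));
}

/* Fetch word idx (0..3) of an IPv6 address, in network byte order. */
static uint32_t
in6_get_word32(const struct in6_addr *addr, int idx)
{
    uint32_t value_net;

    assert(addr != NULL && idx >= 0 && idx < 4);
    memcpy(&value_net, &addr->s6_addr[idx * 4], sizeof(value_net));
    return (value_net);
}
```

The failing line from the example, ipv6->sin6_addr.s6_addr32[3] = 0, then
becomes in6_set_word32(&ipv6->sin6_addr, 3, 0).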

Mike



Re: Question about netinet6/in6.h

2024-04-26 Thread Mike Karels
On 26 Apr 2024, at 15:49, Mike Karels wrote:

> On 26 Apr 2024, at 15:01, Warner Losh wrote:
>
>> This has to be a FAQ
>>
>> I'm porting a program from Linux, I often see an error like:
>> ./test/mock-ifaddrs.c:95:19: error: no member named 's6_addr32' in 'struct
>> in6_addr'
>>95 | ipv6->sin6_addr.s6_addr32[3] = 0;
>>   | ~~~ ^
>> but yet, we kinda define them, but only for the kernel and boot loader:
>> /*
>>  * IPv6 address
>>  */
>> struct in6_addr {
>>     union {
>>         uint8_t  __u6_addr8[16];
>>         uint16_t __u6_addr16[8];
>>         uint32_t __u6_addr32[4];
>>     } __u6_addr;    /* 128-bit IP6 address */
>> };
>>
>> #define s6_addr   __u6_addr.__u6_addr8
>> #if defined(_KERNEL) || defined(_STANDALONE) /* XXX nonstandard */
>> #define s6_addr8  __u6_addr.__u6_addr8
>> #define s6_addr16 __u6_addr.__u6_addr16
>> #define s6_addr32 __u6_addr.__u6_addr32
>> #endif
>>
>> I'm wondering if anybody knows why it's like that? git blame suggests we imported
>> that from kame, with
>> only tweaks by people that are now deceased*.*
>>
>> Why not just expose them?
>
> Looks like only s6_addr is specified in the RFCs (2553 and 3493).  Oddly,
> though, the RFCs give an example implementation using that union with
> different element names (like _S6_u8), and show the one #define.
> Similarly, POSIX specifies only s6_addr, but it allows other members
> of the structure, so I don't see a problem with exposing them all even
> in a POSIX environment.
>
> I would have no objection to exposing all four definitions, especially
> if Linux apps use them.

I put the change, along with an explanatory comment, in
https://reviews.freebsd.org/D44979.  Comments welcome.

Mike



Re: Question about netinet6/in6.h

2024-04-26 Thread Mike Karels
On 26 Apr 2024, at 18:06, Warner Losh wrote:

> On Fri, Apr 26, 2024 at 4:21 PM Mike Karels  wrote:
>
>> On 26 Apr 2024, at 15:49, Mike Karels wrote:
>>
>>> On 26 Apr 2024, at 15:01, Warner Losh wrote:
>>>
>>>> This has to be a FAQ
>>>>
>>>> I'm porting a program from Linux, I often see an error like:
>>>> ./test/mock-ifaddrs.c:95:19: error: no member named 's6_addr32' in
>> 'struct
>>>> in6_addr'
>>>>95 | ipv6->sin6_addr.s6_addr32[3] = 0;
>>>>   | ~~~ ^
>>>> but yet, we kinda define them, but only for the kernel and boot loader:
>>>> /*
>>>>  * IPv6 address
>>>>  */
>>>> struct in6_addr {
>>>>     union {
>>>>         uint8_t  __u6_addr8[16];
>>>>         uint16_t __u6_addr16[8];
>>>>         uint32_t __u6_addr32[4];
>>>>     } __u6_addr;    /* 128-bit IP6 address */
>>>> };
>>>>
>>>> #define s6_addr   __u6_addr.__u6_addr8
>>>> #if defined(_KERNEL) || defined(_STANDALONE) /* XXX nonstandard */
>>>> #define s6_addr8  __u6_addr.__u6_addr8
>>>> #define s6_addr16 __u6_addr.__u6_addr16
>>>> #define s6_addr32 __u6_addr.__u6_addr32
>>>> #endif
>>>>
>>>> I'm wondering if anybody knows why it's like that? git blame suggests we
>> imported
>>>> that from kame, with
>>>> only tweaks by people that are now deceased*.*
>>>>
>>>> Why not just expose them?
>>>
>>> Looks like only s6_addr is specified in the RFCs (2553 and 3493).  Oddly,
>>> though, the RFCs give an example implementation using that union with
>>> different element names (like _S6_u8), and show the one #define.
>>> Similarly, POSIX specifies only s6_addr, but it allows other members
>>> of the structure, so I don't see a problem with exposing them all even
>>> in a POSIX environment.
>>>
>>> I would have no objection to exposing all four definitions, especially
>>> if Linux apps use them.
>>
>> I put the change, along with an explanatory comment, in
>> https://reviews.freebsd.org/D44979.  Comments welcome.
>>
>
> Thanks! I was testing a similar change, but I like yours better... though
> maybe
> we should just make it visible when __BSD_VISIBLE is true... I'll have to
> look
> closely at what Linux does here... I think they have it always visible, or
> at least
> musl does that (glibc is harder to track down due to the many layers of
> indirection).

I thought briefly about __BSD_VISIBLE, but wasn't sure it was necessary.
Let me know what you find out.  I think it should work either way; in.h
includes cdefs.h, so it's guaranteed to have been included.

Mike



Re: Question about netinet6/in6.h

2024-04-27 Thread Mike Karels
On 26 Apr 2024, at 23:02, Bakul Shah wrote:

> On Apr 26, 2024, at 8:41 PM, Warner Losh  wrote:
>>
>>
>>
>> On Fri, Apr 26, 2024, 9:33 PM Bakul Shah  wrote:
>>
>>
>>> On Apr 26, 2024, at 5:02 PM, Mike Karels  wrote:
>>>
>>> On 26 Apr 2024, at 18:06, Warner Losh wrote:
>>>
>>>> On Fri, Apr 26, 2024 at 4:21 PM Mike Karels  wrote:
>>>>
>>>>> On 26 Apr 2024, at 15:49, Mike Karels wrote:
>>>>>
>>>>>> On 26 Apr 2024, at 15:01, Warner Losh wrote:
>>>>>>
>>>>>>> This has to be a FAQ
>>>>>>>
>>>>>>> I'm porting a program from Linux, I often see an error like:
>>>>>>> ./test/mock-ifaddrs.c:95:19: error: no member named 's6_addr32' in
>>>>> 'struct
>>>>>>> in6_addr'
>>>>>>>   95 | ipv6->sin6_addr.s6_addr32[3] = 0;
>>>>>>>  | ~~~ ^
>>>>>>> but yet, we kinda define them, but only for the kernel and boot loader:
>>>>>>> /*
>>>>>>> * IPv6 address
>>>>>>> */
>>>>>>> struct in6_addr {
>>>>>>>     union {
>>>>>>>         uint8_t  __u6_addr8[16];
>>>>>>>         uint16_t __u6_addr16[8];
>>>>>>>         uint32_t __u6_addr32[4];
>>>>>>>     } __u6_addr;    /* 128-bit IP6 address */
>>>>>>> };
>>>>>>>
>>>>>>> #define s6_addr   __u6_addr.__u6_addr8
>>>>>>> #if defined(_KERNEL) || defined(_STANDALONE) /* XXX nonstandard */
>>>>>>> #define s6_addr8  __u6_addr.__u6_addr8
>>>>>>> #define s6_addr16 __u6_addr.__u6_addr16
>>>>>>> #define s6_addr32 __u6_addr.__u6_addr32
>>>>>>> #endif
>>>>>>>
>>>>>>> I'm wondering if anybody knows why it's like that? git blame suggests
>>>>>>> we imported that from KAME, with only tweaks by people that are now
>>>>>>> deceased.
>>>>>>>
>>>>>>> Why not just expose them?
>>>>>>
>>>>>> Looks like only s6_addr is specified in the RFCs (2553 and 3493).  Oddly,
>>>>>> though, the RFCs give an example implementation using that union with
>>>>>> different element names (like _S6_u8), and show the one #define.
>>>>>> Similarly, POSIX specifies only s6_addr, but it allows other members
>>>>>> of the structure, so I don't see a problem with exposing them all even
>>>>>> in a POSIX environment.
>>>>>>
>>>>>> I would have no objection to exposing all four definitions, especially
>>>>>> if Linux apps use them.
>>>>>
>>>>> I put the change, along with an explanatory comment, in
>>>>> https://reviews.freebsd.org/D44979.  Comments welcome.
>>>>>
>>>>
>>>> Thanks! I was testing a similar change, but I like yours better... though
>>>> maybe we should just make it visible when __BSD_VISIBLE is true. I'll have
>>>> to look closely at what Linux does here... I think they have it always
>>>> visible, or at least musl does that (glibc is harder to track down due to
>>>> the many layers of indirection).
>>>
>>> I thought briefly about __BSD_VISIBLE, but wasn't sure it was necessary.
>>> Let me know what you find out.  I think it should work either way; in.h
>>> includes cdefs.h, so it's guaranteed to have been included.
>>
>> If the -fms-extensions option is used with gcc or clang, this ugliness can
>> go away, as you can have nested anonymous unions or structs and their fields
>> can be referenced as if they're directly in the parent struct/union.
>>
>> [IIRC this was present in Plan9 C from very early on. Also in C11 or later]
>>
>> True. In fact C11 and newer doesn't need anything on the command line here.
>> If it were only in the kernel then I'd change it like that while I was
>> here... but lots of code in ports will specify C99 + POSIX 2001, and to
>> compile there your only hope is this construct.
>
> Such defines were typically within #if defined(KERNEL) .. #endif
> so non-kld ports shouldn't be referring to them, right?!

I don't know if that is typical, but in this case the point is to make it
visible to user level.  We don't expect base/ports to do that currently,
but imported programs will.

Mike
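
For code that has to build whether or not s6_addr32 is exposed, one option
is to go through s6_addr, the only member the RFCs and POSIX guarantee. The
following is a minimal sketch under that assumption; the helper names
in6_word_get() and in6_word_set() are hypothetical and come neither from
FreeBSD nor from the review mentioned above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <netinet/in.h>

/*
 * Hypothetical helpers: access 32-bit word idx (0..3) of an IPv6 address
 * using only the standard s6_addr byte array.  Words are copied as stored,
 * i.e. in network byte order; memcpy avoids alignment and aliasing issues.
 */
static uint32_t
in6_word_get(const struct in6_addr *a, unsigned idx)
{
	uint32_t w;

	assert(idx < 4);
	memcpy(&w, &a->s6_addr[idx * 4], sizeof(w));
	return (w);
}

static void
in6_word_set(struct in6_addr *a, unsigned idx, uint32_t w)
{
	assert(idx < 4);
	memcpy(&a->s6_addr[idx * 4], &w, sizeof(w));
}
```

With these, the failing line from the port would read something like
in6_word_set(&ipv6->sin6_addr, 3, 0), at the cost of touching the imported
source, which is exactly what exposing the macros avoids.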



networking in 14.1 release notes

2024-05-18 Thread Mike Karels
I have no networking changes at all in the 14.1 release notes.  Is there
anything that should be mentioned?  Feel free to reply to me individually.

Thanks,
Mike



Re: networking in 14.1 release notes

2024-05-19 Thread mike tancsa

On 5/18/2024 10:49 AM, Mike Karels wrote:
> I have no networking changes at all in the 14.1 release notes.  Is there
> anything that should be mentioned?  Feel free to reply to me individually.

Not sure if appropriate or not, but when going from 13.x to 14.x, not all
vlan configs work now in rc.conf


Both

ifconfig_vlan2="192.168.1.51/24 vlandev igb1 vlan 2"
ifconfig_vlan2="192.168.1.51/24 vlan 2 vlandev igb1"

used to work on RELENG_13

now only

ifconfig_vlan2="192.168.1.51/24  vlan 2 vlandev igb1"

is allowed.  Maybe a heads up in UPDATING ?

    ---Mike





Re: networking in 14.1 release notes

2024-05-19 Thread Mike Karels
On 19 May 2024, at 18:29, mike tancsa wrote:

> On 5/18/2024 10:49 AM, Mike Karels wrote:
>> I have no networking changes at all in the 14.1 release notes.  Is there
>> anything that should be mentioned?  Feel free to reply to me individually.
>>
> Not sure if appropriate or not, but when going from 13.x to 14.x, not all
> vlan configs work now in rc.conf
>
> Both
>
> ifconfig_vlan2="192.168.1.51/24 vlandev igb1 vlan 2"
> ifconfig_vlan2="192.168.1.51/24 vlan 2 vlandev igb1"
>
> used to work on RELENG_13
>
> now only
>
> ifconfig_vlan2="192.168.1.51/24  vlan 2 vlandev igb1"
>
> is allowed.  Maybe a heads up in UPDATING ?

That sounds like an outright bug.  Looks like it was true in 14.0 as well.
Is there a bug report?  I couldn't find one.

btw, UPDATING is meant for upgrades from source.

Mike



Re: networking in 14.1 release notes

2024-05-20 Thread mike tancsa

On 5/19/2024 8:59 PM, Mike Karels wrote:
> On 19 May 2024, at 18:29, mike tancsa wrote:
>
>> On 5/18/2024 10:49 AM, Mike Karels wrote:
>>> I have no networking changes at all in the 14.1 release notes.  Is there
>>> anything that should be mentioned?  Feel free to reply to me individually.
>>
>> Not sure if appropriate or not, but when going from 13.x to 14.x, not all
>> vlan configs work now in rc.conf
>>
>> Both
>>
>> ifconfig_vlan2="192.168.1.51/24 vlandev igb1 vlan 2"
>> ifconfig_vlan2="192.168.1.51/24 vlan 2 vlandev igb1"
>>
>> used to work on RELENG_13
>>
>> now only
>>
>> ifconfig_vlan2="192.168.1.51/24  vlan 2 vlandev igb1"
>>
>> is allowed.  Maybe a heads up in UPDATING ?
>
> That sounds like an outright bug.  Looks like it was true in 14.0 as well.
> Is there a bug report?  I couldn't find one.

I didn't open one. I wasn't sure if the change was a deliberate one or on
the wrong side of POLA. To me it feels unnecessary to have only one
order of params, but I might be missing the rationale behind it. Shall I
open a PR ?

    --Mike

> btw, UPDATING is meant for upgrades from source.
>
> Mike





Re: networking in 14.1 release notes

2024-05-20 Thread Mike Karels
On 20 May 2024, at 10:15, mike tancsa wrote:

> On 5/19/2024 8:59 PM, Mike Karels wrote:
>> On 19 May 2024, at 18:29, mike tancsa wrote:
>>
>>> On 5/18/2024 10:49 AM, Mike Karels wrote:
>>>> I have no networking changes at all in the 14.1 release notes.  Is there
>>>> anything that should be mentioned?  Feel free to reply to me individually.
>>>>
>>> Not sure if appropriate or not, but when going from 13.x to 14.x, not all
>>> vlan configs work now in rc.conf
>>>
>>> Both
>>>
>>> ifconfig_vlan2="192.168.1.51/24 vlandev igb1 vlan 2"
>>> ifconfig_vlan2="192.168.1.51/24 vlan 2 vlandev igb1"
>>>
>>> used to work on RELENG_13
>>>
>>> now only
>>>
>>> ifconfig_vlan2="192.168.1.51/24  vlan 2 vlandev igb1"
>>>
>>> is allowed.  Maybe a heads up in UPDATING ?
>> That sounds like an outright bug.  Looks like it was true in 14.0 as well.
>> Is there a bug report?  I couldn't find one.
>
> I didn't open one. I wasn't sure if the change was a deliberate one or on the
> wrong side of POLA. To me it feels unnecessary to have only one order of
> params, but I might be missing the rationale behind it. Shall I open a PR ?

Yes, please.  I looked into it yesterday; it looks accidental.  I'll follow up.

Mike
>
>> btw, UPDATING is meant for upgrades from source.
>>
>>  Mike
>>



Re: networking in 14.1 release notes

2024-05-20 Thread mike tancsa

On 5/20/2024 11:54 AM, Mike Karels wrote:
>>> That sounds like an outright bug. Looks like it was true in 14.0 as well.
>>> Is there a bug report?  I couldn't find one.
>>
>> I didn't open one. I wasn't sure if the change was a deliberate one or on
>> the wrong side of POLA. To me it feels unnecessary to have only one order
>> of params, but I might be missing the rationale behind it. Shall I open a PR ?
>
> Yes, please.  I looked into it yesterday; it looks accidental.  I'll follow up.


Thanks Mike!  PR opened 
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=279181


    ---Mike



dropping udp fragments with ipfw

2024-08-29 Thread mike tancsa
I was working on some firewall rules to drop large UDP fragment attacks 
and noticed there is no easy way to drop fragments based on port ? e.g. 
if someone sends a UDP packet of 1400 bytes, I can drop it with


TARGET=192.168.1.1

ipfw add 5 deny log udp from any 53 to $TARGET

But if that packet is say 2000 bytes and is fragmented, the fragment 
passes through. I have to add a subsequent rule


ipfw add 10 deny log udp from any to $TARGET fragment

But this would kill all UDP fragments.  If the host has some other UDP 
application that needs to deal with fragmented packets, is there a way 
to get around that and only drop packets with a certain port in the 
first fragment ?


    ---Mike




Re: dropping udp fragments with ipfw

2024-08-29 Thread mike tancsa

On 8/29/2024 3:45 PM, Olivier Cochard-Labbé wrote:
> On Thu, Aug 29, 2024 at 8:52 PM mike tancsa  wrote:
>
>> But this would kill all UDP fragments.  If the host has some other UDP
>> application that needs to deal with fragmented packets, is there a way
>> to get around that and only drop packets with a certain port in the
>> first fragment ?
>
> When a packet is fragmented, only the IP header (not the UDP header
> that includes the port number) is copied for all subsequent fragmented
> packets.
> To fix this behavior, you can instruct the firewall to reassemble the
> packet before performing UDP/TCP port filtering.
> Refer to the ipfw(8) man page on the "reass" keyword, which provides
> the following example:
>
> ipfw add reass all from any to any in
>
> I hope this helps!



Thanks very much, it does!  Under DDoS attack, how "expensive" would
this be?  I noticed there are some default queue limits that probably
would be exhausted fairly quickly.  For this use case I might look
instead at the Chelsio NIC rules (via cxgbetool) and just drop with
something like this


cxgbetool t5nex0 filter 10 sip 0.0.0.0/0 sport 53 dip 192.168.1.1/32 proto 17 action drop
cxgbetool t5nex0 filter 11 sip 0.0.0.0/0 dip 192.168.1.1/32 proto 17 frag 1 action drop


to protect the customer downstream and then get rid of rule 11 once the 
pps rate drops back to normal.
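
The reason a port match cannot see the later fragments can be sketched in
a few lines of C. This is an illustration only, not ipfw or cxgbe code;
the function name udp_sport_visible() and the constant are made up here,
and it assumes a raw IPv4 packet starting at the IP header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAG_OFFSET_MASK	0x1fffu	/* low 13 bits: offset in 8-byte units */

/*
 * Hypothetical helper: return 1 and store the UDP source port in *sport
 * if this IPv4 packet carries a UDP header, 0 otherwise.  Only a fragment
 * with offset 0 (or an unfragmented packet) starts with the UDP header;
 * in later fragments the bytes after the IP header are payload.
 */
static int
udp_sport_visible(const uint8_t *pkt, size_t len, uint16_t *sport)
{
	size_t hlen;
	uint16_t off;

	assert(pkt != NULL && sport != NULL);
	if (len < 20 || (pkt[0] >> 4) != 4)
		return (0);			/* not IPv4 */
	hlen = (size_t)(pkt[0] & 0x0f) * 4;	/* IP header length */
	if (hlen < 20 || len < hlen + 8 || pkt[9] != 17)
		return (0);			/* too short or not UDP */
	off = (uint16_t)((pkt[6] << 8) | pkt[7]);
	if ((off & FRAG_OFFSET_MASK) != 0)
		return (0);			/* later fragment: no ports */
	*sport = (uint16_t)((pkt[hlen] << 8) | pkt[hlen + 1]);
	return (1);
}
```

A stateless filter can therefore only match the port on the first
fragment; dropping the rest needs either a blanket fragment rule, as
above, or reassembly before the port rules are evaluated.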


    ---Mike


Ethernet device with shared mdio

2024-09-06 Thread Mike Belanger
The following device tree specifies a shared mdio.
The ffec driver uses miibus.
When there is a shared mdio, one of the device instances will not be able to
properly configure the PHY, as it needs to use the other device's resources to
read/write the PHY.

&fec1 {
	pinctrl-names = "default";
	pinctrl-0 = <&pinctrl_fec1>;
	phy-mode = "rgmii-id";
	phy-handle = <&ethphy0>;
	fsl,magic-packet;
	status = "okay";

	mdio {
		#address-cells = <1>;
		#size-cells = <0>;

		ethphy0: ethernet-phy@0 {
			compatible = "ethernet-phy-ieee802.3-c22";
			reg = <0>;
		};

		ethphy1: ethernet-phy@1 {
			compatible = "ethernet-phy-ieee802.3-c22";
			reg = <1>;
		};
	};
};

&fec2 {
	pinctrl-names = "default";
	pinctrl-0 = <&pinctrl_fec2>;
	phy-mode = "rgmii-txid";
	phy-handle = <&ethphy1>;
	phy-supply = <&reg_fec2_supply>;
	nvmem-cells = <&fec_mac1>;
	nvmem-cell-names = "mac-address";
	rx-internal-delay-ps = <2000>;
	fsl,magic-packet;
	status = "okay";
};

Does FreeBSD have any plans for supporting hardware that specifies a shared 
mdio in the dtb?
Just knowing the general approach being considered would be helpful.



Re: Ethernet device with shared mdio

2024-09-13 Thread Mike Belanger
Thank you for the response and for sharing your scenario.

We’ve also hacked up the cgem and the ffec driver to support a shared mdio.
That was not too difficult, but we have a new scenario where the mdio is now 
being shared between two different devices that use different drivers (ffec and 
eqos).
This presents a few extra challenges.

I was hoping that FreeBSD may have considered supporting a shared mdio.  We can 
come up with something, but if there is an existing architecture/approach in 
the works…we would like to use a consistent approach.  At first glance, 
miiproxy did not seem like a fit.

I do not have the hardware.  I am trying to help somebody else with this.  I 
have seen the dtb.
It’s a Variscite DAR-MX8M-PLUS.

Regards,
Mike.

From: owner-freebsd-...@freebsd.org  on behalf 
of Milan Obuch 
Date: Friday, September 13, 2024 at 3:08 AM
To: freebsd-net@freebsd.org 
Subject: Re: Ethernet device with shared mdio

On Fri, 6 Sep 2024 18:03:39 +
Mike Belanger  wrote:

> The following device tree specifies a shared mdio.
> The ffec driver uses miibus.
> When there is a shared mdio, one of the device instances will not be
> able to properly configure the PHY, as it needs to use the other
> devices resource to read/write the PHY.
>
> &fec1 {
>     pinctrl-names = "default";
>     pinctrl-0 = <&pinctrl_fec1>;
>     phy-mode = "rgmii-id";
>     phy-handle = <&ethphy0>;
>     fsl,magic-packet;
>     status = "okay";
>
>     mdio {
>         #address-cells = <1>;
>         #size-cells = <0>;
>
>         ethphy0: ethernet-phy@0 {
>             compatible = "ethernet-phy-ieee802.3-c22";
>             reg = <0>;
>         };
>
>         ethphy1: ethernet-phy@1 {
>             compatible = "ethernet-phy-ieee802.3-c22";
>             reg = <1>;
>         };
>     };
> };
>
> &fec2 {
>     pinctrl-names = "default";
>     pinctrl-0 = <&pinctrl_fec2>;
>     phy-mode = "rgmii-txid";
>     phy-handle = <&ethphy1>;
>     phy-supply = <&reg_fec2_supply>;
>     nvmem-cells = <&fec_mac1>;
>     nvmem-cell-names = "mac-address";
>     rx-internal-delay-ps = <2000>;
>     fsl,magic-packet;
>     status = "okay";
> };
>
> Does FreeBSD have any plans for supporting hardware that specifies a
> shared mdio in the dtb? Just knowing the general approach being
> considered would be helpful.
>

I can't speak for FreeBSD project, I just can share my experience with
similar case. It is described in my post to hackers mailing list (see
https://lists.freebsd.org/archives/freebsd-hackers/2021-December/000649.html
for details), unfortunately, no response received. Another attempt to
get some attention a week later on net mailing list was done, see
https://lists.freebsd.org/archives/freebsd-net/2021-December/001114.html
for the post, with no response either.

As you see, my case was similar, just the mdio block was attached to
second controller. This makes it a bit more problematic - you can't use
mdio controller before being initialized, naturally.

I was not able to use miiproxy approach as noted in my post to hackers
mailing list, additionally, miiproxy was removed from the tree with
MIPS arch some time later. I resolved the issue by modifying cgem driver
and mii layer. This was just a proof of concept with some hacks, but I
was able to use both ports with proper link state change detection. I
did not continue the work because vendor changed hardware design and
there was no shared mdio anymore.

If you are interested I can dig for the sources, big part of my changes
would not be necessary, just the idea of decoupling MDIO and MII
interfaces still applies, I think. By the way, which board are you
working on? Is it accessible for general audience?

Regards,
Milan

--

Re: vlan with modified MAC fails to communicate

2013-03-30 Thread Mike Karels
> As for if_vlan.c, I verified that in the case when the NIC's MAC address is
> modified, it updates the values in the vlan to keep them in sync. However,
> I don't see this behavior when the changes are performed over the vlan.

There is no existing driver API to add MAC addresses in FreeBSD, which is
what would be required to support different MAC addresses for different
VLANs.  I have added such an API @work (McAfee, in our firewall clusters),
but it is limited to a small number of drivers and exactly one additional
MAC in the current implementation.  A more general implementation would
support varying numbers of MACs per NIC before dropping into promiscuous
mode.

> From what I see, it looks like this behavior from the FreeBSD side is expected
> and the changes should be incorporated into my NIC.

I'm not sure what you mean, but there is no existing code to propagate
a MAC change on a VLAN to its parent device.  I think it is a bug that
a change appears to work.

> Setting the NIC to promisc mode whenever both MAC addresses are not equal
> looks like a good workaround; however, working out some improvement in the
> packet filtering method looks more like a fix to me. What holds me back is
> the inherent loss of performance in promisc mode, but I need to see if I'm
> able to live with this overhead :)

This may not be so bad on a switched network.  Current drivers give you
all multicasts as well as all unicasts in promiscuous mode, but you really
don't need all multicasts in this case.

Mike
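
The "more general implementation" described above can be pictured as a
small table of unicast filters with a promiscuous fallback. The sketch
below is purely hypothetical: struct mac_filter, MAC_SLOTS and both
functions are invented for illustration and do not correspond to any
existing FreeBSD driver API or to the McAfee implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN		6
#define MAC_SLOTS	4	/* hypothetical number of hardware filters */

struct mac_filter {
	uint8_t		addr[MAC_SLOTS][MAC_LEN];
	unsigned	count;		/* filters in use */
	int		promisc;	/* set once the table overflows */
};

/* Add a unicast address; fall back to promiscuous mode when full. */
static void
mac_filter_add(struct mac_filter *f, const uint8_t mac[MAC_LEN])
{
	assert(f != NULL && mac != NULL);
	if (f->count < MAC_SLOTS)
		memcpy(f->addr[f->count++], mac, MAC_LEN);
	else
		f->promisc = 1;
}

/* Model of what the NIC would pass up for a given destination address. */
static int
mac_filter_accept(const struct mac_filter *f, const uint8_t dst[MAC_LEN])
{
	unsigned i;

	if (f->promisc)
		return (1);
	for (i = 0; i < f->count; i++)
		if (memcmp(f->addr[i], dst, MAC_LEN) == 0)
			return (1);
	return (0);
}
```

In this model each VLAN with its own MAC would register one extra
address with the parent, and the parent only pays the promiscuous-mode
cost once the hardware filter slots run out.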
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"


Re: forwarding/ipfw/pf evolution (in pps) on -current

2013-04-24 Thread Mike Tancsa
On 4/24/2013 6:45 AM, Olivier Cochard-Labbé wrote:
> # Why all these benchs ? #
> 
> I've found performance regression regarding packet forwarding/ipfw/pf
> speed on -current comparing to 9.1 on my old server.

BTW, how much of a drop in performance as compared to 9.1 ?


---Mike
-- 
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/


Re: pf performance?

2013-04-26 Thread Mike Tancsa
On 4/26/2013 12:22 PM, Olivier Cochard-Labbé wrote:
> On Fri, Apr 26, 2013 at 3:42 PM, Gleb Smirnoff  wrote:
>>
>> In FreeBSD 10 pf is no longer under single lock. On your hardware,
>> I'd expect a measurable performance gain if you migrate to 10.
> 
> Comparing 9.1 and current (249908) on my new test-server (HP ProLiant
> DL320 G5, dual-core Xeon 3050, dual Intel NIC).
> As usual: one unidirectional flow of small packets, values in
> packets per second:
> 
> x 9.1
> + current
>     N           Min           Max        Median           Avg        Stddev
> x   5        379991        381508        381229      380892.6     667.69926
> +   5        332833        335502        334726      334223.2     1142.8266
> Difference at 95.0% confidence
>         -46669.4 +/- 1364.98
>         -12.2526% +/- 0.358363%
>         (Student's t, pooled s = 935.915)


Is that because pf is slower on a single flow, or packet forwarding in
general is slower on HEAD ?  How different is 9.1 and HEAD in just
forwarding performance?

---Mike





Re: ppp(8) and inbound IP connections

2013-05-07 Thread Mike Tancsa
On 5/7/2013 3:24 PM, Matthias Apitz wrote:
> 
> That is my understanding as well, but why they claim that they do
> support incoming connections?

As Joe Holden said before, to support incoming connections, you probably
need to use a different APN, or pay for that service.  The carriers here
in Canada do that. It would not be manageable otherwise by the carrier.

    ---Mike



Re: Please implement patch in PR180893

2013-07-27 Thread Mike Karels
> Sure, but it would be nice to file bugs with VMware and such to ensure
> they fix their bugs.

fwiw, I use IPv6 with recent versions of ESXi and VMware Workstation,
and have not seen this problem.  I'm curious if the problem is in old
versions or with particular configurations.

> Anyone have any issue with this? The issue I have is the if_printf(),
> it should be rate limited at the very least. It would also be nice to
> have a different counter to reflect that kind of dropped packet..

Agreed to both; I'd rather reserve if_ierrors for NIC-reported errors.
I also think the message should say "from my MAC address" (vs IP).

> 2c,

2c more,

Mike


On 27 July 2013 13:49, Zaphod Beeblebrox  wrote:
> I'd like to advocate implementing
> http://www.freebsd.org/cgi/query-pr.cgi?pr=180893
>
> Quoting the PR:
>
> Some errant network equipment (including the simulation of a network
> by VMware, as an example) will reflect back multicast packets to the sender.
> This breaks protocols such as DAD and makes IPv6 nearly impossible to use
> on these networks.
>
> Now, the argument could be made to fix these network elements, but
> there is an elegant solution that improves the quality of FreeBSD: To refuse
> packets that have a source ethernet address of the receiving interface. If
> you consider this notion, you can quickly and easily accept that an
> interface should never "receive" a packet from its own MAC address.
>
> This behaviour mirrors Linux behavior and I assume Windows behavior.
>
> I won't claim to be experienced in kernel matters, but I chose the
> location for this modification to allow BPF to "see" the packets (for
> network diagnosis). This test, however, could be moved within this function
> or even given a sysctl knob.
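
The test the PR proposes amounts to comparing the source address of an
incoming frame with the interface's own address. The following is a
stand-alone sketch of that comparison, not the patch from PR 180893; the
function name is hypothetical and it assumes a plain Ethernet II header
(destination in bytes 0-5, source in bytes 6-11).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_ADDR_LEN	6
#define ETH_HEADER_LEN	14

/*
 * Hypothetical check: return 1 if the frame's source MAC equals the
 * receiving interface's MAC (i.e. our own transmission was reflected
 * back), 0 otherwise.  Frames shorter than a header are not matched.
 */
static int
ether_frame_is_reflected(const uint8_t *frame, size_t len,
    const uint8_t ifmac[MAC_ADDR_LEN])
{
	assert(frame != NULL && ifmac != NULL);
	if (len < ETH_HEADER_LEN)
		return (0);
	return (memcmp(frame + MAC_ADDR_LEN, ifmac, MAC_ADDR_LEN) == 0);
}
```

As suggested in the reply, a real implementation would count such drops
separately from if_ierrors and rate limit any message it prints.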


TSO help or hindrance ? (was Re: TSO and FreeBSD vs Linux)

2013-09-04 Thread Mike Tancsa
On 9/4/2013 8:50 AM, Rick Macklem wrote:
> David Wolfskill wrote:
>>
>>
>> I noticed that when I tried to write files to NFS, I could write small
>> files OK, but larger ones seemed to ... hang.
>> * "ifconfig -v em0" showed flags TSO4 & VLAN_HWTSO turned on.
>> * "sysctl net.inet.tcp.tso" showed "1" -- enabled.
>>
>> As soon as I issued "sudo sysctl net.inet.tcp.tso=0" ... the copy worked
>> without a hitch or a whine.  And I was able to copy all 117709618 bytes,
>> not just 2097152 (2^21).
>>
>> Is the above expected?  It came rather as a surprise to me.
>>
> Not surprising to me, I'm afraid. When there are serious NFS problems
> like this, it is often caused by a network fabric issue and broken
> TSO is at the top of the list w.r.t. cause.


I was just experimenting a bit with iSCSI via FreeNAS and was a little
disappointed at the speeds I was getting. So, I tried disabling tso on
both boxes and it did seem to speed things up a bit.  Data and testing
methods attached in a txt file.

I did 3 cases.

1. Just boot up FreeNAS and the initiator without tweaks.  That had the
   worst performance.
2. Disable tso on the nic as well as via sysctl on both boxes. That had the
   best performance.
3. Re-enable tso on both boxes. That had better performance than the first
   case, but still not as good as totally disabling it.  I am guessing
   something is not quite being re-enabled properly ? But it's different
   from the other two cases ?!?

tgt is FreeNAS-9.1.1-RELEASE-x64 (a752d35) and initiator is r254328 9.2
AMD64

The FreeNAS box has 16G of RAM, so the file is being served out of cache
as gstat shows no activity when sending out the file



---Mike



3 data files. 

notso = tso disabled on tgt and initiator
tso-boot = initiator is rebooted and tests are run
tso = tso disabled and then re-enabled. For whatever reason it's not as bad as
post boot, but still worse than no tso

0{mdttestbox}# ministat notso tso-boot
x notso
+ tso-boot
[ASCII distribution plot not recoverable]
    N           Min           Max        Median           Avg        Stddev
x   9      69086843      71749085      69475320      69810666      887628.9
+   9      55846191      56473897      56357956      56302165      208843.3
Difference at 95.0% confidence
        -1.35085e+07 +/- 644386
        -19.3502% +/- 0.923048%
        (Student's t, pooled s = 644787)
0{mdttestbox}#


# ministat tso notso
x tso
+ notso
[ASCII distribution plot not recoverable]
    N           Min           Max        Median           Avg        Stddev
x   9      68632067      69126677      68779201      68808473     142050.01
+   9      69086843      71749085      69475320      69810666      887628.9
Difference at 95.0% confidence
        1.00219e+06 +/- 635239
        1.4565% +/- 0.923199%
        (Student's t, pooled s = 635635)

0{mdttestbox}# cat tso
68779201
68734143
68915705
68752827
68782212
69126677
68828520
68724901
68632067
0{mdttestbox}# cat notso 
71749085
69256608
69097532
69086843
70587459
69179672
69475320
69754511
70108963

0{mdttestbox}# cat tso-boot 
55846191
56314173
56284204
56095729
56466769
56446535
56357956
56434027
56473897
0{mdttestbox}# 
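
For readers who want to check the "Difference at 95.0% confidence" lines
against the raw numbers above, the pooled Student's t comparison can be
sketched as follows. This is not ministat's source; the function names are
made up, and the critical value must be supplied by the caller (2.120 is
the two-sided 95% value for 9 + 9 - 2 = 16 degrees of freedom). The
comparison is done on squared quantities so that no square root is needed.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n samples. */
static double
mean(const double *v, size_t n)
{
	double sum = 0.0;
	size_t i;

	assert(n > 0);
	for (i = 0; i < n; i++)
		sum += v[i];
	return (sum / (double)n);
}

/* Unbiased sample variance (divides by n - 1). */
static double
sample_var(const double *v, size_t n)
{
	double m = mean(v, n), ss = 0.0;
	size_t i;

	assert(n > 1);
	for (i = 0; i < n; i++)
		ss += (v[i] - m) * (v[i] - m);
	return (ss / (double)(n - 1));
}

/*
 * Return 1 if the difference of means is significant for the given
 * critical t value, using the pooled variance:
 *   diff^2 > tcrit^2 * sp^2 * (1/na + 1/nb)
 */
static int
pooled_t_significant(const double *a, size_t na, const double *b, size_t nb,
    double tcrit)
{
	double diff = mean(b, nb) - mean(a, na);
	double sp2 = ((double)(na - 1) * sample_var(a, na) +
	    (double)(nb - 1) * sample_var(b, nb)) / (double)(na + nb - 2);
	double half2 = tcrit * tcrit * sp2 *
	    (1.0 / (double)na + 1.0 / (double)nb);

	return (diff * diff > half2);
}
```

Fed the nine "tso" and nine "notso" values, the difference of means comes
out at about 1.0e6 bytes/sec, in line with the ministat output, and the
percentage ministat prints is that difference divided by the mean of the
first ("x") data set.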



0{mdttestbox}# cat t.sh
#!/bin/sh

command="dd if=/mnt/test of=/dev/null bs=4096k"
for i in `jot 9 1`;do
mount /dev/da0a /mnt
eval $command 2>&1 | grep bytes | awk -F"[\( ]" '{print $8}&

Re: TSO help or hindrance ? (was Re: TSO and FreeBSD vs Linux)

2013-09-10 Thread Mike Tancsa
On 9/10/2013 6:42 PM, Barney Cordoba wrote:
> NFS has been broken since Day 1, so let's not come to conclusions about
> anything as it relates to NFS.

iSCSI is NFS ?

    ---Mike

> 
> BC
> 
> ----
> *From:* Mike Tancsa 
> *To:* Rick Macklem 
> *Cc:* FreeBSD Net ; David Wolfskill 
> *Sent:* Wednesday, September 4, 2013 11:26 AM
> *Subject:* TSO help or hindrance ? (was Re: TSO and FreeBSD vs Linux)
> 
> On 9/4/2013 8:50 AM, Rick Macklem wrote:
>> David Wolfskill wrote:
>>>
>>>
>>> I noticed that when I tried to write files to NFS, I could write small
>>> files OK, but larger ones seemed to ... hang.
>>> * "ifconfig -v em0" showed flags TSO4 & VLAN_HWTSO turned on.
>>> * "sysctl net.inet.tcp.tso" showed "1" -- enabled.
>>>
>>> As soon as I issued "sudo sysctl net.inet.tcp.tso=0" ... the copy worked
>>> without a hitch or a whine.  And I was able to copy all 117709618 bytes,
>>> not just 2097152 (2^21).
>>>
>>> Is the above expected?  It came rather as a surprise to me.
>>>
>> Not surprising to me, I'm afraid. When there are serious NFS problems
>> like this, it is often caused by a network fabric issue and broken
>> TSO is at the top of the list w.r.t. cause.
> 
> 
> I was just experimenting a bit with iSCSI via FreeNAS and was a little
> disappointed at the speeds I was getting. So, I tried disabling tso on
> both boxes and it did seem to speed things up a bit.  Data and testing
> methods attached in a txt file.
> 
> I did 3 cases.
>
> 1. Just boot up FreeNAS and the initiator without tweaks.  That had the
>    worst performance.
> 2. Disable tso on the nic as well as via sysctl on both boxes. That had the
>    best performance.
> 3. Re-enable tso on both boxes. That had better performance than the first
>    case, but still not as good as totally disabling it.  I am guessing
>    something is not quite being re-enabled properly ? But it's different
>    from the other two cases ?!?
> 
> tgt is FreeNAS-9.1.1-RELEASE-x64 (a752d35) and initiator is r254328 9.2
> AMD64
> 
> The FreeNAS box has 16G of RAM, so the file is being served out of cache
> as gstat shows no activity when sending out the file
> 
> 
> 
> ---Mike
> 
> 
> 




Re: TSO help or hindrance ? (was Re: TSO and FreeBSD vs Linux)

2013-09-10 Thread Mike Tancsa
On 9/10/2013 7:04 PM, Rick Macklem wrote:
> Mike Tancsa wrote:
>> On 9/10/2013 6:42 PM, Barney Cordoba wrote:
>>> NFS has been broken since Day 1, so let's not come to conclusions
>>> about anything as it relates to NFS.
>>
>> iSCSI is NFS ?
>>
> It would be really nice if you could try trasz's new iSCSI stack and
> see how well it works. (I, for one, am hoping it makes it into 10.0,
> but it may be too late.)

I was only doing limited testing of iSCSI both as target and initiator.
 I was a little disappointed at the slow speeds I was getting.  Noticing
the thread about TSO, I thought it would be interesting to test and sure
enough it did make a difference.

IIRC, the new iSCSI stack is currently tested more for correctness than
performance?


---Mike




Re: Free book draft: IPv6 for IPv4 Experts

2013-09-23 Thread Mike Tancsa
On 9/23/2013 7:50 AM, Yar Tikhiy wrote:
> 
> The project page is: https://sites.google.com/site/yartikhiy/home/ipv6book
> 
> An e-reader friendly PDF as well as a conventional A4 size PDF is available.
> 
> Hoping you will enjoy the reading as much as I have enjoyed the writing.

Wow!  I just had a look at the TOC and it looks like a great addition to
the sparse resources that are out there. I will certainly have a look
through it in the coming days. Thanks for sharing with the community!

---Mike





Re: mpd5/Netgraph issues after upgrading to 7.4

2014-03-09 Thread Mike Tancsa

On 3/9/2014 7:33 AM, Przemyslaw Frasunek wrote:
>>> I've seen that Mike reported similar issues in October
>>> (http://lists.freebsd.org/pipermail/freebsd-stable/2013-October/075552.html).
>>> Did you manage to resolve it?
>>
>> I worked around the crash by removing ipv6 from the kernel.  The box has
>> been functioning without a crash since then.
>
> Hi,
>
> FYI -- after upgrade to 9-STABLE no further crashes occurred, even with IPv6
> enabled.

What sort of uptime have you seen with ipv6 enabled ?

    ---Mike



--
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/


Re: mpd5/Netgraph issues after upgrading to 7.4

2014-03-14 Thread Mike Tancsa

On 3/14/2014 1:00 PM, Przemyslaw Frasunek wrote:

FYI -- after upgrade to 9-STABLE no further crashes occurred, even with IPv6
enabled.

What sort of uptime have you seen with ipv6 enabled ?


Now it's 19 days and still no crash occurred.


I would sometimes get a month on a box with ~ 500 users

    ---Mike

--
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/


Re: mpd5/Netgraph issues after upgrading to 7.4

2014-04-15 Thread Mike Tancsa

On 3/14/2014 1:00 PM, Przemyslaw Frasunek wrote:

FYI -- after upgrade to 9-STABLE no further crashes occurred, even with IPv6
enabled.

What sort of uptime have you seen with ipv6 enabled ?


Now it's 19 days and still no crash occurred.



Hi,
Has everything remained stable with ipv6 enabled?

    ---Mike



--
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/


Re: igb performance/load udp issue

2011-12-21 Thread Mike Tancsa
On 12/21/2011 1:46 AM, Jack Vogel wrote:
> I was fighting with UDP issues before the latest checkin, so you should
> look at THAT version, 2.3.1 in HEAD please.

Hi Jack,
Is there a stand alone version of 2.3.1 that we can try on RELENG_9 and
RELENG_8 ?

    ---Mike


-- 
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/


Re: Firewall Profiling.

2011-12-27 Thread Mike Tancsa
On 12/27/2011 6:36 AM, Alexander V. Chernikov wrote:
>> Is  IPFW  efficient  enough  to  firewall  2x10GE  (in+out) interfaces
>> without  much  latency  increase,  when  running  on  modern  hardware
>> with Intel NICs? Majority of processing tasks would probably be setfib
>> according to matches in tables.
> IPFW seems to add more or less constant overhead per rule. In our setup,
> ~20 rules increase load by 100% (one core).  We are able to reach 10GE
> (1.1mpps) on some routers with most packets travelling 8-10 ipfw rules.
> However, even with ipfw add 1 allow ip from any to any
> 1.1 mpps routing utilizes E5645 by more than 80%. (with IGP routes in
> rtable only). YMMV, but 2x10G is too much at the moment even without ipfw.


Don't some of the modern 10G adapters support filtering in the card
itself?  e.g. cxgbe.
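
For a rough sense of scale, the figures quoted above can be turned into a
back-of-the-envelope calculation. The sketch below is purely illustrative:
it assumes the quoted "10GE (1.1mpps)" means a fully loaded 10 Gbit/s link
(the post does not say so outright) and it ignores Ethernet framing
overhead.

```python
# Back-of-the-envelope numbers from the quoted post. Assumption: the link
# is fully loaded at 10 Gbit/s while carrying 1.1 Mpps.
LINK_BITS_PER_SEC = 10_000_000_000   # assumed: saturated 10GE link
PACKETS_PER_SEC = 1_100_000          # "1.1mpps" from the quoted post

# Average packet size implied by those two figures.
avg_packet_bytes = LINK_BITS_PER_SEC / 8 / PACKETS_PER_SEC

# "most packets travelling 8-10 ipfw rules" gives a range of rule
# evaluations per second the firewall has to sustain.
evals_low = PACKETS_PER_SEC * 8
evals_high = PACKETS_PER_SEC * 10

print(f"average packet size: about {avg_packet_bytes:.0f} bytes")
print(f"ipfw rule evaluations per second: {evals_low:,} to {evals_high:,}")
```

With smaller packets the packet rate, and with it the per-rule cost, rises
for the same bit rate, which is presumably why offloading the filtering to
the card is attractive.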

    ---Mike



-- 
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/


Broadcom 10Gbps Ethernet driver (bxe)

2012-01-17 Thread Mike Karels
Has anyone had any success using the bxe driver with FreeBSD 9.0 or
pre-releases?  If so, are you using BCM57710 or BCM57711[E]?  We are
trying to use the 57710 without any success; it does not receive
unicast packets, just broadcast or multicast.

Thanks,
Mike


Re: stateful firewall implementation in FreeBSD

2012-01-26 Thread Mike Tancsa
On 1/26/2012 12:24 PM, satish amara wrote:
> Hi,
> I have question regarding stateful firewall implementation of FreeBSD.
> IPF has  stateful “keep state” option.

Hi,
Take a look at pf, not ipf. ipf is not really maintained or used much
any more under FreeBSD.  With respect to dealing with congestion, there
are many params you can tune in pf.  Take a look at the man pages for
pf.conf for details as you can control how this situation is dealt with
to some degree.

    ---Mike

-- 
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/


Re: em0 hangs on 8-STABLE again

2012-01-29 Thread Mike Tancsa
On 1/29/2012 4:38 AM, Lev Serebryakov wrote:
> Hello, Freebsd-net.
> 
>   My home server lost its connection on em0 again last night. It was a
> persistent problem some time ago; with version 7.2.3 this is the first
> time, but with worse symptoms.

7.3.0 from HEAD is quite stable for me.  Hopefully it will be MFC'd soon :)


---Mike
-- 
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/


Re: em0 hangs on 8-STABLE again

2012-02-01 Thread Mike Tancsa
On 1/29/2012 1:21 PM, Jack Vogel wrote:
> No, I told Mike I'd get it into 8.x, have just been busy, but will try
> and get it pushed up in the queue.

Thanks Jack, I see it's now MFC'd into RELENG_8!

em1:  port 0x2000-0x201f mem
0xb410-0xb411,0xb412-0xb4123fff irq 16 at device 0.0 on pci11
em1: Using MSIX interrupts with 3 vectors
em1: [ITHREAD]
em1: [ITHREAD]
em1: [ITHREAD]

Just curious, does RELENG_9 have this version as well ?

---Mike



-- 
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/


Re: IGB freezes after about 2 weeks of uptime

2012-02-22 Thread Mike Tancsa
I don't think the driver changes from HEAD were ever MFC'd to 9.x, only
to 8.x.

    ---Mike

On 2/22/2012 1:23 PM, Darren Baginski wrote:
> Same problem on
> FreeBSD srv-4-2.lab.local 9.0-STABLE FreeBSD 9.0-STABLE #2: Wed Feb 22 
> 18:10:53 UTC 2012 r...@srv-4-2.lab.local:/usr/obj/usr/src/sys/GENERIC  
> amd64
> 
> 16.02.2012, 02:27, "Jack Vogel" :
>> And assuming it's from the release, please upgrade it to HEAD and try again.
>>
>> Jack
>>
>> On Wed, Feb 15, 2012 at 2:14 PM, Adrian Chadd  wrote:
>>> are you running the driver from that release, or the -HEAD driver?
>>>
>>> adrian


-- 
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/


Re: Intel 82574L interface wedging - em7.3.2/8.2-STABLE

2012-03-16 Thread Mike Tancsa
On 3/16/2012 11:52 AM, Adrian Chadd wrote:
> Can someone please just send me some recent em/igb hardware? I'll sit
> down and find ways to break things and help Jack fix them.
> 
> I've been knee deep in this crap with ath(4) so I'm well versed now in
> the art of "making your NIC and network stack not angry."

The 82574L is not that common on NICs and tends to be on server
motherboards.  igb is easy enough to source.

---Mike

> 
> 
> 
> Adrian


-- 
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/


CARP Active-Active

2012-03-21 Thread Mike Barnard
Hi,

I hope this is the right place to ask about this (didn't think PF list
would be ideal for this question).

I have been reading about CARP in active-active mode and was wondering whether
this is possible in FreeBSD. It is possible to get it done on OpenBSD
(www.kernel-panic.it/openbsd/carp/carp4.html#carp-4.2.2).

Does FreeBSD have IP load balancing on CARP yet? Are there plans to do this
on FreeBSD?


-- 
Mike

Of course, you might discount this possibility, but remember that one in a
million chances happen 99% of the time.



Re: Intel 82574L interface wedging - em7.3.2/8.2-STABLE

2012-03-23 Thread Mike Tancsa
On 3/20/2012 2:57 PM, John Baldwin wrote:
>>> TX when link becomes active.  I've also updated it to fix resume for em
>>> and igb to DTRT when buf_ring is used, and to not include old-style start
>>> routines at all when using multiq.  It is at
>>> http://www.freebsd.org/~jhb/patches/e1000_txeof2.patch
>> Thanks for the patch, sirs; so far it does look like it did the trick.
>> I'll know for certain here in a few days if I'm still in the clear.
>> I'm guessing after it goes through some more testing it'll be too late
>> to slip it into 8.3?
> 
> Yes, this is too late for 8.3, but thanks for testing!

Hi,
Is there a RELENG_8 version of this patch ? I have a server that used to
show this issue quite a bit, but has not since 7.3.2. I would be happy
to stress it on the box.  The patch above does not apply cleanly due to
the netmap diffs, but I can manually merge if that's the only difference.

em1:  port 0x2000-0x201f mem
0xb410-0xb411,0xb412-0xb4123fff irq 16 at device 0.0 on pci11
em1: Using MSIX interrupts with 3 vectors
em1: [ITHREAD]
em1: [ITHREAD]
em1: [ITHREAD]
em1: Ethernet address: 00:15:17:ed:68:a4

em1@pci0:11:0:0: class=0x02 card=0x34ec8086 chip=0x10d38086 rev=0x00 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Intel 82574L Gigabit Ethernet Controller (82574L)'
    class      = network
    subclass   = ethernet
    bar   [10] = type Memory, range 32, base 0xb410, size 131072, enabled
    bar   [18] = type I/O Port, range 32, base 0x2000, size 32, enabled
    bar   [1c] = type Memory, range 32, base 0xb412, size 16384, enabled
    cap 01[c8] = powerspec 2  supports D0 D3  current D0
    cap 05[d0] = MSI supports 1 message, 64 bit
    cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
    cap 11[a0] = MSI-X supports 5 messages in map 0x1c enabled
    ecap 0001[100] = AER 1 0 fatal 0 non-fatal 0 corrected
    ecap 0003[140] = Serial 1 001517ed68a4


-- 
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/


Re: IGB freezes after about 2 weeks of uptime

2012-04-03 Thread Mike Tancsa
Hi,
Try the driver from HEAD. It has a number of fixes in it to both the
igb and em drivers that are not yet in RELENG_9 or RELENG_8

http://lists.freebsd.org/pipermail/svn-src-head/2012-March/035888.html

---Mike

On 4/3/2012 10:19 AM, Darren Baginski wrote:
> Still getting same errors on
>  FreeBSD srv-4-2.lab.local 9.0-STABLE FreeBSD 9.0-STABLE #3: Mon Apr  2
> 19:07:20 UTC 2012
> r...@srv-4-2.lab.local:/usr/obj/usr/src/sys/GENERIC  amd64
> Is it only me, or is it a known problem ?
>  
> 22.02.2012, 23:46, "Jack Vogel" :
>> Mike is correct, 8.3 was looming, it is important to a lot of my
>> customers so it
>> has taken priority, 9 stable will be coming...
>>
>> Jack
>>
>>
>> On Wed, Feb 22, 2012 at 10:55 AM, Mike Tancsa wrote:
>>
>> I dont think the driver changes from HEAD were ever MFC'd to 9.x.  Only
>> to 8.x
>>
>>---Mike
>>
>> On 2/22/2012 1:23 PM, Darren Baginski wrote:
>> > Same problem on
>> > FreeBSD srv-4-2.lab.local 9.0-STABLE FreeBSD 9.0-STABLE #2: Wed
>> > Feb 22 18:10:53 UTC 2012 r...@srv-4-2.lab.local:/usr/obj/usr/src/sys/GENERIC  amd64
>> >
>> > 16.02.2012, 02:27, "Jack Vogel":
>> >> And assuming its from the release, please upgrade it to HEAD
>> >> and try again.
>> >>
>> >> Jack
>> >>
>> >> On Wed, Feb 15, 2012 at 2:14 PM, Adrian Chadd wrote:
>> >>> are you running the driver from that release, or the -HEAD driver?
>> >>>
>> >>> adrian
>>
>> --
>> ---
>> Mike Tancsa, tel +1 519 651 3400
>> Sentex Communications, m...@sentex.net
>> Providing Internet services since 1994 www.sentex.net
>> Cambridge, Ontario Canada   http://www.tancsa.com/


-- 
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/


Re: mpd5/Netgraph issues after upgrading to 7.4

2012-06-15 Thread Mike Tancsa
On 6/15/2012 4:31 PM, Gleb Smirnoff wrote:
> On Fri, Jun 15, 2012 at 01:33:05PM +0200, Przemyslaw Frasunek wrote:
> P> unfortunately, one of my mpd5 PPPoE access servers started panicking every
> P> few hours.
> P> 
> P> I'm running recent 8.3-STABLE (as of 23rd May) with WITNESS, INVARIANTS and

> I suspect this isn't related to netgraph, but to IPv6 since prelist_remove()
> is found in netinet6/nd6_rtr.c.
> 
> Several times I looked into ND code and found lots of race prone code there.
> Maybe some of it was recently fixed by bz@, but definitely not merged to stable/8.

There were a bunch of commits / fixes by BZ on the 5th of June.  Perhaps
try updating to RELENG_8 as of today. If you are not using IPv6, perhaps
disable it for a day to see if that makes a difference stability-wise.  It
did for me back in November, when running with v6 on an LNS was not stable.

http://lists.freebsd.org/pipermail/svn-src-stable-8/2012-June/007555.html

---Mike



> 


-- 
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/


Re: mpd5/Netgraph issues after upgrading to 7.4

2012-06-18 Thread Mike Tancsa
On 6/15/2012 5:57 PM, Przemyslaw Frasunek wrote:
>> I suspect this isn't related to netgraph, but to IPv6 since prelist_remove()
>> is found in netinet6/nd6_rtr.c.
>>
>> Several times I looked into ND code and found lots of race prone code there.
>> Maybe some of it was recently fixed by bz@, but definitely not merged to stable/8.
> 
> Thanks a lot guys. For now, I disabled IPv6 on this BRAS. Let's see if it's
> going to help.

Hi,
Any changes in stability ?

---Mike

-- 
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/


Re: mpd5/Netgraph issues after upgrading to 7.4

2012-06-18 Thread Mike Tancsa
On 6/18/2012 6:51 PM, Adrian Chadd wrote:
> Hi,
> 
> Is it possible to get you to setup a test BRAS running 9-STABLE, so
> you can provide feedback about how stable ipv4/ipv6 PPPoE is for you?

I have another LNS to deploy soon and I can enable IPv6 and use RELENG_9.
I have in the past been able to trigger the panic after a few days of
use with IPv6 enabled.  Should have it up and running in a week or so.

    ---Mike
-- 
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/


Re: mpd5/Netgraph issues after upgrading to 7.4

2012-07-15 Thread Mike Tancsa
On 7/10/2012 2:24 AM, Przemyslaw Frasunek wrote:
>> It seems, Przemyslaw Frasunek uses proxyarp?
>> I have no such problems but I do not use proxyarp.
>> Could you get rid of it, Przemyslaw?
> 
> No, I don't use proxy ARP. I have about 300 PPPoE ng interfaces and 10 VLANs
> with plain IP traffic. ARP table has only < 50 entries, all of them are dynamic.

I had a new one. Unfortunately, it did not generate a coredump file for
some reason.  Kernel was from ~ mid Feb

Fatal trap 9: general protection fault while in kernel mode
cpuid = 1; apic id = 02
instruction pointer = 0x20:0x80496e59
stack pointer   = 0x28:0xff8863b0
frame pointer   = 0x28:0xff8863c0
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 13 (ng_queue1)
trap number = 9
panic: general protection fault
cpuid = 1
KDB: stack backtrace:
#0 0x803f922e at kdb_backtrace+0x5e
#1 0x803c6437 at panic+0x187
#2 0x80646e30 at trap_fatal+0x290
#3 0x8064736a at trap+0x10a
#4 0x8062eb94 at calltrap+0x8
#5 0x8049835b at ng_l2tp_rcvdata_lower+0x42b
#6 0x8048f380 at ng_apply_item+0x420
#7 0x80491790 at ng_snd_item+0x3f0
#8 0x804962ba at ng_ksocket_incoming2+0x24a
#9 0x8048f50d at ng_apply_item+0x5ad
#10 0x804912be at ngthread+0x22e
#11 0x8039b08f at fork_exit+0x11f
#12 0x8062f0de at fork_trampoline+0xe
Uptime: 151d12h10m10s
ipfw: 11 Deny TCP 192.168.1.99:33822 208.47.254.32:80 in via ng471
Dumping 883 out of 8145 MB:ipfw: 13 Deny UDP 64.7.157.21:512
192.168.254.46:137 in via ng21
panic: bufwrite: buffer is not busy???
cpuid = 1
Uptime: 151d12h10m10s





-- 
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/


[patch] if_bxe shutdown fix

2012-09-04 Thread Mike Silbersack
Does anyone want to review this patch before I check it in?  The change 
has been reviewed and tested by coworkers, but not yet reviewed by any 
other FreeBSD committers.


http://www.silby.com/patches/if_bxe.c-safestop.patch

This resolves an issue we saw at work where IPMI would report bus errors 
when you rebooted a system with bxe NICs if you had not UP'd all of the 
bxe NICs before the shutdown.


Thanks,

Mike "Silby" Silbersack


Re: [patch] if_bxe shutdown fix

2012-09-04 Thread Mike Silbersack

On 9/5/12 3:56 PM, YongHyeon PYUN wrote:

On Tue, Sep 04, 2012 at 11:35:13PM -0500, Mike Silbersack wrote:

Does anyone want to review this patch before I check it in?  The change
has been reviewed and tested by coworkers, but not yet reviewed by any
other FreeBSD committers.

http://www.silby.com/patches/if_bxe.c-safestop.patch

This resolves an issue we saw at work where IPMI would report bus errors
when you rebooted a system with bxe NICs if you had not UP'd all of the
bxe NICs before the shutdown.

Yeah I also have a similar patch. But I checked sc->state after
getting a BXE_CORE_LOCK as the state is protected by the lock.


Thanks,

Mike "Silby" Silbersack


Good catch.  How does this look?

http://www.silby.com/patches/if_bxe.c-safestop-2.patch
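
The review comment above, that sc->state should only be tested while
holding the lock that protects it, is a general pattern. Here is a minimal
sketch of it in Python rather than driver C. The patches themselves are not
reproduced here, so this assumes the fix amounts to skipping the stop
routine when the interface was never opened; ToyDevice and all of its
names are made up for illustration.

```python
import threading


class ToyDevice:
    """Toy stand-in for a driver softc; every name here is hypothetical."""

    def __init__(self):
        self.lock = threading.Lock()   # plays the role of the core lock
        self.state = "closed"          # protected by self.lock
        self.stop_calls = 0

    def open(self):
        with self.lock:
            self.state = "open"

    def _stop_locked(self):
        # Caller must hold self.lock.
        self.stop_calls += 1
        self.state = "closed"

    def shutdown(self):
        # Take the lock first, then test the state, so that the check and
        # the stop are atomic with respect to other threads.
        with self.lock:
            if self.state == "open":
                self._stop_locked()


dev = ToyDevice()
dev.shutdown()            # never opened: the stop routine is skipped
dev.open()
dev.shutdown()            # opened: the stop routine runs once
print("stop routine ran", dev.stop_calls, "time(s)")
```

Testing the state before taking the lock would leave a window in which
another thread could change it between the check and the stop.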

Mike "Silby" Silbersack


Re: getting counters for a plenty of vlan ifaces

2012-09-16 Thread Mike Tancsa
On 9/16/2012 10:41 AM, Ivan Alexandrovich wrote:
> Hi
> 
> We are running freebsd9.0 on a router with
> more than 1000 of subscriber's vlan interfaces.
> Outgoing packet rate is approximately 40 kpps.
> 
> There's a need to collect bytes and packets
> counters for all those vlan interfaces every
> minute (or even twice a minute) and store them

Hi,
We approach it a little differently and collect all the data via
netflow, or in this case argus.  I sample the parent interface and save
all the flow data which argus is smart enough to parse out at the vlan
level.  You can then run all sorts of fine grained reports this way.  We
use it on a system with about 900 ng interfaces.
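
To illustrate the idea of deriving per-vlan counters from flow data
instead of polling each interface, here is a small sketch. The record
layout (vlan id, bytes, packets) and the sample values are made up for
the example; this is not argus's actual output format or API.

```python
from collections import defaultdict


def totals_per_vlan(flows):
    """Sum bytes and packets per vlan from (vlan, bytes, packets) tuples."""
    totals = defaultdict(lambda: {"bytes": 0, "packets": 0})
    for vlan, nbytes, npackets in flows:
        totals[vlan]["bytes"] += nbytes
        totals[vlan]["packets"] += npackets
    return dict(totals)


# Hypothetical flow records sampled on the parent interface.
sample_flows = [
    (101, 1500, 1),
    (102, 400, 2),
    (101, 3000, 2),
]

for vlan, counters in sorted(totals_per_vlan(sample_flows).items()):
    print(vlan, counters["bytes"], counters["packets"])
```

One pass over the flow data yields counters for every vlan at once, and
the same records can later be re-aggregated by other keys.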

    ---Mike


-- 
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/


Re: 9.1-RC3 IGB dropping connections.

2012-11-27 Thread Mike Tancsa
On 11/27/2012 5:27 PM, Zaphod Beeblebrox wrote:
> I've got an Intel server motherboard with 4x igb (and 1x em) on it.
> The motherboard in question is the S3420GPRX and the IGB's show up as:
> 
> igb0:  port
> 0x3020-0x303f mem 0xb1b2-0xb1b3,0xb1bc4000-0xb1bc7fff irq 19
> at device 0.0 on pci3
> igb0: Using MSIX interrupts with 9 vectors
> igb0: Ethernet address: 00:1e:67:3a:d5:40
> igb0: Bound queue 0 to cpu 0
> 
> ... now... I have this machine (right now) on the local lan with my
> windows 7 workstation and putty sees the ssh connection as dropped
> often.  I say often --- in that it can happen in a minute or two... it
> often seems to happen when there is active output going to the window
> (like a download counter running), but I also say "often" in that...
> it seems slightly random... but it _is_ incessant... as in very
> "often."
> 

Are you using pf ? Also, did you confirm it is the igb nic and not
something more general ? e.g. if you put in a different nic, does the
problem go away ?

If you are using pf, let's see the rules.

---Mike



-- 
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/


Re: Review request: fix return value of socket(2) on no family found

2012-12-06 Thread Mike Karels
> On Thu, Dec 06, 2012 at 02:39:11PM +0800, Kevin Lo wrote:
> K> Here's the patch mostly from NetBSD to make socket(2) return EAFNOSUPPORT
> K> rather than EPROTONOSUPPORT if the family cannot be found.
> K> 
> K> http://people.freebsd.org/~kevlo/patch-socket
> K> 
> K> The man page documents the behavior specified in POSIX.1-2008:
> K> 
> K> http://pubs.opengroup.org/onlinepubs/9699919799/functions/socket.html
> K> 
> K> For reference, Linux, NetBSD, and OS X return EAFNOSUPPORT for this.

> IMO, the proposed change is correct.

I'd have to disagree.  EAFNOSUPPORT means "Address family not supported by
protocol family".  However, the socket syscall does not take an address
family parameter.  It takes a protocol family, a socket type, and an
optional protocol.  EPFNOSUPPORT would be the correct error if the protocol
family is not supported.  I don't remember if I missed this when POSIX
was being balloted, or if my objection was unsuccessful.

That said, I will say that consistency across systems and with the standard
is a useful thing, so I'll reluctantly agree with the change to the errno.

However, the proposed text for socket(2) doesn't make sense:

+The address family (domain) is not supported or the
+specified domain is not supported by this protocol family.

The domain is the protocol family.  This could reasonably say just
"The protocol family (domain) is not supported."  It might further
say "This specific error value may not be accurate, but is specified
by POSIX.1-2008."
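
For anyone who wants to see what a given system actually returns, a small
probe along these lines should do. It is only a sketch: 9999 is an
arbitrary family number assumed to be unsupported, and the errno you get
back depends on the OS and version, which is the whole point of the
discussion.

```python
import errno
import socket


def probe_family(family):
    """Try socket(family, SOCK_STREAM); return the errno name, or None on success."""
    try:
        s = socket.socket(family, socket.SOCK_STREAM)
    except OSError as e:
        return errno.errorcode.get(e.errno, str(e.errno))
    s.close()
    return None


# 9999 is a hypothetical, deliberately bogus family number.
result = probe_family(9999)
print("socket() with an unknown family failed with:", result)
```

Running this before and after such a change would show the errno moving
from EPROTONOSUPPORT to EAFNOSUPPORT on an affected system.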

Mike


Re: 'no buffer space available' after switch goes down on freeBSD 7.3

2012-12-24 Thread Mike Karels
> On 24 December 2012 17:01, Ryan Stone  wrote:
> > I don't believe that this is fixed in later versions of the driver. The
> > problem is that when the interface loses link the transmit queue can fill
> > up. Once that happens the driver never gets any more calls from the network
> > stack to make it send packets. Pinging the interface fixes it because the
> > driver processes rx and tx from the same context, so when it receives a
> > packet it starts transmitting again.
> >
> > The patch that I sent fixes the problem by forcing the driver to process
> > the tx queue whenever the link goes from down to up.

> This is a cute fix, and I've noticed similar issues in net80211.

> In net80211, the stack currently calls if_start() to re-attempt frame
> transmission during a VAP state transition to RUN.
> This has similar issues (ie, it assumes that if_start() DTRT; it
> assumes OACTIVE has been cleared, etc.)

> I think we may need another if_* method which specifically attempts to
> service the TX queue again; versus just waiting for if_transmit() to
> make some progress.

In my opinion, it is wrong of the drivers to queue packets while link
is down.  The packets are delayed indefinitely, and are useless at best.
In my company's product (McAfee firewall), we had problems with state-sharing
packets that were way out of date in a cluster.  We changed the drivers to
empty the queue and discard subsequent packets when link was down.  No
special change is needed to restart: the next time a packet is transmitted
after link comes up, that packet is sent.  Our change is not necessarily
done the way I'd do it for FreeBSD, but it minimizes changes.  Patch
available on request.
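
The difference between the two policies can be shown with a toy model.
This is a sketch only, not driver code and not the patch mentioned above:
ToyTxQueue and its fields are invented, and it models just the stale
packet concern, not the stall where the stack stops calling the driver
once the queue is full.

```python
from collections import deque


class ToyTxQueue:
    """Toy model of a NIC transmit queue; all names are hypothetical."""

    def __init__(self, capacity, drop_when_down):
        self.capacity = capacity
        self.drop_when_down = drop_when_down
        self.link_up = True
        self.pending = deque()
        self.sent = []
        self.dropped = 0

    def set_link(self, up):
        self.link_up = up
        if not up and self.drop_when_down:
            # Policy from the reply: empty the queue when link is lost.
            self.dropped += len(self.pending)
            self.pending.clear()

    def transmit(self, pkt):
        if not self.link_up:
            if self.drop_when_down or len(self.pending) >= self.capacity:
                self.dropped += 1
                return False
            self.pending.append(pkt)   # queued; may go out long after it is useful
            return True
        self.pending.append(pkt)
        while self.pending:            # link is up: drain everything queued
            self.sent.append(self.pending.popleft())
        return True


def run(drop_when_down):
    q = ToyTxQueue(capacity=4, drop_when_down=drop_when_down)
    q.set_link(False)
    for i in range(6):
        q.transmit(f"stale-{i}")
    q.set_link(True)
    q.transmit("fresh")
    return q.sent, q.dropped


for policy in (False, True):
    sent, dropped = run(policy)
    print("drop_when_down =", policy, "sent:", sent, "dropped:", dropped)
```

With queueing, the out-of-date packets go out ahead of the fresh one once
link returns; with the drop policy only the fresh packet is sent, and no
special restart step is needed.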

Mike


Re: Debugging em(4) driver

2010-11-13 Thread Mike Tancsa
On 11/13/2010 9:35 PM, Patrick Mahan wrote:
> 
> 
> On 11/13/2010 02:27 PM, Ryan Stone wrote:
>> It looks to me that you're getting a ton of input drops.  That's
>> presumably the cause of your issue.  You can get the em driver to
>> print debug information to the console by running:
>>
>> # sysctl dev.em.3.stats=1
>> # sysctl dev.em.3.debug=1
>>
>> The output should be available in dmesg and /var/log/messages
>>
>> Hopefully that can shed some light on the nature of the drops.
> 
> Ryan,
> 
> Thanks for the tip.  But I see I forgot to mention this was FreeBSD 8.0.
> The em(4) driver is actually the one found in FreeBSD 8.1 as we needed
> the AltQ fixes.
> 

Try grabbing the em drivers from HEAD.  They fixed a few bugs for me.
You should be able to grab all of /usr/src/sys/dev/e1000 from HEAD,
drop it into RELENG_8 and compile.

---Mike


Re: em driver, 82574L chip, and possibly ASPM

2010-11-23 Thread Mike Tancsa
On 11/23/2010 7:47 AM, Ivan Voras wrote:
> It looks like I'm unfortunate enough to have to deploy on a machine
> which has the 82574L Intel NIC chip on a Supermicro X8SIE-F board, which
> apparently has hardware issues, according to this thread:
> 
> http://sourceforge.net/tracker/index.php?func=detail&aid=2908463&group_id=42302&atid=447449
> 
> 

Interesting, this is the same nic that has been giving me grief! Mine is
on an Intel server board (S3420GPX). The symptoms are VERY similar to
what the LINUX user sees as well with RX errors and the traffic patterns.

---Mike


> One of the proposed workarounds is disabling "Active State Power
> Management" in the BIOS and in the OS.
> 
> I have disabled it in BIOS but I don't know how to disable it in FreeBSD
> (apparently only disabling it in BIOS isn't enough).
> 
> Any ideas on how to achieve the effect in FreeBSD?
> 



Re: em driver, 82574L chip, and possibly ASPM

2010-11-23 Thread Mike Tancsa
On 11/23/2010 8:16 AM, Ivan Voras wrote:
> On 11/23/10 14:03, Mike Tancsa wrote:
>> On 11/23/2010 7:47 AM, Ivan Voras wrote:
>>> It looks like I'm unfortunate enough to have to deploy on a machine
>>> which has the 82574L Intel NIC chip on a Supermicro X8SIE-F board, which
>>> apparently has hardware issues, according to this thread:
>>>
>>> http://sourceforge.net/tracker/index.php?func=detail&aid=2908463&group_id=42302&atid=447449
>>>
>>>
>>>
>>
>> Interesting, this is the same nic that has been giving me grief! Mine is
>> on an Intel server board (S3420GPX). The symptoms are VERY similar to
>> what the LINUX user sees as well with RX errors and the traffic patterns.
> 
> I've posted detailed info on this NIC in the thread "em card wedging" -
> can you compare it with yours?
> 
> The whole thing looks very sensitive to BIOS settings. I've just toggled
> something that looked unrelated (don't remember what, I've been toggling
> BIOS settings all day) and the machine has been doing a flood-ping for
> 20 minutes without wedging (which doesn't mean it won't wedge as soon as
> I send this message, it did such things before).


I posted what's in the BIOS at

http://www.tancsa.com/82574.html

Unfortunately, if I disable the BIOS option highlighted I can no longer
netboot the box :(  For my production box having the issues, this is not
a problem.  But it makes it difficult for testing on my lab box.  I am
not sure if that even really disables IPMI.  Also, on this box what's
NIC1 and NIC2 is the opposite of what FreeBSD sees as em0 and em1.

So far I have tried

- Driver from HEAD: this seems to help a bit, in that wedges are less
  frequent
- Disabling MSI-X: no difference, still hangs

It seems the NIC will get one error and never recover. There will just
be a steady stream of them.  On the other onboard NIC (a different type
of em), the card will see the odd "no_buff" error, but it recovers like
all the other em NICs, whereas on this problem NIC the errors just keep
going up and up:

dev.em.1.mac_stats.missed_packets: 1292
dev.em.1.mac_stats.recv_no_buff: 31

Using the driver from HEAD, I can do an
ifconfig em1 down;sleep 1;ifconfig em1 up and that fixes the problem,
whereas previous versions of the driver would panic the box doing that.
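
A minimal sketch of automating that check, for anyone who has to babysit
one of these NICs. It assumes the sysctl counters shown above are the
right ones to watch; the function names and the threshold of 100 are made
up for illustration, and with drivers older than the one in HEAD the
down/up bounce may panic the box as noted above.

```shell
#!/bin/sh
# Sketch only: helper functions for spotting a wedged em interface.
# Nothing here runs on its own; the functions are meant to be called
# from a loop or a cron job.

# Print the numeric value from a "name: value" sysctl line read on stdin.
sysctl_value() {
    sed -n 's/^[^:]*: *\([0-9][0-9]*\)$/\1/p' | head -n 1
}

# Succeed (exit 0) when the counter grew by more than a threshold between
# two samples. The default threshold of 100 is an illustrative guess.
counter_is_climbing() {
    prev=$1
    cur=$2
    threshold=${3:-100}
    [ $((cur - prev)) -gt "$threshold" ]
}

# Bounce the interface the same way as the manual recovery above.
bounce_interface() {
    ifconfig "$1" down
    sleep 1
    ifconfig "$1" up
}
```

To use it, sample the counter twice, e.g.
prev=$(sysctl dev.em.1.mac_stats.missed_packets | sysctl_value), wait a
while, take cur the same way, and then run
counter_is_climbing "$prev" "$cur" && bounce_interface em1.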

Looking at the driver from HEAD, there does seem to be some mention of
ASPM. Is this what the LINUX driver is doing too ?



    /* PCI-Ex Control Registers */
    switch (hw->mac.type) {
    case e1000_82574:
    case e1000_82583:
        reg = E1000_READ_REG(hw, E1000_GCR);
        reg |= (1 << 22);
        E1000_WRITE_REG(hw, E1000_GCR, reg);

        /*
         * Workaround for hardware errata.
         * apply workaround for hardware errata documented in errata
         * docs Fixes issue where some error prone or unreliable PCIe
         * completions are occurring, particularly with ASPM enabled.
         * Without fix, issue can cause tx timeouts.
         */
        reg = E1000_READ_REG(hw, E1000_GCR2);
        reg |= 1;
        E1000_WRITE_REG(hw, E1000_GCR2, reg);
        break;
    default:
        break;
    }

    return;




---Mike


Re: em driver, 82574L chip, and possibly ASPM

2010-11-23 Thread Mike Tancsa
On 11/23/2010 12:39 PM, Sean Bruno wrote:
> On Tue, 2010-11-23 at 04:47 -0800, Ivan Voras wrote:
>> It looks like I'm unfortunate enough to have to deploy on a machine 
>> which has the 82574L Intel NIC chip on a Supermicro X8SIE-F board, which 
> i...@pci0:5:0:0:class=0x02 card=0x8975152d chip=0x10c98086

Strange, the 82574 attaches as em for me, not igb

e...@pci0:10:0:0:class=0x02 card=0x34ec8086 chip=0x10d38086 rev=0x00 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Intel 82574L Gigabit Ethernet Controller (82574L)'
    class      = network
    subclass   = ethernet
    cap 01[c8] = powerspec 2  supports D0 D3  current D0
    cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
    cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
    cap 11[a0] = MSI-X supports 5 messages in map 0x1c
    ecap 0001[100] = AER 1 0 fatal 0 non-fatal 0 corrected
    ecap 0003[140] = Serial 1 001517ed68a4

Normally it's MSI-X, but I had disabled that hoping it would fix the problem.

em1:  port 0x2000-0x201f mem
0xb410-0xb411,0xb412-0xb4123fff irq 16 at device 0.0 on pci10
em1: Using an MSI interrupt
em1: [FILTER]
em1: Ethernet address: 00:15:17:ed:68:a4


---Mike


Re: Problem with igb(4) updated to version 2.0.7

2010-12-03 Thread Mike Tancsa
On 12/3/2010 1:44 PM, Eugene Grosbein wrote:
> On 03.12.2010 23:49, Jack Vogel wrote:
>> It has never been the case that 'down'ing an interface brings link down,
>> not on em
>> or igb. So this isn't problem with the release.
>>
>> Jack
> 
> Now I see, thanks.
> 
> Is it technically possible to bring link down
> for distinct port of dual-port em/igb-supported NICs using software?
> 
> If yes, I'd like to patch my source tree.
> For EtherChannel this kind of management should be possible.


If your switch port's speed and duplex are set manually, change the media
options on the NIC to something like 10 half duplex. The mismatch should
make the switch see the port as "down".
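
As a rough sketch of what that looks like on the command line: this
assumes the interface is em0 and that the plain 10baseT/UTP entry listed
by ifconfig -m em0 is the half-duplex one, so check what your card
actually offers. The helper names are hypothetical, and they only print
the commands rather than run them.

```shell
#!/bin/sh
# Sketch only: print the ifconfig commands for forcing a speed/duplex
# mismatch and for undoing it. Pipe the output to sh to apply it.

# Command that pins the NIC to 10 Mbit (assumed half duplex on em).
force_mismatch_cmd() {
    printf 'ifconfig %s media 10baseT/UTP\n' "$1"
}

# Command that returns the NIC to autonegotiation.
restore_cmd() {
    printf 'ifconfig %s media autoselect\n' "$1"
}
```

Printing first makes it easy to eyeball the command before touching a
production port, e.g. force_mismatch_cmd em0 | sh to apply it and
restore_cmd em0 | sh to bring the link back.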

---Mike

> 
> Eugene Grosbein



Re: vlan limits on e1000?

2010-12-07 Thread Mike Tancsa
On 12/6/2010 8:18 PM, Mihai-Catalin Salgau wrote:
> Hello Freebsd-net,
> 
>   I have two dual port NICs, one Broadcom(bce0,bce1) and one Intel(em0,em1), 
> on FreeBSD 8-stable
>   (about two weeks old) with a DHCP server running.

Hi,
There were a bunch of changes to RELENG_8's em driver a week ago.
Perhaps update to that first.  But what sort of em NICs do you have?
pciconf -lvc will show it.  I have a number of boxes with 20 or more
vlans:

 ifconfig | grep ^vlan | wc
      20     120    1562

Most of them are PCIe based, or onboard 82574L types.

---Mike

> I've been successfully using a large number of vlans over bce1, em0 and
> em1 with iSCSI, but wanted to switch to AoE (ATA over Ethernet). I've set
> vlandevs by round-robin, and got vlan1 on bce0, vlan2 on em0, vlan3 on
> em1, vlan4 on bce0 ... vlan12 on em1. I've bound net/vblade instances to
> each interface, but the problem I'm facing now is that while vlans 1-10
> are working properly, vlans 11 and 12 won't see any traffic unless the
> interface is in promiscuous mode. I noticed that while trying to attach
> tcpdump and saw the thing instantly work. I've had no problems with iSCSI
> over the same setup, and DHCP packets are getting through properly. I've
> moved those last two vlans to bce0 and they work OK, but I'm a bit stuck
> on why this is happening. Are there any known limitations on vlans on
> e1000?
>   
> 



Re: vlan limits on e1000?

2010-12-07 Thread Mike Tancsa
On 12/7/2010 6:45 PM, Mihai-Catalin Salgau wrote:
> Hello Mike,
> 
> Tuesday, December 7, 2010, 8:13:26 PM, you wrote:
> 
>> Hi,
>> There were a bunch of changes to RELENG_8's em driver a week ago.
>> Perhaps update to that first.  But what sort of em nics do you have ?
>> pciconf -lvc will show it.  I have a number of boxes with 20 or more
> 
>>  ifconfig | grep ^vlan | wc
>>       20     120    1562
> 
>> Most of which are pcie based, or onboard 82574L types.
> 
> The Intel card yields: Intel Corporation HP NC360T PCIe DP Gigabit Server 
> Adapter (n1e5132)
> Done updating and still the same thing:(

Strange, I have many of those cards as well and they work fine for
me.  How are you creating the vlans?  i.e. what is the syntax?
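
For comparison, a sketch of one common way to create them on FreeBSD,
assuming one cloned vlan interface per tag, named vlan<tag>. The helper
name is hypothetical, and it only prints the commands so they can be
checked before being run.

```shell
#!/bin/sh
# Sketch only: print one "ifconfig ... create" command per vlan tag.
# Usage: vlan_create_cmds <parent-interface> <tag> [<tag> ...]
vlan_create_cmds() {
    parent=$1
    shift
    for tag in "$@"; do
        # The vlan<tag> interface name is a naming convention, not a requirement.
        printf 'ifconfig vlan%s create vlan %s vlandev %s\n' \
            "$tag" "$tag" "$parent"
    done
}
```

For example, vlan_create_cmds em1 11 12 prints the two commands for tags
11 and 12 on em1; pipe the output to sh to actually create them.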



e...@pci0:1:0:1: class=0x02 card=0x115e8086 chip=0x105e8086 rev=0x06 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'HP NC360T PCIe DP Gigabit Server Adapter (n1e5132)'
    class      = network
    subclass   = ethernet


% ifconfig | grep em1
em1: flags=8843 metric 0 mtu 1500
vlan: 5 parent interface: em1
vlan: 770 parent interface: em1
vlan: 71 parent interface: em1
vlan: 73 parent interface: em1
vlan: 77 parent interface: em1
vlan: 155 parent interface: em1
vlan: 260 parent interface: em1
vlan: 107 parent interface: em1
vlan: 352 parent interface: em1
vlan: 522 parent interface: em1
vlan: 501 parent interface: em1
vlan: 543 parent interface: em1
vlan: 145 parent interface: em1
vlan: 250 parent interface: em1
vlan: 262 parent interface: em1
vlan: 872 parent interface: em1
vlan: 390 parent interface: em1
vlan: 58 parent interface: em1
vlan: 63 parent interface: em1
vlan: 740 parent interface: em1
vlan: 603 parent interface: em1
vlan: 1123 parent interface: em1
vlan: 502 parent interface: em1
vlan: 504 parent interface: em1
vlan: 1125 parent interface: em1
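
A small sketch for comparing boxes: it counts vlans per parent interface
from ifconfig output like the above. The function name is made up, and it
assumes each vlan line ends with the parent interface name as shown here.

```shell
#!/bin/sh
# Sketch only: read ifconfig output on stdin and print
# "<parent> <number of vlans>" for each parent interface, sorted by name.
vlans_per_parent() {
    awk '/vlan: [0-9]+ .*parent interface: / { count[$NF]++ }
         END { for (p in count) print p, count[p] }' | sort
}
```

Run it as ifconfig | vlans_per_parent; a parent with no vlan children
simply does not appear in the output.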

---Mike


Re: em driver, 82574L chip, and possibly ASPM

2010-12-24 Thread Mike Tancsa
On 12/24/2010 5:44 PM, Jan Koum wrote:
> hi Ivan and Mike,
> 
> wanted to follow up and see if you found a solid long-term solution to this
> bug. we are still seeing this problem in our 8.2 environment with ASPM
> already disabled.  here is what we have:

Hmmm,
With the latest version of the driver in RELENG_8 (it's the same as in
HEAD) I haven't seen the problem.  However, prior to that I would only
see it once per week.  The odd thing is that it would happen during a
slightly lower than normal backup load, but almost always at the same
time (early Sunday AM).  I'm not sure what exactly would trigger it.  If
it happened again, I was going to enable port mirroring on the switchport
and capture the traffic, hoping to catch some "special" pattern that
triggers the issue.

Do you have IPMI enabled on the NIC?  I tried to turn it off on my MB,
but there is no clear way to do this.  It 'seems' to be off, but I'm not
sure if it really is.  One thing I noticed was that when the NIC was hung,
it was still able to receive and process IPMI commands from an external
host.

---Mike

> 
> 1. motherboard is SuperMicro X8SIE-LN4F Intel Xeon:
> 
> e...@pci0:3:0:0: class=0x02 card=0x040d15d9 chip=0x10d38086 rev=0x00
> hdr=0x00
> vendor = 'Intel Corporation'
> device = 'Intel 82574L Gigabit Ethernet Controller (82574L)'
> class  = network
> subclass   = ethernet
> e...@pci0:4:0:0: class=0x02 card=0x040d15d9 chip=0x10d38086 rev=0x00
> hdr=0x00
> vendor = 'Intel Corporation'
> device = 'Intel 82574L Gigabit Ethernet Controller (82574L)'
> class  = network
> subclass   = ethernet
> e...@pci0:5:0:0: class=0x02 card=0x040d15d9 chip=0x10d38086 rev=0x00
> hdr=0x00
> vendor = 'Intel Corporation'
> device = 'Intel 82574L Gigabit Ethernet Controller (82574L)'
> class  = network
> subclass   = ethernet
> e...@pci0:6:0:0: class=0x02 card=0x040d15d9 chip=0x10d38086 rev=0x00
> hdr=0x00
> vendor = 'Intel Corporation'
> device = 'Intel 82574L Gigabit Ethernet Controller (82574L)'
> class  = network
> subclass   = ethernet
> 
> 2. ASPM is already disabled in the BIOS
> 
> 3. when em1 interface locks up, sysctl debug says:
> 
> Interface is NOT RUNNING
> and INACTIVE
> em1: hw tdh = 0, hw tdt = 0
> em1: hw rdh = 0, hw rdt = 0
> em1: Tx Queue Status = 0
> em1: TX descriptors avail = 110
> em1: Tx Descriptors avail failure = 319
> em1: RX discarded packets = 0
> em1: RX Next to Check = 80
> em1: RX Next to Refresh = 80
> 
> 4. doing "ifconfig em1 down; sleep 1; ifconfig em1 up" resolves the issue
> and removes OACTIVE flag from em1.
> 
> 5. we are running 8.2-PRERELEASE from December 19th:
> % grep '$FreeBSD' /usr/src/sys/dev/e1000/if_em.c
> /*$FreeBSD: src/sys/dev/e1000/if_em.c,v 1.21.2.18 2010/12/14 19:59:39 jfv
> Exp $*/
> 
> dmesg output is:
> 
> em1:  port 0xcc00-0xcc1f mem
> 0xfb4e-0xfb4f,0xfb4dc000-0xfb4d irq 17 at device 0.0 on pci4
> em1: Reserved 0x2 bytes for rid 0x10 type 3 at 0xfb4e
> em1: Reserved 0x4000 bytes for rid 0x1c type 3 at 0xfb4dc000
> em1: attempting to allocate 3 MSI-X vectors (5 supported)
> msi: routing MSI-X IRQ 259 to local APIC 0 vector 53
> msi: routing MSI-X IRQ 260 to local APIC 0 vector 54
> msi: routing MSI-X IRQ 261 to local APIC 0 vector 55
> em1: using IRQs 259-261 for MSI-X
> em1: Using MSIX interrupts with 3 vectors
> em1: [MPSAFE]
> em1: [ITHREAD]
> em1: [MPSAFE]
> em1: [ITHREAD]
> em1: [MPSAFE]
> em1: [ITHREAD]
> em1: bpf attached
> em1: Ethernet address: 00:25:90:0e:25:e9
> 
> aside from running cronjob every minute to check for dead interface and
> reset it, is there anything else we can try?
> 
> thanks.
> 
> 
> On Tue, Nov 23, 2010 at 10:36 AM, Jack Vogel  wrote:
> 
>> 82574 is supposed to be em, not igb :)  Its always had this kind of
>> 'in-between'
>> status, it was targeted as a 'client' or consumer part, but it has MSIX
>> which
>> make it almost like 8257[56].
>>
>> Mike, there are some further 82574 changes to shared code that I'm looking
>> into today.
>>
>> Jack
>>
>>
>> On Tue, Nov 23, 2010 at 10:17 AM, Mike Tancsa  wrote:
>>
>>> On 11/23/2010 12:39 PM, Sean Bruno wrote:
>>>> On Tue, 2010-11-23 at 04:47 -0800, Ivan Voras wrote:
>>>>> It looks like I'm unfortunate enough to have to deploy on a machine
>>>>> which has the 82574L Intel NIC chip on a Supermicro X8SIE-F board,
>> which
>>>> i...@pci0
