Re: ix(intel) vs mlxen(mellanox) 10Gb performance

2015-08-18 Thread Daniel Braniss

> On Aug 18, 2015, at 12:49 AM, Rick Macklem  wrote:
> 
> Daniel Braniss wrote:
>> 
>>> On Aug 17, 2015, at 3:21 PM, Rick Macklem  wrote:
>>> 
>>> Daniel Braniss wrote:
 
>>>>> On Aug 17, 2015, at 1:41 PM, Christopher Forgeron
>>>>> wrote:
>>>>> 
>>>>> FYI, I can regularly hit 9.3 Gib/s with my Intel X520-DA2's and FreeBSD
>>>>> 10.1. Before 10.1 it was less.
>>>>> 
>>>> 
>>>> this is NOT iperf/3 where i do get close to wire speed,
>>>> it’s NFS writes, i.e., almost real work :-)
>>>> 
>>>>> I used to tweak the card settings, but now it's just stock. You may want
>>>>> to check your settings, the Mellanox may just have better defaults for
>>>>> your switch.
>>>>> 
>>> Have you tried disabling TSO for the Intel? With TSO enabled, it will be
>>> copying every transmitted mbuf chain to a new chain of mbuf clusters via
>>> m_defrag(). (Assuming you don't have an 82598 chip. Most seem to be the
>>> 82599 chip these days?)
>>> 
>> 
>> hi Rick
>> 
>> how can i check the chip?
>> 
> Haven't a clue. Does "dmesg" tell you? (To be honest, since disabling TSO 
> helped,
> I'll bet you don't have a 82598.)
> 
>>> This has been fixed in the driver very recently, but those fixes won't be
>>> in 10.1.
>>> 
>>> rick
>>> ps: If you could test with 10.2, it would be interesting to see how the ix
>>> does with
>>>   the current driver fixes in it?
>> 
>> I knew TSO was involved!
>> ok, firstly, it’s 10.2 stable.
>> with TSO enabled, ix is bad, around 64 MB/s.
>> disabling TSO it’s better, around 130 MB/s.
>> 
> Hmm, could you check to see if these lines are in sys/dev/ixgbe/if_ix.c at
> around line #2500?
>  /* TSO parameters */
> 2572   ifp->if_hw_tsomax = 65518;
> 2573   ifp->if_hw_tsomaxsegcount = IXGBE_82599_SCATTER;
> 2574   ifp->if_hw_tsomaxsegsize = 2048;
> 
> They are in stable/10. I didn't look at releng/10.2. (And if they're in a 
> #ifdef
> for FreeBSD11, take the #ifdef away.)
> If they are there and not ifdef'd, I can't explain why disabling TSO would 
> help.
> Once TSO is fixed so that it handles the 64K transmit segments without 
> copying all
> the mbufs, I suspect you might get better perf. with it enabled?
> 

this is 10.2 :
they are on lines  2509-2511 and I don’t see any #ifdefs around it.

the plot thickens :-)

danny

> Good luck with it, rick
> 
>> still, mlxen0 is about 250! with and without TSO
>> 
>> 
>>> 
> On Mon, Aug 17, 2015 at 6:41 AM, Slawa Olhovchenkov wrote:
> On Mon, Aug 17, 2015 at 10:27:41AM +0300, Daniel Braniss wrote:
> 
>> hi,
>> I have a host (Dell R730) with both cards, connected to an HP8200
>> switch at 10Gb.
>> when writing to the same storage (netapp) this is what I get:
>> ix0:    ~130 MB/s
>> mlxen0: ~330 MB/s
>> this is via nfs/tcpv3
>> 
>> I can get similar (bad) performance with the mellanox if I increase
>> the file size to 512 MB.
> 
> Looks like the mellanox has an internal buffer for caching and does ACK
> acceleration.
> 
>> so at face value, it seems the mlxen does a better use of resources
>> than the intel.
>> Any ideas how to improve ix/intel's performance?
> 
> Are you sure about netapp performance?

___
freebsd-net@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"

pf and new interface

2015-08-18 Thread Andriy Gapon

I have the following rule in pf.conf:
set skip on tap
and even the following one:
set skip on tap0

The rules are loaded at the system start-up time, but the tap interface
may not be created until much later.  When tap0 is first created the
skip rules are not applied to it and the traffic gets filtered.  If I
reload the pf configuration, then the rules start working.

Is there a way to make pf honor such rules for the dynamic interfaces?

-- 
Andriy Gapon


Re: pf and new interface

2015-08-18 Thread wishmaster

  

 --- Original message ---
From: "Andriy Gapon" 
Date: 18 August 2015, 14:05:15

 
> I have the following rule in pf.conf:
> set skip on tap
> and even the following one:
> set skip on tap0
> 
> The rules are loaded at the system start-up time, but the tap interface
> may not be created until much later.  When tap0 is first created the
> skip rules are not applied to it and the traffic gets filtered.  If I
> reload the pf configuration, then the rules start working.
> 
> Is there a way to make pf honor such rules for the dynamic interfaces?

Hi,

You should do it in your application, e.g. in mpd this is something like below

        set iface up-script /usr/local/etc/mpd5/link_up.sh
        set iface down-script /usr/local/etc/mpd5/link_down.sh

in openvpn - see manuals.

Cheers,
Vitaliy
 
 

Re: pf and new interface

2015-08-18 Thread Andriy Gapon
On 18/08/2015 14:18, wishmaster wrote:
>  --- Original message ---
> From: "Andriy Gapon" 
> Date: 18 August 2015, 14:05:15
> 
>  
>> I have the following rule in pf.conf:
>> set skip on tap
>> and even the following one:
>> set skip on tap0
>>
>> The rules are loaded at the system start-up time, but the tap interface
>> may not be created until much later.  When tap0 is first created the
>> skip rules are not applied to it and the traffic gets filtered.  If I
>> reload the pf configuration, then the rules start working.
>>
>> Is there a way to make pf honor such rules for the dynamic interfaces?
> 
> Hi,
> 
> You should do it in your application, e.g. in mpd this is something like below
> 
> set iface up-script /usr/local/etc/mpd5/link_up.sh
> set iface down-script /usr/local/etc/mpd5/link_down.sh
> 
> in openvpn - see manuals.

That's a good suggestion.  But how to add a single rule for pf?
Reloading the whole configuration is disruptive to existing connections.

-- 
Andriy Gapon


Re[2]: pf and new interface

2015-08-18 Thread wishmaster


 
 --- Original message ---
 From: "Andriy Gapon" 
 Date: 18 August 2015, 14:35:36
  


> On 18/08/2015 14:18, wishmaster wrote:
> > --- Original message ---
> > From: "Andriy Gapon" 
> > Date: 18 August 2015, 14:05:15
> > 
> > 
> >> I have the following rule in pf.conf:
> >> set skip on tap
> >> and even the following one:
> >> set skip on tap0
> >>
> >> The rules are loaded at the system start-up time, but the tap interface
> >> may not be created until much later. When tap0 is first created the
> >> skip rules are not applied to it and the traffic gets filtered. If I
> >> reload the pf configuration, then the rules start working.
> >>
> >> Is there a way to make pf honor such rules for the dynamic interfaces?
> > 
> > Hi,
> > 
> > You should do it in your application, e.g. in mpd this is something like 
> > below
> > 
> > set iface up-script /usr/local/etc/mpd5/link_up.sh
> > set iface down-script /usr/local/etc/mpd5/link_down.sh
> > 
> > in openvpn - see manuals.
> 
> That's a good suggestion. But how to add a single rule for pf?
> Reloading the whole configuration is disruptive to existing connections.


Use anchors.
Small example:

# VPN Interface Up Script
#
# Script is called like this:
#
#   script  interface proto local-ip remote-ip authname
#           $1        $2    $3       $4        $5
#

anchor "ng-int/*"

# less if-up.sh
#!/bin/sh
echo "pass quick on $1 all" | pfctl -a ng-int/$1 -f -

# less if-down.sh
#!/bin/sh
pfctl -a ng-int/$1 -F rules

 
 


Re: pf and new interface

2015-08-18 Thread Andriy Gapon
On 18/08/2015 14:55, wishmaster wrote:
>  --- Original message ---
>  From: "Andriy Gapon" 
>  Date: 18 August 2015, 14:35:36
>   
> 
> 
>> On 18/08/2015 14:18, wishmaster wrote:
>>> --- Original message ---
>>> From: "Andriy Gapon" 
>>> Date: 18 August 2015, 14:05:15
>>>
>>>
>>>> I have the following rule in pf.conf:
>>>> set skip on tap
>>>> and even the following one:
>>>> set skip on tap0
>>>>
>>>> The rules are loaded at the system start-up time, but the tap interface
>>>> may not be created until much later. When tap0 is first created the
>>>> skip rules are not applied to it and the traffic gets filtered. If I
>>>> reload the pf configuration, then the rules start working.
>>>>
>>>> Is there a way to make pf honor such rules for the dynamic interfaces?
>>>
>>> Hi,
>>>
>>> You should do it in your application, e.g. in mpd this is something like 
>>> below
>>>
>>> set iface up-script /usr/local/etc/mpd5/link_up.sh
>>> set iface down-script /usr/local/etc/mpd5/link_down.sh
>>>
>>> in openvpn - see manuals.
>>
>> That's a good suggestion. But how to add a single rule for pf?
>> Reloading the whole configuration is disruptive to existing connections.
> 
> 
> Use anchors.

Thank you for the hint!

> Small example:
> 
> # VPN Interface Up Script
> #
> # Script is called like this:
> #
> #   script  interface proto local-ip remote-ip authname
> #           $1        $2    $3       $4        $5
> #
> 
> anchor "ng-int/*"
> 
> # less if-up.sh
> #!/bin/sh
> echo "pass quick on $1 all" | pfctl -a ng-int/$1 -f -
> 
> # less if-down.sh
> #!/bin/sh
> pfctl -a ng-int/$1 -F rules
> 
>  
>  
> 


-- 
Andriy Gapon


Re: ix(intel) vs mlxen(mellanox) 10Gb performance

2015-08-18 Thread Rick Macklem
Daniel Braniss wrote:
> 
> > On Aug 18, 2015, at 12:49 AM, Rick Macklem  wrote:
> > 
> > Daniel Braniss wrote:
> >> 
> >>> On Aug 17, 2015, at 3:21 PM, Rick Macklem  wrote:
> >>> 
> >>> Daniel Braniss wrote:
>  
> >>>>> On Aug 17, 2015, at 1:41 PM, Christopher Forgeron
> >>>>> wrote:
> >>>>> 
> >>>>> FYI, I can regularly hit 9.3 Gib/s with my Intel X520-DA2's and FreeBSD
> >>>>> 10.1. Before 10.1 it was less.
> >>>>> 
> >>>> 
> >>>> this is NOT iperf/3 where i do get close to wire speed,
> >>>> it’s NFS writes, i.e., almost real work :-)
> >>>> 
> >>>>> I used to tweak the card settings, but now it's just stock. You may
> >>>>> want to check your settings, the Mellanox may just have better defaults
> >>>>> for your switch.
> >>>>> 
> >>> Have you tried disabling TSO for the Intel? With TSO enabled, it will be
> >>> copying every transmitted mbuf chain to a new chain of mbuf clusters via
> >>> m_defrag(). (Assuming you don't have an 82598 chip. Most seem to be the
> >>> 82599 chip these days?)
> >>> 
> >> 
> >> hi Rick
> >> 
> >> how can i check the chip?
> >> 
> > Haven't a clue. Does "dmesg" tell you? (To be honest, since disabling TSO
> > helped,
> > I'll bet you don't have a 82598.)
> > 
> >>> This has been fixed in the driver very recently, but those fixes won't be
> >>> in 10.1.
> >>> 
> >>> rick
> >>> ps: If you could test with 10.2, it would be interesting to see how the
> >>> ix
> >>> does with
> >>>   the current driver fixes in it?
> >> 
> >> I knew TSO was involved!
> >> ok, firstly, it’s 10.2 stable.
> >> with TSO enabled, ix is bad, around 64 MB/s.
> >> disabling TSO it’s better, around 130 MB/s.
> >> 
> > Hmm, could you check to see if these lines are in sys/dev/ixgbe/if_ix.c at
> > around line #2500?
> >  /* TSO parameters */
> > 2572 ifp->if_hw_tsomax = 65518;
> > 2573 ifp->if_hw_tsomaxsegcount = IXGBE_82599_SCATTER;
> > 2574 ifp->if_hw_tsomaxsegsize = 2048;
> > 
> > They are in stable/10. I didn't look at releng/10.2. (And if they're in a
> > #ifdef
> > for FreeBSD11, take the #ifdef away.)
> > If they are there and not ifdef'd, I can't explain why disabling TSO would
> > help.
> > Once TSO is fixed so that it handles the 64K transmit segments without
> > copying all
> > the mbufs, I suspect you might get better perf. with it enabled?
> > 
> 
> this is 10.2 :
> they are on lines  2509-2511 and I don’t see any #ifdefs around it.
> 
> the plot thickens :-)
> 
If this is just a test machine, maybe you could test with these lines (at about
line #880) in sys/netinet/tcp_output.c commented out? (It looks to me like they
will disable TSO for almost all the NFS writes.)
- around line #880 in sys/netinet/tcp_output.c:
	/*
	 * In case there are too many small fragments
	 * don't use TSO:
	 */
	if (len <= max_len) {
		len = max_len;
		sendalot = 1;
		tso = 0;
	}

This was added along with the other stuff that did the if_hw_tsomaxsegcount,
etc. and I never noticed it until now (not my patch).

rick


Re: ix(intel) vs mlxen(mellanox) 10Gb performance

2015-08-18 Thread Hans Petter Selasky

On 08/18/15 14:53, Rick Macklem wrote:

> If this is just a test machine, maybe you could test with these lines (at
> about #880) in sys/netinet/tcp_output.c commented out? (It looks to me like
> this will disable TSO for almost all the NFS writes.)
> - around line #880 in sys/netinet/tcp_output.c:
> 	/*
> 	 * In case there are too many small fragments
> 	 * don't use TSO:
> 	 */
> 	if (len <= max_len) {
> 		len = max_len;
> 		sendalot = 1;
> 		tso = 0;
> 	}
>
> This was added along with the other stuff that did the
> if_hw_tsomaxsegcount, etc and I never noticed it until now (not my patch).


FYI:

These lines are needed by other hardware, like the mlxen driver. If you 
remove them mlxen will start doing m_defrag(). I believe if you set the 
correct parameters in the "struct ifnet" for the TSO size/count limits 
this problem will go away. If you print the "len" and "max_len" and also 
the cases where TSO limits are reached, you'll see what parameter is 
triggering it and needs to be increased.


--HPS
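
A minimal userland sketch of the diagnosis suggested above, not the kernel
code: it walks a list of buffer lengths standing in for an mbuf chain, applies
the three TSO limits, and reports which one stopped the accumulation. The names
tso_limits, tso_stop and tso_payload_fit are hypothetical, and the way the
limits are combined is a simplifying assumption rather than what tcp_output()
does line for line.

```c
#include <assert.h>
#include <stddef.h>

/* Which limit ended the accumulation (illustrative names, not kernel ones). */
enum tso_stop { TSO_STOP_NONE, TSO_STOP_MAXBYTES, TSO_STOP_SEGCOUNT };

struct tso_limits {
	unsigned max_bytes;	/* role of if_hw_tsomax */
	unsigned max_segcount;	/* role of if_hw_tsomaxsegcount */
	unsigned max_segsize;	/* role of if_hw_tsomaxsegsize */
};

/*
 * Return how many payload bytes from buf_len[0..nbufs) fit under the
 * limits; *why is set to the limit that was hit, if any.
 */
unsigned
tso_payload_fit(const unsigned *buf_len, size_t nbufs,
    const struct tso_limits *lim, enum tso_stop *why)
{
	unsigned total = 0;
	unsigned segs_left = lim->max_segcount;

	assert(lim->max_segsize > 0);
	*why = TSO_STOP_NONE;
	for (size_t i = 0; i < nbufs; i++) {
		unsigned len = buf_len[i];
		/* Segments needed, rounded up; an empty buffer still uses one. */
		unsigned segs = len == 0 ? 1 :
		    (len + lim->max_segsize - 1) / lim->max_segsize;
		unsigned take = len;
		enum tso_stop hit = TSO_STOP_NONE;

		if (segs > segs_left) {
			/* Only the remaining segments' worth of bytes fits. */
			take = segs_left * lim->max_segsize;
			hit = TSO_STOP_SEGCOUNT;
		}
		if (take > lim->max_bytes - total) {
			/* The byte limit is tighter than the segment limit. */
			take = lim->max_bytes - total;
			hit = TSO_STOP_MAXBYTES;
		}
		total += take;
		if (hit != TSO_STOP_NONE) {
			*why = hit;
			break;
		}
		segs_left -= segs;
	}
	return (total);
}
```

Feeding it two small buffers followed by 2048-byte clusters, with the limits
65518 / 32 / 2048 (the segment count of 32 is an assumption), stops on the
segment count well before the byte limit, which is the kind of case the
printouts would expose.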


Re: ix(intel) vs mlxen(mellanox) 10Gb performance

2015-08-18 Thread Hans Petter Selasky

On 08/18/15 14:53, Rick Macklem wrote:

> 2572 ifp->if_hw_tsomax = 65518;
> 2573 ifp->if_hw_tsomaxsegcount = IXGBE_82599_SCATTER;
> 2574 ifp->if_hw_tsomaxsegsize = 2048;


Hi,

If IXGBE_82599_SCATTER is the maximum scatter/gather entries the 
hardware can do, remember to subtract one fragment for the TCP/IP-header 
mbuf!


I think there is an off-by-one here:

ifp->if_hw_tsomax = 65518;
ifp->if_hw_tsomaxsegcount = IXGBE_82599_SCATTER - 1;
ifp->if_hw_tsomaxsegsize = 2048;

Refer to:


 *
 * NOTE: The TSO limits only apply to the data payload part of
 * a TCP/IP packet. That means there is no need to subtract
 * space for ethernet-, vlan-, IP- or TCP- headers from the
 * TSO limits unless the hardware driver in question requires
 * so.


In sys/net/if_var.h

Thank you!

--HPS
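
A worked sketch of the off-by-one described above, under stated assumptions:
the hardware limit is taken to be 32 descriptors (the thread does not give a
value for IXGBE_82599_SCATTER), every payload buffer fills a whole 2048-byte
segment, and the TCP/IP header occupies exactly one extra descriptor. The
helper names are hypothetical.

```c
#include <assert.h>

/* Hardware segments needed for len payload bytes, rounded up. */
unsigned
payload_segments(unsigned len, unsigned segsize)
{
	assert(segsize > 0);
	return ((len + segsize - 1) / segsize);
}

/* Largest payload the stack may build for an advertised segment count. */
unsigned
max_payload(unsigned advertised_segcount, unsigned segsize, unsigned max_bytes)
{
	unsigned by_segs = advertised_segcount * segsize;

	return (by_segs < max_bytes ? by_segs : max_bytes);
}

/*
 * Nonzero if the payload plus one header buffer fits in the hardware's
 * descriptor limit, i.e. no m_defrag()-style copy would be needed.
 */
int
fits_without_defrag(unsigned payload_len, unsigned segsize,
    unsigned hw_max_segments)
{
	return (payload_segments(payload_len, segsize) + 1 <= hw_max_segments);
}
```

With these numbers, advertising 32 lets the stack build a 65518-byte payload in
32 segments, which needs 33 descriptors once the header is prepended.
Advertising 31 caps the payload at 63488 bytes, so payload plus header is 32.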



Re: ix(intel) vs mlxen(mellanox) 10Gb performance

2015-08-18 Thread Daniel Braniss
sorry, it’s been a tough day, we had a major meltdown, caused by a faulty gbic :-(
anyways, could you tell me what to do?
comment out, fix the off by one?

the machine is not yet production.

thanks,
danny

> On 18 Aug 2015, at 16:32, Hans Petter Selasky  wrote:
> 
> On 08/18/15 14:53, Rick Macklem wrote:
>> 2572 ifp->if_hw_tsomax = 65518;
>> 2573 ifp->if_hw_tsomaxsegcount = IXGBE_82599_SCATTER;
>> 2574 ifp->if_hw_tsomaxsegsize = 2048;
> 
> Hi,
> 
> If IXGBE_82599_SCATTER is the maximum scatter/gather entries the hardware can 
> do, remember to subtract one fragment for the TCP/IP-header mbuf!
> 
> I think there is an off-by-one here:
> 
> ifp->if_hw_tsomax = 65518;
> ifp->if_hw_tsomaxsegcount = IXGBE_82599_SCATTER - 1;
> ifp->if_hw_tsomaxsegsize = 2048;
> 
> Refer to:
> 
>> *
>> * NOTE: The TSO limits only apply to the data payload part of
>> * a TCP/IP packet. That means there is no need to subtract
>> * space for ethernet-, vlan-, IP- or TCP- headers from the
>> * TSO limits unless the hardware driver in question requires
>> * so.
> 
> In sys/net/if_var.h
> 
> Thank you!
> 
> --HPS
> 


Re: ix(intel) vs mlxen(mellanox) 10Gb performance

2015-08-18 Thread Slawa Olhovchenkov
On Tue, Aug 18, 2015 at 05:09:41PM +0300, Daniel Braniss wrote:

> sorry, it's been a tough day, we had a major meltdown, caused by a faulty 
> gbic :-(
> anyways, could you tell me what to do?
> comment out, fix the off by one?
> 
> the machine is not yet production.

Can you collect this information?
https://lists.freebsd.org/pipermail/freebsd-stable/2015-August/083113.html

And 'show interface' (or equivalent: error/collision/event counters)
from both ports on the HP8200.


Re: Poor high-PPS performance of the 10G ixgbe(9) NIC/driver in FreeBSD 10.1

2015-08-18 Thread Maxim Sobolev
Yes, we've confirmed it's IXGBE_FDIR. It's good that it comes disabled in 10.2.

Thanks everyone for constructive input!

-Max


Re: Poor high-PPS performance of the 10G ixgbe(9) NIC/driver in FreeBSD 10.1

2015-08-18 Thread Adrian Chadd
you're welcome.

Someone should really add a release errata to 10.1 or something.


-a


On 18 August 2015 at 10:59, Maxim Sobolev  wrote:
> Yes, we've confirmed it's IXGBE_FDIR. That's good it comes disabled in 10.2.
>
> Thanks everyone for constructive input!
>
> -Max


Re: pf and new interface

2015-08-18 Thread Reko Turja

Hmm does the:

set skip on (tap)

syntax work in this case? Basically parentheses around the alias should tell 
pf that the IP is volatile and can be either activated at later time or it 
can be dynamic via dhcp etc.


-Reko 




Re: Poor high-PPS performance of the 10G ixgbe(9) NIC/driver in FreeBSD 10.1

2015-08-18 Thread Glen Barber
On Tue, Aug 18, 2015 at 11:18:33AM -0700, hiren panchasara wrote:
> On 08/18/15 at 11:03P, Adrian Chadd wrote:
> > you're welcome.
> > 
> > Someone should really add a release errata to 10.1 or something.
> 
> Yes, I strongly feel the same. Adding gjb@ here to see how that can be
> done.
> 

Please send to re@.

Glen





Re: pf and new interface

2015-08-18 Thread Andriy Gapon
On 18/08/2015 20:43, Reko Turja wrote:
> Hmm does the:
> 
> set skip on (tap)
> 
> syntax work in this case? Basically parentheses around the alias should
> tell pf that the IP is volatile and can be either activated at later
> time or it can be dynamic via dhcp etc.

I will check and follow up.

-- 
Andriy Gapon


Re: Poor high-PPS performance of the 10G ixgbe(9) NIC/driver in FreeBSD 10.1

2015-08-18 Thread hiren panchasara
On 08/18/15 at 06:25P, Glen Barber wrote:
> On Tue, Aug 18, 2015 at 11:18:33AM -0700, hiren panchasara wrote:
> > On 08/18/15 at 11:03P, Adrian Chadd wrote:
> > > you're welcome.
> > > 
> > > Someone should really add a release errata to 10.1 or something.
> > 
> > Yes, I strongly feel the same. Adding gjb@ here to see how that can be
> > done.
> > 
> 
> Please send to re@.

Will do.

Thanks,
Hiren




Re: Poor high-PPS performance of the 10G ixgbe(9) NIC/driver in FreeBSD 10.1

2015-08-18 Thread hiren panchasara
On 08/18/15 at 11:03P, Adrian Chadd wrote:
> you're welcome.
> 
> Someone should really add a release errata to 10.1 or something.

Yes, I strongly feel the same. Adding gjb@ here to see how that can be
done.

Cheers,
Hiren
> 
> 
> -a
> 
> 
> On 18 August 2015 at 10:59, Maxim Sobolev  wrote:
> > Yes, we've confirmed it's IXGBE_FDIR. That's good it comes disabled in 10.2.
> >
> > Thanks everyone for constructive input!
> >
> > -Max


Re: ix(intel) vs mlxen(mellanox) 10Gb performance

2015-08-18 Thread Rick Macklem
Hans Petter Selasky wrote:
> On 08/18/15 14:53, Rick Macklem wrote:
> > 2572 ifp->if_hw_tsomax = 65518;
> > 2573 ifp->if_hw_tsomaxsegcount = IXGBE_82599_SCATTER;
> > 2574 ifp->if_hw_tsomaxsegsize = 2048;
> 
> Hi,
> 
> If IXGBE_82599_SCATTER is the maximum scatter/gather entries the
> hardware can do, remember to subtract one fragment for the TCP/IP-header
> mbuf!
> 
Ouch! Yes, I now see that the code that counts the # of mbufs is before the
code that adds the tcp/ip header mbuf.

In my opinion, this should be fixed in tcp_output() by setting
if_hw_tsomaxsegcount to whatever the driver provides - 1. It is not the
driver's responsibility to know if a tcp/ip header mbuf will be added, and
that is a lot less confusing than expecting the driver author to know to
subtract one. (I had mistakenly thought that tcp_output() had added the
tcp/ip header mbuf before the loop that counts mbufs in the list. Btw, this
tcp/ip header mbuf also has leading space for the MAC layer header.)

> I think there is an off-by-one here:
> 
> ifp->if_hw_tsomax = 65518;
> ifp->if_hw_tsomaxsegcount = IXGBE_82599_SCATTER - 1;
> ifp->if_hw_tsomaxsegsize = 2048;
> 
> Refer to:
> 
> >  *
> >  * NOTE: The TSO limits only apply to the data payload part of
> >  * a TCP/IP packet. That means there is no need to subtract
> >  * space for ethernet-, vlan-, IP- or TCP- headers from the
> >  * TSO limits unless the hardware driver in question requires
> >  * so.
> 
This comment suggests that the driver author doesn't need to do this.

However, unless this is fixed in tcp_output(), the above patch should be
applied to the driver.
> In sys/net/if_var.h
> 
> Thank you!
> 
> --HPS
> 
The problem I see is that, after doing the calculation of how many mbufs can
be in the TSO segment, the code in tcp_output() will have calculated a value
for "len" that will always be less than "tp->t_maxopd - optlen" when the
if_hw_tsomaxsegcount limit has been hit (see where it does a "break;" out of
the while loop).
--> This does not imply "too many small fragments" for NFS, just that the
    driver's transmit segment limit has been reached, where most of them
    are mbuf clusters, but not the first ones.
As such the code:
	/*
	 * In case there are too many small fragments
	 * don't use TSO:
	 */
	if (len <= max_len) {
		len = max_len;
		sendalot = 1;
		tso = 0;
	}
will always happen for this case and "tso" gets set to 0. Not what we want to
happen, imho.
The above code block was what I suggested should be commented out or deleted
for the test.

It appears you should also add the "- 1" in the driver sys/dev/ixgbe/if_ix.c.

rick



Re: ix(intel) vs mlxen(mellanox) 10Gb performance

2015-08-18 Thread Rick Macklem
Hans Petter Selasky wrote:
> On 08/18/15 14:53, Rick Macklem wrote:
> > If this is just a test machine, maybe you could test with these lines (at
> > about #880)
> > in sys/netinet/tcp_output.c commented out? (It looks to me like this will
> > disable TSO
> > for almost all the NFS writes.)
> > - around line #880 in sys/netinet/tcp_output.c:
> > 	/*
> > 	 * In case there are too many small fragments
> > 	 * don't use TSO:
> > 	 */
> > 	if (len <= max_len) {
> > 		len = max_len;
> > 		sendalot = 1;
> > 		tso = 0;
> > 	}
> >
> > This was added along with the other stuff that did the
> > if_hw_tsomaxsegcount, etc and I
> > never noticed it until now (not my patch).
> 
> FYI:
> 
> These lines are needed by other hardware, like the mlxen driver. If you
> remove them mlxen will start doing m_defrag(). I believe if you set the
> correct parameters in the "struct ifnet" for the TSO size/count limits
> this problem will go away. If you print the "len" and "max_len" and also
> the cases where TSO limits are reached, you'll see what parameter is
> triggering it and needs to be increased.
> 
Well, if the driver isn't setting if_hw_tsomaxsegcount correctly, then it
is the driver that needs to be fixed.
Having the above code block disable TSO for all of the NFS writes, including
the ones that set if_hw_tsomaxsegcount correctly doesn't make sense to me.
If the driver authors don't set these, the drivers do lots of m_defrag()
calls. I have posted more than once to freebsd-net@ asking the driver authors
to set these and some now have. (I can't do it, because I don't have the
hardware to test it with.)

I do think that most/all of them don't subtract 1 for the tcp/ip header and
I don't think they should be expected to, since the driver isn't supposed to
worry about the protocol at that level.
--> I think tcp_output() should subtract one from the if_hw_tsomaxsegcount
provided by the driver to handle this, since it chooses to count mbufs
(the while() loop at around line #825 in sys/netinet/tcp_output.c.)
before it prepends the tcp/ip header mbuf.

rick

> --HPS
> 


Panic [page fault] in _ieee80211_crypto_delkey(): stable/10/amd64 @r286878

2015-08-18 Thread David Wolfskill
I was minding my own business in a staff meeting this afternoon, and my
laptop rebooted; seems it got a panic.  I've copied the core.txt.0 file
to , along with a
verbose dmesg.boot from this morning and output of "pciconf -l -v".

This was running:
FreeBSD localhost 10.2-STABLE FreeBSD 10.2-STABLE #122  
r286878M/286880:1002500: Tue Aug 18 04:06:33 PDT 2015 
r...@g1-252.catwhisker.org:/common/S1/obj/usr/src/sys/CANARY  amd64

Excerpts from core.txt.0:

panic: page fault
...
Unread portion of the kernel message buffer:
panic: page fault
cpuid = 2
KDB: stack backtrace:
#0 0x80946e00 at kdb_backtrace+0x60
#1 0x8090a9e6 at vpanic+0x126
#2 0x8090a8b3 at panic+0x43
#3 0x80c8467b at trap_fatal+0x36b
#4 0x80c8497d at trap_pfault+0x2ed
#5 0x80c8401a at trap+0x47a
#6 0x80c6a1b2 at calltrap+0x8
#7 0x809eff5e at ieee80211_crypto_delkey+0x1e
#8 0x80a04d45 at ieee80211_ioctl_delkey+0x65
#9 0x80a03bd2 at ieee80211_ioctl_set80211+0x572
#10 0x80a2c323 at in_control+0x203
#11 0x809cd57b at ifioctl+0x15eb
#12 0x8095ecf5 at kern_ioctl+0x255
#13 0x8095e9f0 at sys_ioctl+0x140
#14 0x80c84f97 at amd64_syscall+0x357
#15 0x80c6a49b at Xfast_syscall+0xfb
Uptime: 9h45m0s
...
Loaded symbols for /usr/local/modules/rtc.ko
#0  doadump (textdump=) at pcpu.h:219
219 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) #0  doadump (textdump=) at pcpu.h:219
#1  0x8090a642 in kern_reboot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:451
#2  0x8090aa25 in vpanic (fmt=, 
ap=) at /usr/src/sys/kern/kern_shutdown.c:758
#3  0x8090a8b3 in panic (fmt=0x0)
at /usr/src/sys/kern/kern_shutdown.c:687
#4  0x80c8467b in trap_fatal (frame=, 
eva=) at /usr/src/sys/amd64/amd64/trap.c:851
#5  0x80c8497d in trap_pfault (frame=0xfe060d88b510, 
usermode=) at /usr/src/sys/amd64/amd64/trap.c:674
#6  0x80c8401a in trap (frame=0xfe060d88b510)
at /usr/src/sys/amd64/amd64/trap.c:440
#7  0x80c6a1b2 in calltrap ()
at /usr/src/sys/amd64/amd64/exception.S:236
#8  0x809f003a in _ieee80211_crypto_delkey ()
at /usr/src/sys/net80211/ieee80211_crypto.c:105
#9  0x809eff5e in ieee80211_crypto_delkey (vap=0xfe03d907, 
key=0xfe03d9070800) at /usr/src/sys/net80211/ieee80211_crypto.c:461
#10 0x80a04d45 in ieee80211_ioctl_delkey (vap=0xfe03d907, 
ireq=)
at /usr/src/sys/net80211/ieee80211_ioctl.c:1252
#11 0x80a03bd2 in ieee80211_ioctl_set80211 ()
at /usr/src/sys/net80211/ieee80211_ioctl.c:2814
#12 0x80a2c323 in in_control (so=, 
cmd=9214790412651315593, data=0xfe060d88bb80 "", ifp=0x3, 
td=) at /usr/src/sys/netinet/in.c:308
#13 0x809cd57b in ifioctl (so=0xfe03d9070800, cmd=2149607914, 
data=0xfe060d88b8e0 "wlan0", td=0xf80170abb940)
at /usr/src/sys/net/if.c:2770
#14 0x8095ecf5 in kern_ioctl (td=0xf80170abb940, 
fd=, com=18446741891212314624) at file.h:320
#15 0x8095e9f0 in sys_ioctl (td=0xf80170abb940, 
uap=0xfe060d88ba40) at /usr/src/sys/kern/sys_generic.c:718
#16 0x80c84f97 in amd64_syscall (td=0xf80170abb940, traced=0)
at subr_syscall.c:134
#17 0x80c6a49b in Xfast_syscall ()
at /usr/src/sys/amd64/amd64/exception.S:396
#18 0x0008012a2f9a in ?? ()
Previous frame inner to this frame (corrupt stack?)
Current language:  auto; currently minimal
(kgdb) 


Physical 802.11 hardware is iwn(4).

I can copy the vmcore.0 file itself after I get home -- it's ~625MB,
and I'd rather not try to get that through over a WAN before I need
to catch the shuttle to get home. :-}

Peace,
david
-- 
David H. Wolfskill  da...@catwhisker.org
Those who would murder in the name of God or prophet are blasphemous cowards.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.




Re: ix(intel) vs mlxen(mellanox) 10Gb performance

2015-08-18 Thread Rick Macklem
Daniel Braniss wrote:
> 
> > On Aug 18, 2015, at 12:49 AM, Rick Macklem  wrote:
> > 
> > Daniel Braniss wrote:
> >> 
> >>> On Aug 17, 2015, at 3:21 PM, Rick Macklem  wrote:
> >>> 
> >>> Daniel Braniss wrote:
>  
> > On Aug 17, 2015, at 1:41 PM, Christopher Forgeron wrote:
> > 
> > FYI, I can regularly hit 9.3 Gib/s with my Intel X520-DA2's and FreeBSD
> > 10.1. Before 10.1 it was less.
> > 
>  
>  this is NOT iperf/3 where i do get close to wire speed,
>  it’s NFS writes, i.e., almost real work :-)
>  
> > I used to tweak the card settings, but now it's just stock. You may want
> > to check your settings, the Mellanox may just have better defaults for
> > your switch.
> > 
> > >>> Have you tried disabling TSO for the Intel? With TSO enabled, it will be
> > >>> copying every transmitted mbuf chain to a new chain of mbuf clusters via
> > >>> m_defrag(). (Assuming you aren't an 82598 chip. Most seem to be the 82599
> > >>> chip these days?)
> >>> 
Oops, I think I screwed up. It looks like t_maxopd is limited to somewhat less
than the mtu.

If that is the case, the code block wouldn't do what I thought it would do.

However, if_hw_tsomaxsegcount does need to be one less than the limit for the
driver, since the tcp/ip header isn't yet prepended when it is counted.

I think the code in tcp_output() should subtract 1, but you can change it in
the driver to test this.

Thanks for doing this, rick

> >> 
> >> hi Rick
> >> 
> >> how can i check the chip?
> >> 
> > Haven't a clue. Does "dmesg" tell you? (To be honest, since disabling TSO
> > helped,
> > I'll bet you don't have a 82598.)
> > 
> >>> This has been fixed in the driver very recently, but those fixes won't be
> >>> in 10.1.
> >>> 
> >>> rick
> > >>> ps: If you could test with 10.2, it would be interesting to see how the
> > >>>   ix does with the current driver fixes in it?
> >> 
> > >> I knew TSO was involved!
> >> ok, firstly, it’s 10.2 stable.
> >> with TSO enabled, ix is bad, around 64MGB/s.
> >> disabling TSO it’s better, around 130
> >> 
> > > Hmm, could you check to see if these lines are in sys/dev/ixgbe/if_ix.c at
> > > around line#2500?
> > >  /* TSO parameters */
> > > 2572 ifp->if_hw_tsomax = 65518;
> > > 2573 ifp->if_hw_tsomaxsegcount = IXGBE_82599_SCATTER;
> > > 2574 ifp->if_hw_tsomaxsegsize = 2048;
> > 
> > They are in stable/10. I didn't look at releng/10.2. (And if they're in a
> > #ifdef
> > for FreeBSD11, take the #ifdef away.)
> > If they are there and not ifdef'd, I can't explain why disabling TSO would
> > help.
> > Once TSO is fixed so that it handles the 64K transmit segments without
> > copying all
> > the mbufs, I suspect you might get better perf. with it enabled?
> > 
> 
> this is 10.2 :
> they are on lines  2509-2511 and I don’t see any #ifdefs around it.
> 
> the plot thickens :-)
> 
> danny
> 
> > Good luck with it, rick
> > 
> >> still, mlxen0 is about 250! with and without TSO
> >> 
> >> 
> >>> 
> > > On Mon, Aug 17, 2015 at 6:41 AM, Slawa Olhovchenkov wrote:
> > On Mon, Aug 17, 2015 at 10:27:41AM +0300, Daniel Braniss wrote:
> > 
> >> hi,
> >> I have a host (Dell R730) with both cards, connected to an HP8200
> >> switch at 10Gb.
> >> when writing to the same storage (netapp) this is what I get:
> >> ix0:~130MGB/s
> >> mlxen0  ~330MGB/s
> >> this is via nfs/tcpv3
> >> 
> > >> I can get similar (bad) performance with the mellanox if I increase
> > >> the file size to 512MGB.
> > 
> > > Looks like the mellanox has an internal buffer for caching and does ACK
> > > acceleration.
> > 
> > >> so at face value, it seems the mlxen makes better use of resources
> > >> than the intel.
> >> Any ideas how to improve ix/intel's performance?
> > 
> > Are you sure about netapp performance?
> > ___
> > freebsd-net@freebsd.org  mailing list
> > https://lists.freebsd.org/mailman/listinfo/freebsd-net
> > 
> > > To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
> > 
>  
>  ___
>  freebsd-sta...@freebsd.org mailing list
>  https://lists.freebsd.org/mailman/listinfo/freebsd-stable
>  To unsubscribe, send any mail to
>  "freebsd-stable-unsubscr...@freebsd.org"
> >> 
> >> ___
> >> freebsd-sta...@freebsd.org mailing list
> >> https://lists.freebsd.org/mailman/listinfo/freebsd-stable
> >> To unsubscribe, send any

Re: ix(intel) vs mlxen(mellanox) 10Gb performance

2015-08-18 Thread Hans Petter Selasky

On 08/18/15 23:54, Rick Macklem wrote:

Ouch! Yes, I now see that the code that counts the # of mbufs is before the
code that adds the tcp/ip header mbuf.

In my opinion, this should be fixed by setting if_hw_tsomaxsegcount to whatever
the driver provides - 1. It is not the driver's responsibility to know if a
tcp/ip header mbuf will be added, and this is a lot less confusing than
expecting the driver author to know to subtract one. (I had mistakenly thought
that tcp_output() had added the tcp/ip header mbuf before the loop that counts
mbufs in the list. Btw, this tcp/ip header mbuf also has leading space for the
MAC layer header.)



Hi Rick,

Your question is good. With the Mellanox hardware we have separate 
so-called inline data space for the TCP/IP headers, so if the TCP stack 
subtracts something, then we would need to add something to the limit, 
because then the scatter gather list is only used for the data part.


Maybe it can be controlled by some kind of flag, if all the three TSO 
limits should include the TCP/IP/ethernet headers too. I'm pretty sure 
we want both versions.


--HPS