On 17 Aug 2016, at 17:38, Adrian Chadd wrote:

[snip]

ok, so this is what I was seeing when I was working on this stuff last.

The big abusers are:

* so_snd lock, for TX'ing producer/consumer socket data
* tcp stack pcb locking (which rss tries to work around, but it again
  doesn't help producer/consumer locking, only multiple sockets)
* for s
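
(For anyone who wants to see this kind of contention themselves: a rough
DTrace sketch, assuming the kernel's lockstat provider is available; the
probe choice and aggregation are only illustrative, not what was used here.)

  # stacks where threads block on adaptive (sleep) mutexes such as the
  # sockbuf locks; rw locks (e.g. the inpcb locks) have a matching rw-block
  # probe. Run during an iperf test, Ctrl-C to dump the hottest stacks.
  dtrace -n 'lockstat:::adaptive-block { @[stack()] = count(); }'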

On 16 August 2016 at 02:58, Ben RUBSON wrote:

> On 16 Aug 2016, at 03:45, Adrian Chadd wrote:
>
> Hi,
>
> ok, can you try 5) but also running with the interrupt threads pinned
> to CPU 1?

What do you mean by interrupt threads?
Perhaps you mean the NIC interrupts?
In this case see 6) and 7) where NIC IRQs are pinned to CPUs 0-11 (6) an

On 16 Aug 2016, at 03:45, Adrian Chadd wrote:

Hi,

ok, can you try 5) but also running with the interrupt threads pinned to CPU 1?

It looks like the interrupt threads are running on CPU 0, and my
/guess/ (looking at the CPU usage distributions) is that sometimes the
userland bits run on the same CPU or numa domain as the interrupt
bits, and it l
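
(For reference, that sort of pinning can be done with cpuset(8); a small
sketch, where the IRQ and CPU numbers are placeholders rather than the ones
from this box:)

  # list the NIC's interrupt vectors (look for the mlx4 lines)
  vmstat -i

  # bind one of them, say irq264, to CPU 1
  cpuset -l 1 -x 264

  # and keep the userland side on other CPUs
  cpuset -l 2-11 iperf -s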

On 15 Aug 2016, at 16:49, Ben RUBSON wrote:

> On 12 Aug 2016, at 00:52, Adrian Chadd wrote:
>
> Which ones of these hit the line rate comfortably?

So Adrian, I ran tests again using FreeBSD 11-RC1.
I put the iperf throughput in the result files (so that we can classify them),
as well as top -P ALL and pcm-memory.x.
iperf results: columns 3&4 a

On 12 Aug 2016, at 00:52, Adrian Chadd wrote:

Which ones of these hit the line rate comfortably?

-a

> On 11 Aug 2016, at 18:36, Adrian Chadd wrote:
>
> [snip]
>
> so the NIC is in numa-domain 1.

adrian did mean fixed-domain-rr. :-P sorry!
(Sorry, needed to update my NUMA boxes, things "changed" since I wrote this.)

-a

On 11 Aug 2016, at 18:36, Adrian Chadd wrote:

Hi!

mlx4_core0: mem
0xfbe0-0xfbef,0xfb00-0xfb7f irq 64 at device 0.0
numa-domain 1 on pci16
mlx4_core: Initializing mlx4_core: Mellanox ConnectX VPI driver v2.1.6
(Aug 11 2016)

so the NIC is in numa-domain 1. Try pinning the worker threads to
numa-domain 1 when you run the test:
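
(Whatever the exact command was meant to be here, purely as an illustration,
pinning the iperf processes onto the CPUs of domain 1 with cpuset(8) could
look like the following, assuming domain 1 covers CPUs 12-23 on this box;
check the real topology first:)

  cpuset -l 12-23 iperf -s
  cpuset -l 12-23 iperf -c <server> -t 30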

On 11 Aug 2016, at 00:11, Adrian Chadd wrote:

hi,

ok, let's start by getting the NUMA bits into the kernel so you can
mess with things.

Add this to the kernel config:

options MAXMEMDOM=8
(which hopefully is enough)
options VM_NUMA_ALLOC
options DEVICE_NUMA

Then reboot an
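
(In practice that means a custom kernel config plus a rebuild; a minimal
sketch, with MYNUMA as a made-up config name:)

  # /usr/src/sys/amd64/conf/MYNUMA
  include GENERIC
  ident   MYNUMA
  options MAXMEMDOM=8
  options VM_NUMA_ALLOC
  options DEVICE_NUMA

  # then build, install and reboot:
  cd /usr/src
  make -j8 buildkernel KERNCONF=MYNUMA
  make installkernel KERNCONF=MYNUMA
  shutdown -r now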

On 10 August 2016 at 12:50, Ben RUBSON wrote:

> On 10 Aug 2016, at 21:47, Adrian Chadd wrote:
>
> hi,
>
> yeah, I'd like you to do some further testing with NUMA. Are you able
> to run freebsd-11 or -HEAD on these boxes?

Hi Adrian,

Yes, I currently have 11 BETA3 running on them.
I could also run BETA4.

Ben

On 10 Aug 2016, at 21:47, Adrian Chadd wrote:

hi,

yeah, I'd like you to do some further testing with NUMA. Are you able
to run freebsd-11 or -HEAD on these boxes?

-adrian

On 08/04/16 23:49, Ben RUBSON wrote:

> On 04 Aug 2016, at 20:15, Ryan Stone wrote:
>
> [snip]
>
> Try this patch
> (...)

I also just tested the NODEBUG kernel but I did no

On 04 Aug 2016, at 20:15, Ryan Stone wrote:

On Thu, Aug 4, 2016 at 11:33 AM, Ben RUBSON wrote:
> But even without RSS, I should be able to go up to 2x40Gbps, don't you
> think so?
> Nobody already did this?

Try this patch, which should improve performance when multiple TCP streams
are running in parallel over an mlx4_en port:

https:/
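
(To exercise that case, the usual approach is several parallel iperf
streams; a sketch with made-up addresses and stream counts:)

  # server side
  iperf -s

  # client side: 8 parallel TCP streams for 30 seconds
  iperf -c 10.0.0.2 -P 8 -t 30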

> On 03 Aug 2016, at 20:02, Hans Petter Selasky wrote:
>
> The mlx4 send and receive queues have each their set of taskqueues. Look in
> output from "ps auxww".

I can't find them, I even unloaded/reloaded the driver in order to catch the
differences, but I did not find any relevant process.

On 03 Aug 2016, at 20:02, Hans Petter Selasky wrote:

On 08/03/16 18:57, Ben RUBSON wrote:
> taskqueue threads ?

The mlx4 send and receive queues have each their set of taskqueues. Look
in output from "ps auxww".

--HPS
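
(In case it helps: the taskqueues are kernel threads, so they only show up
when threads are listed explicitly; the grep pattern below is just a guess
at the thread names:)

  ps auxwwH | grep -i mlx
  procstat -ta | grep -i mlx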

> On 03 Aug 2016, at 04:32, Eugene Grosbein wrote:
>
> If you have gateway_enable="YES" (sysctl net.inet.ip.forwarding=1)
> then try to disable this forwarding setting and rerun your tests to compare
> results.

Thank you Eugene for this, but net.inet.ip.forwarding is disabled by default
and I

On 03 Aug 2016, at 04:32, Eugene Grosbein wrote:

On 03.08.2016 1:43, Ben RUBSON wrote:
> Hello,
> I'm trying to reach the 40Gb/s max throughput between 2 hosts running a
> ConnectX-3 Mellanox network adapter.

If you have gateway_enable="YES" (sysctl net.inet.ip.forwarding=1)
then try to disable this forwarding setting and rerun your tests to compare
results.
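
(For completeness, checking and flipping that setting looks like:)

  sysctl net.inet.ip.forwarding        # 0 = disabled, 1 = enabled
  sysctl net.inet.ip.forwarding=0      # disable at runtime
  # plus gateway_enable="NO" in /etc/rc.conf to keep it off across reboots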

On 02 Aug 2016, at 22:11, Ben RUBSON wrote:

> On 02 Aug 2016, at 21:35, Hans Petter Selasky wrote:
>
> Hi,

Thank you for your answer Hans Petter!

> The CX-3 driver doesn't bind the worker threads to specific CPU cores by
> default, so if your CPU has more than one so-called numa, you'll end up that
> the bottle-neck is the high-speed link between the CPU cores and not the
> card.
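
(A quick way to check whether a box really has more than one such
domain/package; sysctl names as on FreeBSD 11, to the best of my
recollection:)

  sysctl hw.ncpu
  sysctl vm.ndomains                 # memory domains, with the NUMA options in the kernel
  sysctl kern.sched.topology_spec    # CPU topology as XML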

On 08/02/16 20:43, Ben RUBSON wrote:

Hello,

I'm trying to reach the 40Gb/s max throughput between 2 hosts running a
ConnectX-3 Mellanox network adapter.

FreeBSD 10.3 just installed, latest updates applied.
Network adapters running the latest firmware / latest drivers.
No workload at all, just iPerf as the benchmark tool.

### Step 1:
I