Hi Vincenzo,

I am using poll(), and I am not specifying NETMAP_NO_TX_POLL, and have found 
that sometimes frames are sent only when the TX buffer is full, and sometimes 
they are not sent at all. They are never sent on every invocation of poll(), 
as I would expect. If I run ioctl(NIOCTXSYNC) manually, everything works 
correctly. I assume I have simply missed something in my nmreq.

I don't think you have missed anything within nmreq.  I see that you are waiting for 
POLLIN only (and this is right in your router case), so poll() will actually invoke 
txsync on interface #i only when netmap intercepts an RX or TX interrupt on interface #i. 
This means that packets may stall for a long time in the TX rings if you don't call 
ioctl(TXSYNC). The manual is not wrong, however. You can look at the apps/bridge/bridge.c 
example to understand where this "poll automatically calls txsync" thing is 
useful.
Thank you for the clarification. I have now altered my code to call TXSYNC 
after each iteration, but only if I have modified the TX ring for that 
interface. This seems to work perfectly. The patch can be seen at 
https://github.com/catphish/netmap-router/commit/2961ab16f14a8b2a2561c9d73f73857e523cc177
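
In case it helps anyone following the thread, here is a rough sketch of that
pattern (simplified, not the actual patch; the rings and fd are assumed to be
already set up via NIOCREGIF with a single shared mmap): forward a batch from
an RX ring to a TX ring, drop frames when the TX ring is full, and issue
NIOCTXSYNC only if the TX ring was actually modified.

#include <stdint.h>
#include <sys/ioctl.h>
#include <net/netmap.h>
#include <net/netmap_user.h>

/* Sketch only: rxring and txring must come from the same netmap memory
 * region (one mmap) for the buffer-index swap to be valid; tx_fd is the
 * descriptor that owns the TX ring. */
static void
forward_batch(int tx_fd, struct netmap_ring *rxring, struct netmap_ring *txring)
{
    int queued = 0;

    while (!nm_ring_empty(rxring)) {
        struct netmap_slot *rs = &rxring->slot[rxring->cur];

        if (nm_ring_space(txring) > 0) {
            struct netmap_slot *ts = &txring->slot[txring->cur];
            uint32_t idx = ts->buf_idx;

            /* zero-copy forward: swap buffer indices instead of memcpy */
            ts->buf_idx = rs->buf_idx;
            rs->buf_idx = idx;
            ts->len = rs->len;
            ts->flags |= NS_BUF_CHANGED;
            rs->flags |= NS_BUF_CHANGED;
            txring->head = txring->cur = nm_ring_next(txring, txring->cur);
            queued++;
        }
        /* else: TX ring is full, the frame is simply dropped */

        rxring->head = rxring->cur = nm_ring_next(rxring, rxring->cur);
    }

    /* Pay for the syscall only when something was queued on this ring. */
    if (queued)
        ioctl(tx_fd, NIOCTXSYNC, NULL);
}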



You also mentioned: "whether netmap calls or does not call txsync/rxsync on certain 
rings depends on the parameters passed to nm_open()". I do not use the nm_open 
helper function, but I am extremely interested to know which parameters would affect 
this behaviour, as this seems very relevant to my problem.

Yes, we do not normally use the low-level interface (ioctl(REGIF)), because it's just 
simpler to use the nm_open() interface. Within the first parameter of nm_open() you can 
specify to open just one RX/TX ring pair, e.g. with "enp1f0s1-3". Then you 
usually want to mmap() just once (as you do in your program); with nm_open(), you do that 
with the NM_OPEN_NO_MMAP flag.
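
Roughly, that usage looks like the sketch below (interface names here are just
placeholders, not taken from your setup): open a single hardware ring pair via
the "-3" name suffix, and let the second descriptor reuse the first one's
mmap() region by passing NM_OPEN_NO_MMAP.

#define NETMAP_WITH_LIBS
#include <net/netmap_user.h>
#include <stdio.h>

int
main(void)
{
    /* Bind only hardware ring pair 3 of the first port: the "-3" suffix
     * in the port name selects a single TX/RX ring couple. */
    struct nm_desc *a = nm_open("netmap:enp1s0f0-3", NULL, 0, NULL);
    if (a == NULL) {
        fprintf(stderr, "nm_open enp1s0f0 failed\n");
        return 1;
    }

    /* Second port: NM_OPEN_NO_MMAP reuses the first descriptor's mmap()
     * region instead of mapping the shared netmap memory a second time. */
    struct nm_desc *b = nm_open("netmap:enp1s0f1-3", NULL, NM_OPEN_NO_MMAP, a);
    if (b == NULL) {
        fprintf(stderr, "nm_open enp1s0f1 failed\n");
        return 1;
    }

    /* ... forwarding loop would go here ... */

    nm_close(b);
    nm_close(a);
    return 0;
}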
I did look at nm_open, and even read the source of nm_open to discover how to 
implement the shared memory, but (for no good reason) I preferred to set up the 
interface manually.

If you are interested, or if it helps explain my question, my full source 
(hopefully well commented, though the router itself is far from complete) can 
be found here: 
https://github.com/catphish/netmap-router/blob/58a9b957c19b0a012088c491bd58bc3161a56ff1/router.c

Specifically, if the ioctl call at line 92 is removed, the code does not work 
(packets are either not transmitted at all, or are only transmitted when the 
buffer is full; which of the two behaviours occurs seems to be random). 
However, I would expect it to work, because I do not specify 
NETMAP_NO_TX_POLL, and I would therefore hope that the poll() call on line 80 
would have the same effect.
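
For reference, the low-level setup in question looks roughly like the sketch
below (a simplified sketch, not the actual router.c; the interface name, ring
handling and error handling are placeholders): register via NIOCREGIF without
NETMAP_NO_TX_POLL, poll() for POLLIN, and still call NIOCTXSYNC explicitly
after queuing frames.

#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <net/netmap.h>
#include <net/netmap_user.h>

/* Register one interface through the low-level API and return its fd;
 * *nifp_out points into the shared netmap memory region. */
static int
open_port(const char *name, struct netmap_if **nifp_out)
{
    struct nmreq req;
    int fd = open("/dev/netmap", O_RDWR);

    if (fd < 0)
        return -1;
    memset(&req, 0, sizeof(req));
    strncpy(req.nr_name, name, sizeof(req.nr_name) - 1);
    req.nr_version = NETMAP_API;
    req.nr_flags = NR_REG_ALL_NIC;          /* all hardware rings */
    /* nr_ringid deliberately does NOT carry NETMAP_NO_TX_POLL */
    if (ioctl(fd, NIOCREGIF, &req) < 0)
        return -1;

    void *mem = mmap(NULL, req.nr_memsize, PROT_READ | PROT_WRITE,
                     MAP_SHARED, fd, 0);
    if (mem == MAP_FAILED)
        return -1;
    *nifp_out = NETMAP_IF(mem, req.nr_offset);
    return fd;
}

int
main(void)
{
    struct netmap_if *nifp;
    int fd = open_port("em0", &nifp);       /* placeholder interface name */

    if (fd < 0) {
        perror("open_port");
        return 1;
    }

    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    for (;;) {
        poll(&pfd, 1, 1000);                /* wait for received packets */

        /* ... examine NETMAP_RXRING(nifp, i), queue frames on
         *     NETMAP_TXRING(nifp, i) of the output port ... */

        /* Without this, queued frames may sit in the TX ring until the
         * next interrupt happens to trigger a txsync. */
        ioctl(fd, NIOCTXSYNC, NULL);
    }
}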

Yes, that depends on when netmap_poll() is called by the kernel, which in turn 
depends on when something is ready to be received on the file descriptor.
Looking at your program, I think you need to call ioctl(TXSYNC), at least 
because you don't want to introduce artificial/unbounded latency. However, 
since these calls are expensive, you could use them only when necessary (e.g. 
when nm_ring_space(txring) == 0 or when you actually forwarded some packets 
on txring).
Per the patch above, I now call TXSYNC on an interface only after pushing a 
batch of packets to it, and this seems to work perfectly, with a good 
balance between performance and latency. If nm_ring_space(txring) == 0, I just 
drop frames until the next batch. I don't call TXSYNC part way through a 
batch; it hasn't yet seemed necessary, but I may need to look into this later.

I'm running this on a 6-core 2.8GHz Xeon with a 4-port i350-T4 NIC. I thought 
I'd just post some stats of the performance I observe using my code (excluding 
the routing table lookup as this isn't relevant to netmap). Not really looking 
for any advice here, just thought I'd share my results.

All examples are with 1.488Mpps (1 x 1Gbps) input and no packet loss observed:
1 thread  - CPU usage = 100%, batch size = 4
2 threads - CPU usage = 54% (27% x 2), batch size = 12
4 threads - CPU usage = 98% (25% x 4), batch size = 8
6 threads - CPU usage = 124% (21% x 6), batch size = 8

And again with 2.976Mpps (2 x 1Gbps) input and no packet loss observed:
1 thread  - CPU usage = 100%, batch size = 12
2 threads - CPU usage = 68% (34% x 2), batch size = 21
4 threads - CPU usage = 100% (25% x 4), batch size = 17
6 threads - CPU usage = 105% (18% x 6), batch size = 16

These results seem excellent and demonstrate that netmap is scaling as expected 
with both threads and packet volume. The higher thread count will be more 
beneficial when I am doing more processing on each packet.


I hope this all makes sense, and again, I hope I have simply missed something 
from the nmreq I pass to NIOCREGIF.

It is worth mentioning that with the exception of this problem / confusion, I 
am getting extremely good results from this code and netmap in general.

That's nice to hear :)
Your program looks simple enough that we could even add it to the examples (as 
an example of routing logic).
I'd be very happy to contribute to the documentation in any way that may be 
helpful. I have added a permissive licence to my GitHub repository just in case 
my code is of use to anyone else. It is currently somewhat incomplete as an 
IPv4 router, as it doesn't update MAC addresses on frames before forwarding 
them, and because the interface names are hardcoded, but when it's more 
complete I'd be very happy for it to be contributed to the examples. Of course 
anyone is free to use my code for any purpose too.

Thanks for all your assistance! I'm happy enough with this that I will move on 
to looking at my IP routing code.

Charlie



Charlie Smurthwaite
Technical Director

tel. email. charlie@atech.media web. https://atech.media
