On 12/8/11 1:29 PM, Luigi Rizzo wrote:
On Thu, Dec 08, 2011 at 12:49:18PM -0500, John Baldwin wrote:
On 12/5/11 2:31 PM, Luigi Rizzo wrote:
On Mon, Dec 05, 2011 at 07:38:54PM +0100, Marius Strobl wrote:
On Mon, Dec 05, 2011 at 03:33:14PM +0000, Luigi Rizzo wrote:
...
+#ifdef DEV_NETMAP
+               if (slot) {
+                       int si = i + na->tx_rings[txr->me].nkr_hwofs;
+                       void *addr;
+
+                       if (si >= na->num_tx_desc)
+                               si -= na->num_tx_desc;
+                       addr = NMB(slot + si);
+                       txr->tx_base[i].buffer_addr =
+                           htole64(vtophys(addr));
+                       /* reload the map for netmap mode */
+                       netmap_load_map(txr->txtag,
+                           txbuf->map, addr, na->buff_size);
+               }
+#endif /* DEV_NETMAP */

Can these vtophys(9) usages be fixed to use bus_dma(9) instead, so
netmap works with bounce buffers, IOMMUs, etc.?
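
For illustration, the vtophys() in that hunk could become a
bus_dmamap_load() whose callback hands back the bus address. A
minimal sketch, assuming a hypothetical netmap_dmamap_cb() helper
and omitting error handling:

        /* Callback: record the single segment's bus address so the
         * caller can program it into the descriptor. */
        static void
        netmap_dmamap_cb(void *arg, bus_dma_segment_t *segs, int nseg,
            int error)
        {
                if (error == 0 && nseg == 1)
                        *(bus_addr_t *)arg = segs[0].ds_addr;
        }

        ...
        bus_addr_t baddr = 0;

        /* Let bus_dma translate (and bounce or IOMMU-map if the
         * platform requires it) instead of calling vtophys(). */
        bus_dmamap_load(txr->txtag, txbuf->map, addr, na->buff_size,
            netmap_dmamap_cb, &baddr, BUS_DMA_NOWAIT);
        txr->tx_base[i].buffer_addr = htole64(baddr);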

Maybe. Can you suggest how to change it?

Consider that (not here, but in other places) vtophys() is called
in a time-critical loop, so performance matters a lot. As long as I
can compute the physical address in advance and cache it in my own
array, I suppose that should be fine (in which case the calls to
vtophys(addr) would become NMPB(slot + si), where the NMPB() macro
would hide the translations and checks).
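
A minimal sketch of that caching idea (the array, its size constant,
and NMPB() are illustrative names, not existing netmap code):

        /* Filled once at setup time, one bus address per buffer. */
        static bus_addr_t netmap_buf_paddr[NETMAP_TOTAL_BUFFERS];

        /* Fast path: a plain array read, no translation at all. */
        #define NMPB(i)         (netmap_buf_paddr[(i)])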

For your use case, you probably don't want to be coping with bounce
buffers at all.  That is, if you are preallocating long-lived buffers
that keep getting reused while netmap is active that are allocated at
startup and free'd at teardown, you probably want to allocate buffers
that won't require bounce buffers.  That means you have to let the
drivers allocate the buffers (or give you a suitable bus_dma tag since
different devices have different addressing requirements, etc.).  You
could then use bus_dmamem_alloc() to allocate your buffers.
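
Concretely, that might look roughly like the following sketch; the
tag parameters are typical of a 64-bit capable NIC rather than taken
from any particular driver, and NETMAP_BUF_SIZE stands in for
whatever buffer size netmap settles on:

        bus_dma_tag_t   buf_tag;
        bus_dmamap_t    buf_map;
        void            *buf;
        int             error;

        /* A child of the device's tag inherits the device's
         * addressing restrictions. */
        error = bus_dma_tag_create(bus_get_dma_tag(dev),
            64, 0,                      /* alignment, boundary */
            BUS_SPACE_MAXADDR,          /* lowaddr */
            BUS_SPACE_MAXADDR,          /* highaddr */
            NULL, NULL,                 /* filter, filterarg */
            NETMAP_BUF_SIZE,            /* maxsize */
            1,                          /* nsegments */
            NETMAP_BUF_SIZE,            /* maxsegsz */
            0, NULL, NULL, &buf_tag);
        if (error == 0)
                error = bus_dmamem_alloc(buf_tag, &buf,
                    BUS_DMA_NOWAIT | BUS_DMA_ZERO, &buf_map);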

Certainly I don't want to use netmap with bounce buffers.
I am not sure about the IOMMU (I basically don't need it, but
using a compatible API is always nice). Right now I am allocating
a huge chunk of memory with contigmalloc().
Ryan Stone suggested that a plain malloc() may work as well (as long
as I make sure that each buffer is within a single page).
Eventually I may want to play with cache alignment (also suggested by
Ryan), and so allocate smaller chunks of contigmalloc'ed memory (say
each buffer is 2K - 64 bytes; then a contiguous 64K block fits
exactly 33 buffers).
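
(For the arithmetic: 2048 - 64 = 1984 bytes per buffer, and
33 * 1984 = 65472, which fits in 64KB with 64 bytes to spare, so
successive buffers start at different cache-line offsets.) A sketch
of carving up such a chunk, using just the sizes from the example:

        #define NM_CHUNK_SIZE   (64 * 1024)
        #define NM_BUF_SIZE     (2048 - 64)
        #define NM_BUFS_PER_CHUNK (NM_CHUNK_SIZE / NM_BUF_SIZE) /* 33 */

        /* One physically contiguous, page-aligned chunk. */
        char *base = contigmalloc(NM_CHUNK_SIZE, M_DEVBUF, M_WAITOK,
            0, ~(vm_paddr_t)0, PAGE_SIZE, 0);
        /* The i-th buffer; each one starts 64 bytes earlier within
         * its 2K "slot", spreading the starting cache lines. */
        char *buf = base + i * NM_BUF_SIZE;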

Right, you should use bus_dmamem_alloc() instead. Internally it calls
contigmalloc(), but it will only allocate memory that your device can
safely use (so if you are using a NIC that can only handle 32-bit DMA
addresses, it will allocate pages below 4GB, avoiding bounce
buffering). For the IOMMU case it will usually allocate memory from
the region that is statically mapped into the IOMMU (at least some
IOMMUs split the DVMA address space into two types: a mostly static
region for things like descriptor rings, and a dynamic region for
transient I/O buffers like mbufs).

If you want to design an interface that works with a wide variety of
hardware, and not just ixgbe on Intel, then using bus_dma to manage
DMA buffers is the correct approach.
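
For the 32-bit NIC example, the only change is the lowaddr argument
when the tag is created; a sketch, reusing the hypothetical names
from the earlier fragment:

        /* A lowaddr of BUS_SPACE_MAXADDR_32BIT tells bus_dma that
         * the device cannot address memory above 4GB, so
         * bus_dmamem_alloc() will only hand back pages below it. */
        error = bus_dma_tag_create(bus_get_dma_tag(dev), 64, 0,
            BUS_SPACE_MAXADDR_32BIT,    /* lowaddr */
            BUS_SPACE_MAXADDR,          /* highaddr */
            NULL, NULL, NETMAP_BUF_SIZE, 1, NETMAP_BUF_SIZE,
            0, NULL, NULL, &buf_tag);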

Also, back to the IOMMU: if the device is doing DMA into this buffer,
then you _must_ use the IOMMU on certain platforms (e.g. sparc64, and
probably some embedded platforms where netmap would be very nice to
have).

--
John Baldwin