On 12.03.2013 16:00, Gleb Smirnoff wrote:
   Andre,

On Tue, Mar 12, 2013 at 03:23:16PM +0100, Andre Oppermann wrote:
A> On 12.03.2013 13:19, Gleb Smirnoff wrote:
A> > Author: glebius
A> > Date: Tue Mar 12 12:19:23 2013
A> > New Revision: 248196
A> > URL: http://svnweb.freebsd.org/changeset/base/248196
A> >
A> > Log:
A> >    Use m_get2() to get mbuf of appropriate size.
A>
A> The problem with m_get2() is that it will attempt to use jumbo mbufs
A> larger than PAGE_SIZE if the request size is sufficiently large.
A>
A> In light of the recent issues with jumbo > PAGE_SIZE allocations on
A> loaded systems I don't think it is appropriate to extend this practice
A> further.
A>
A> Anything that needs more than PAGE_SIZE (where PAGE_SIZE == 4K) mbuf
A> space should allocate an mbuf chain with m_getm2().  That is what we
A> have mbuf chains for.
A>
A> Sufficient availability of mbufs > PAGE_SIZE cannot be guaranteed after
A> some uptime and/or a loaded system due to memory fragmentation.  Allocation
A> can fail non-deterministically for mbufs > PAGE_SIZE long before smaller
A> mbufs become unavailable due to overall (kernel) memory exhaustion.
A>
A> Mbufs > PAGE_SIZE are not only contiguous in kernel address space but
A> also in real memory address space.  They are intended jumbo frames on
A> poor NIC's without scatter-capable DMA engines.
A>
A> When you put the m_get2() function proposal for review and comments I
A> already highlighted these issues and put forward serious concerns about
A> this.
A>
A> Please change m_get2() to limit itself to mbuf allocations <= PAGE_SIZE.

I already got similar patch in my queue. We have a lot of code that could
benefit from using both m_get2() and m_getm2() that are limited to not
use jumbos at all.

Indeed.  That's why I applaud your work in converting them.

If you are concerned about using jumbos that are > PAGE_SIZE, then I can
extend API in my patch.  ... done.

Patch attached.

The NFS code itself guarantees that it won't request > than MCLBYTES,
so using bare m_get2() here is safe. I can add flag there later for
clarity.

Using PAGE_SIZE clusters is perfectly fine and no flag to prevent that
is necessary.  In fact we're doing it for years on socket writes without
complaints (through m_getm2()).

However I think that m_get2() should never ever even try to attempt to
allocate mbuf clusters larger than PAGE_SIZE.  Not even with flags.

All mbufs > PAGE_SIZE should be exclusively and only ever be used by drivers
for NIC's with "challenged" DMA engines.  Possibly only available through a
dedicated API to prevent all other uses of it.

--
Andre

_______________________________________________
svn-src-head@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"

Reply via email to