[Bcc to -current as relevant there]

As a followup to my question below, i did a simple experiment
on a -current box acting as a router, using two "em" cards and
DEVICE_POLLING (just for stability).

Essentially all of the code is shown below -- the receiving
side tries to grab an mbuf+cluster from the free pool first,
and falls back to the standard method on failure; the sending
side attaches the freed buffer to the free pool when the
buffer is eligible.

This is simpler than what a general m_freem()/mgethdr_cluster()
pair would need, because the driver is already holding a lock
(Giant, in this case), so no further locking is required.
Now, to the data points:

                                CURRENT         STABLE (*)

fastfwd=1, pool_max=0           276kpps         365 kpps
fastfwd=1, pool_max=50          355kpps         383 kpps

fastfwd=0, pool_max=0           195 kpps        142 kpps
fastfwd=0, pool_max=50          227 kpps        146 kpps

(*) This version of STABLE that I am using for comparison has some
proprietary optimizations which make it a bit faster than normal.
However it still uses the old ipfw code, which is invoked when
fastfwd=0, and is significantly slower than the new one.

Now this really seems to call for adding this interface to the
mbuf subsystem. We need to pick a name for the allocator (the
deallocator might well fold into m_freem(), depending on how we
implement the locking), and decide whether it makes sense to
protect mgethdr_cluster() with per-cpu locks or with Giant, or
even to require the caller to hold the proper lock before
invoking the procedure (as i expect the "producers" and
"consumers" of these pairs to live in the network stack,
chances are that they already hold Giant).
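To make the proposal concrete, here is a hypothetical sketch of what an
mgethdr_cluster()/pool-aware-free pair might look like, with the locking
left to the caller as suggested above. The struct mbuf here is a mock
(m_pkthdr_len and ext_buf stand in for m_pkthdr.len and m_ext.ext_buf)
so the sketch compiles standalone; the real code would live in
uipc_mbuf.c and use the actual MGETHDR/MCLGET fallback:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MCLBYTES 2048

/* Mock mbuf: just the fields the pool logic touches. */
struct mbuf {
	struct mbuf *m_nextpkt;		/* link in the free pool */
	char	*m_data;		/* current data pointer */
	int	m_len;			/* data length in this mbuf */
	int	m_pkthdr_len;		/* stand-in for m_pkthdr.len */
	char	*ext_buf;		/* stand-in for m_ext.ext_buf */
};

static struct mbuf *m_pool;		/* singly-linked free pool */
static int m_pool_now;			/* current pool size */
static int m_pool_max = 50;		/* tunable, like em_pool_max */

/*
 * Fast allocator: grab an mbuf+cluster from the pool if possible,
 * else fall back to the slow path. Caller must hold the proper lock.
 */
struct mbuf *
mgethdr_cluster(void)
{
	struct mbuf *m;

	if (m_pool != NULL) {
		m = m_pool;
		m_pool = m->m_nextpkt;
		m_pool_now--;
	} else {
		/* Fallback: the MGETHDR+MCLGET path in the real kernel. */
		m = malloc(sizeof(*m));
		if (m == NULL)
			return (NULL);
		m->ext_buf = malloc(MCLBYTES);
		if (m->ext_buf == NULL) {
			free(m);
			return (NULL);
		}
	}
	/* Only m_data, m_len and m_pkthdr.len need resetting on reuse. */
	m->m_nextpkt = NULL;
	m->m_data = m->ext_buf;
	m->m_len = m->m_pkthdr_len = MCLBYTES;
	return (m);
}

/* Pool-aware free: park the buffer if the pool has room. */
void
m_free_to_pool(struct mbuf *m)
{
	if (m_pool_now < m_pool_max) {
		m->m_nextpkt = m_pool;
		m_pool = m;
		m_pool_now++;
	} else {
		free(m->ext_buf);
		free(m);
	}
}
```

The names and the malloc() fallback are mine, for illustration only; the
point is that the reuse path touches three fields and one pointer swap.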


        cheers
        luigi


The code:

        struct mbuf *em_pool;

        static int      em_pool_max = 50;
        SYSCTL_INT(_hw, OID_AUTO, em_pool_max, CTLFLAG_RW,
                &em_pool_max,0,"max size of mbuf pool");
        static int em_pool_now;
        SYSCTL_INT(_hw, OID_AUTO, em_pool_now, CTLFLAG_RD,
                &em_pool_now,0,"Current size of mbuf pool");

        ... in em_get_buf() ...

                if (em_pool) {
                        mp = em_pool;
                        em_pool = mp->m_nextpkt;
                        em_pool_now--;
                        goto have_it;
                }

        ... in em_clean_transmit_interrupts() ...
                if ((m = tx_buffer->m_head)) {
                        if (em_pool_now < em_pool_max &&
                            m->m_next == NULL &&
                            m->m_flags & M_EXT &&
                            M_WRITABLE(m) ) {
                                m->m_nextpkt = em_pool;
                                em_pool = m;
                                em_pool_now++;
                        } else
                                m_freem(m);
                        tx_buffer->m_head = NULL;
                }
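The guard in em_clean_transmit_interrupts() above only pools a buffer
that is a single, cluster-backed, writable mbuf -- i.e. nobody else
still references the cluster. The following is a mock illustration of
that eligibility check (mock_mbuf, ext_refcnt and mock_writable() are
invented stand-ins for the real mbuf fields and M_WRITABLE()), not the
actual kernel API:

```c
#include <assert.h>
#include <stddef.h>

#define M_EXT	0x0001		/* a cluster is attached */

struct mock_mbuf {
	struct mock_mbuf *m_next;	/* next mbuf in chain */
	int	m_flags;		/* M_EXT if cluster-backed */
	int	ext_refcnt;		/* stand-in for cluster refcount */
};

/* Stand-in for M_WRITABLE(): true when we are the sole referent. */
static int
mock_writable(const struct mock_mbuf *m)
{
	return (m->ext_refcnt == 1);
}

/* Decide whether a freed mbuf may be parked in the pool. */
static int
pool_eligible(const struct mock_mbuf *m, int pool_now, int pool_max)
{
	return (pool_now < pool_max &&
	    m->m_next == NULL &&
	    (m->m_flags & M_EXT) &&
	    mock_writable(m));
}
```

A shared or chained buffer fails the test and goes through the normal
m_freem() path instead, so pooling never hands out a cluster that some
other consumer could still be reading.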


On Sat, Jun 29, 2002 at 02:53:03PM -0700, Luigi Rizzo wrote:
> Hi,
> during some experiments i was doing recently, i noticed that there
> is a significant improvement in the forwarding speed (especially
> at very high speeds) if we keep a small pool of mbuf+cluster
> ready for use. This is because most network drivers do something
> like this
> 
>                 MGETHDR(m_new, M_DONTWAIT, MT_DATA);
>                 if (m_new == NULL)
>                         return(ENOBUFS);
> 
>                 MCLGET(m_new, M_DONTWAIT);
>                 if (!(m_new->m_flags & M_EXT)) {
>                         m_freem(m_new);
>                         return(ENOBUFS);
>                 }
> 
> when replenishing the receive buffers, and both macros are quite
> long even if there are available blocks in the free lists. We can
> store buffers of this form when/if they are released with some code
> like this:
> 
>       if (my_pool_count < my_pool_max && m->m_next == NULL &&
>                       (m->m_flags & M_EXT) && M_EXT_WRITABLE(m) ) {
>               m->m_nextpkt = my_pool;
>               m->m_data = m->m_ext.ext_buf;
>               m->m_len = m->m_pkthdr.len = MCLBYTES;
>               my_pool = m;
>               my_pool_now++;
>       } else {
>               ... rest of m_freem() ...
>       }
> 
> and save a lot of overhead (we just need to reset m_data and
> m_len and m_pkthdr.len) when someone wants to allocate them.
> 
> Is there interest in committing some code like this to
> mbuf.h and maybe uipc_mbuf*.c ?
> 
>       cheers
>       luigi
> 
