Hi,

There is an inconsistency in the kernel mbuf code around the
"m->m_ext.ext_size" field of an M_EXT mbuf.

At first glance you might assume that this field is the total amount
of contiguous space available in the external buffer pointed to by
"m->m_ext.ext_buf". For example, look at the M_TRAILINGSPACE() and
MCLGET() macros. But alas, you know where assumptions lead... :-)

Now look at m_split(), in particular this code:

    if (m->m_flags & M_EXT) {
        n->m_flags |= M_EXT;
        n->m_ext = m->m_ext;
        if(!m->m_ext.ext_ref)
            mclrefcnt[mtocl(m->m_ext.ext_buf)]++;
        else
            (*(m->m_ext.ext_ref))(m->m_ext.ext_buf,  
                                    m->m_ext.ext_size);
        m->m_ext.ext_size = 0; /* For Accounting XXXXXX danger */
        n->m_data = m->m_data + len;
    } else {

Notice the 'XXXXXX' which is where the problem lies. The code is
just recklessly setting m->m_ext.ext_size to zero to avoid some
"accounting" problem. This has been there since revision 1.1/rgrimes.

This is comletely broken and violates the semantics of the
"m->m_ext.ext_size" field implied by M_TRAILINGSPACE() et. al.

Moreover, I've also done a search of every occurrence of "ext_size"
and everywhere else in the kernel treats this feild with the same
"external buffer size" semantics, and there is no "accounting" that
is done with this field.

For an example of where this could cause problems, notice that
a call to sbfree() on an mbuf that had gone through an m_split()
could cause a "receive 1" panic (I've seen these occasionally over
the years).

FYI, m_split() is used in these files:

    netgraph/ng_ppp.c
    netinet6/ah_input.c
    netinet6/esp_input.c
    netinet6/frag6.c
    netsmb/smb_rq.c

If there are no objections, I will apply the patch below.

Thanks,
-Archie

__________________________________________________________________________
Archie Cobbs     *     Packet Design     *     http://www.packetdesign.com

--- kern/uipc_mbuf.c.orig       Fri May 31 11:17:52 2002
+++ kern/uipc_mbuf.c    Fri May 31 11:27:42 2002
@@ -1194,6 +1194,10 @@
  * Partition an mbuf chain in two pieces, returning the tail --
  * all but the first len0 bytes.  In case of failure, it returns NULL and
  * attempts to restore the chain to its original state.
+ *
+ * Note that the returned mbuf must be treated as read-only, because
+ * it will end up sharing an mbuf cluster with the original mbuf if the
+ * "breaking point" happens to lie within a cluster mbuf.
  */
 struct mbuf *
 m_split(m0, len0, wait)
@@ -1247,7 +1251,6 @@
                else
                        (*(m->m_ext.ext_ref))(m->m_ext.ext_buf,
                                                m->m_ext.ext_size);
-               m->m_ext.ext_size = 0; /* For Accounting XXXXXX danger */
                n->m_data = m->m_data + len;
        } else {
                bcopy(mtod(m, caddr_t) + len, mtod(n, caddr_t), remain);

To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-net" in the body of the message

Reply via email to