Hi, There is an inconsistency in the kernel mbuf code around the "m->m_ext.ext_size" field of an M_EXT mbuf.
At first glance you might assume that this field is the total amount of contiguous space available in the external buffer pointed to by "m->m_ext.ext_buf". For example, look at the M_TRAILINGSPACE() and MCLGET() macros. But alas, you know where assumptions lead... :-) Now look at m_split(), in particular this code: if (m->m_flags & M_EXT) { n->m_flags |= M_EXT; n->m_ext = m->m_ext; if(!m->m_ext.ext_ref) mclrefcnt[mtocl(m->m_ext.ext_buf)]++; else (*(m->m_ext.ext_ref))(m->m_ext.ext_buf, m->m_ext.ext_size); m->m_ext.ext_size = 0; /* For Accounting XXXXXX danger */ n->m_data = m->m_data + len; } else { Notice the 'XXXXXX' which is where the problem lies. The code is just recklessly setting m->m_ext.ext_size to zero to avoid some "accounting" problem. This has been there since revision 1.1/rgrimes. This is comletely broken and violates the semantics of the "m->m_ext.ext_size" field implied by M_TRAILINGSPACE() et. al. Moreover, I've also done a search of every occurrence of "ext_size" and everywhere else in the kernel treats this feild with the same "external buffer size" semantics, and there is no "accounting" that is done with this field. For an example of where this could cause problems, notice that a call to sbfree() on an mbuf that had gone through an m_split() could cause a "receive 1" panic (I've seen these occasionally over the years). FYI, m_split() is used in these files: netgraph/ng_ppp.c netinet6/ah_input.c netinet6/esp_input.c netinet6/frag6.c netsmb/smb_rq.c If there are no objections, I will apply the patch below. Thanks, -Archie __________________________________________________________________________ Archie Cobbs * Packet Design * http://www.packetdesign.com --- kern/uipc_mbuf.c.orig Fri May 31 11:17:52 2002 +++ kern/uipc_mbuf.c Fri May 31 11:27:42 2002 @@ -1194,6 +1194,10 @@ * Partition an mbuf chain in two pieces, returning the tail -- * all but the first len0 bytes. In case of failure, it returns NULL and * attempts to restore the chain to its original state. + * + * Note that the returned mbuf must be treated as read-only, because + * it will end up sharing an mbuf cluster with the original mbuf if the + * "breaking point" happens to lie within a cluster mbuf. */ struct mbuf * m_split(m0, len0, wait) @@ -1247,7 +1251,6 @@ else (*(m->m_ext.ext_ref))(m->m_ext.ext_buf, m->m_ext.ext_size); - m->m_ext.ext_size = 0; /* For Accounting XXXXXX danger */ n->m_data = m->m_data + len; } else { bcopy(mtod(m, caddr_t) + len, mtod(n, caddr_t), remain); To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-net" in the body of the message