13/05/2019 14:43, Olivier Matz:
> On Wed, Apr 17, 2019 at 09:12:55AM +0100, Bruce Richardson wrote:
> > On Tue, Apr 16, 2019 at 02:32:18PM -0400, Chas Williams wrote:
> > >
> > >
> > > On 4/16/19 12:28 PM, Bruce Richardson wrote:
> > > > On Tue, Apr 16, 2019 at 04:51:26PM +0100, Ferruh Yigit wrote:
> > > > > The vlan_insert() is buggy when it tries to handle the shared mbufs,
> > > > > instead don't support inserting VLAN tag into shared mbufs and return
> > > > > an error for that case.
> > > > >
> > > > > Signed-off-by: Ferruh Yigit <ferruh.yi...@intel.com>
> > > > > ---
> > > > > Cc: Stephen Hemminger <step...@networkplumber.org>
> > > > > Cc: Chas Williams <ch...@att.com>
> > > > >
> > > > > This is another approach to RFC to fix the vlan_insert:
> > > > > https://patches.dpdk.org/patch/51870/
> > > > >
> > > > > vlan_insert() is mostly used by drivers to insert VLAN tag into packet
> > > > > data in Tx path, drivers creating new copies of mbufs in Tx path may
> > > > > result in unexpected behavior, like not freed or double freed mbufs.
> > > > > ---
> > > > > lib/librte_net/rte_ether.h | 11 ++---------
> > > > > 1 file changed, 2 insertions(+), 9 deletions(-)
> > > > >
> > > > So what is the API to be used if one does want to insert a vlan tag
> > > > into a shared mbuf?
> > >
> > > It's unlikely you would ever want to do that. Have one thread perform
> > > some operation on the mbuf and other threads would expect this to have
> > > happened? It seems counter to the way that packets might flow through an
> > > application. Typically, you would insert the vlan and then share
> > > the mbuf. Modifying a shared mbuf should make you ask, what are the
> > > other copies expecting?
> > >
> > The thing is that the reference count only indicates the number of pointers
> > to a buffer, it doesn't identify what parts are in use. So in the
> > fragmentation case, there may only be one mbuf actually referencing the
> > header part of the packet, with all other references to the memory being to
> > other parts further in. However, point taken about how the app pipeline
> > layout would probably make this issue unlikely.
>
> Yes, the difficulty here is that the condition
>   (!RTE_MBUF_DIRECT(*m) || rte_mbuf_refcnt_read(*m) > 1)
> is not an exact equivalent of "the mbuf is writable".
>
> Of course, if the mbuf is direct and refcnt is 1, the mbuf is writable.
> But we can imagine other cases where mbuf is writable. For instance, a
> PMD that receives several packets in one big mbuf (with an appropriate
> headroom for each), then creates one indirect mbuf for each packet.
>
> We probably miss an API to express that the mbuf is writable.
>
> > > > Also, why is it such a problem to create new copies of data inside the
> > > > driver if that is necessary? You create a copy and use that, freeing the
> > > > original (i.e. in all likelihood decrementing the ref-count since you no
> > > > longer use it). You already have the pointer to the mbuf pool from the
> > > > original buffer so you can get a copy from the same place. I'm curious
> > > > to know why it would be impossible to do a functionally correct
> > > > implementation?
> > >
> > > It is not an issue to do this correctly. Hemminger did submit a patch
> > > that appeared to do this correctly (I haven't tested it). As mentioned
> > > earlier the tricky part is returning the buffer to the application. If
> > > you create a copy and transmit fails, you need to free that buffer or
> > > return it to the application for it to free. If you free the buffer when
> > > making a buffer, you certainly can't return it to the application for
> > > it to be freed a second time.
> > >
> > Right. For transmit though, in most cases the only reason for failure is
> > lack of space in a transmit ring, so most NIC drivers can be sure of
> > success before cloning.
> >
> > Overall, it seems the consensus is that for real-world cases it's better to
> > have this patch than not, so I'm ok for it to go into DPDK.
>
> Agree.
>
> Acked-by: Olivier Matz <olivier.m...@6wind.com>

Applied, sorry this patch was forgotten.
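
For readers of the archive, the check debated in this thread can be illustrated with a small self-contained model. This is only a sketch under stated assumptions: `struct toy_mbuf`, `toy_mbuf_is_shared()` and `toy_vlan_insert()` are hypothetical stand-ins and not the real rte_mbuf API, and the error codes are chosen for illustration. It mirrors the quoted condition (indirect mbuf, or refcnt > 1) and refuses to touch such a buffer instead of copying it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_ETHER_ADDR_LEN 6      /* one MAC address */
#define TOY_ETHER_HDR_LEN  14     /* dst MAC + src MAC + EtherType */
#define TOY_VLAN_TAG_LEN   4      /* TPID + TCI */
#define TOY_TPID_8021Q     0x8100 /* 802.1Q tag protocol identifier */

/* Hypothetical stand-in for an mbuf; NOT the real struct rte_mbuf. */
struct toy_mbuf {
    uint8_t  buf[256];
    uint16_t data_off; /* headroom: bytes in buf before the packet data */
    uint16_t data_len; /* bytes of packet data */
    uint16_t refcnt;   /* number of references to this buffer */
    bool     direct;   /* false models an indirect (attached) mbuf */
};

/* Models the condition quoted in the thread:
 * !RTE_MBUF_DIRECT(*m) || rte_mbuf_refcnt_read(*m) > 1
 * As noted in the discussion, this is conservative: it may reject
 * buffers that would in fact be safe to write. */
static bool toy_mbuf_is_shared(const struct toy_mbuf *m)
{
    return !m->direct || m->refcnt > 1;
}

/* Insert an 802.1Q tag in place, or fail without modifying the buffer. */
static int toy_vlan_insert(struct toy_mbuf *m, uint16_t vlan_tci)
{
    assert(m != NULL);

    if (toy_mbuf_is_shared(m))
        return -EINVAL; /* shared: leave it to the caller to copy */
    if (m->data_len < TOY_ETHER_HDR_LEN)
        return -EINVAL; /* too short to hold an Ethernet header */
    if (m->data_off < TOY_VLAN_TAG_LEN)
        return -ENOSPC; /* not enough headroom for the tag */

    uint8_t *oh = m->buf + m->data_off; /* old header start */
    uint8_t *nh = oh - TOY_VLAN_TAG_LEN; /* new header start */

    /* Move both MAC addresses into the headroom; the original
     * EtherType stays where it is and now follows the tag. */
    memmove(nh, oh, 2 * TOY_ETHER_ADDR_LEN);
    nh[12] = (uint8_t)(TOY_TPID_8021Q >> 8);
    nh[13] = (uint8_t)(TOY_TPID_8021Q & 0xff);
    nh[14] = (uint8_t)(vlan_tci >> 8);
    nh[15] = (uint8_t)(vlan_tci & 0xff);

    m->data_off -= TOY_VLAN_TAG_LEN;
    m->data_len += TOY_VLAN_TAG_LEN;
    return 0;
}
```

Because the function never allocates or frees anything, the caller still owns exactly the buffer it passed in when an error is returned. That sidesteps the "not freed or double freed" concern raised above, at the cost of rejecting some buffers that are actually writable.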