On Thu, Mar 31, 2016 at 9:06 PM, Jesse Gross <je...@kernel.org> wrote:
> On Thu, Mar 31, 2016 at 2:30 PM, Pravin B Shelar <pshe...@ovn.org> wrote:
>> In the STT implementation we saw performance improvements from
>> linearizing the skb in the SLUB case, so the following patch skips the
>> zero-copy operation for that case.
>>
>> Tested-By: Vasmi Abidi <vab...@vmware.com>
>> Signed-off-by: Pravin B Shelar <pshe...@ovn.org>
>
> Can you add some performance numbers to the commit message?
>
ok.
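For illustration, a minimal sketch of the idea in the commit message (the
CONFIG_SLUB guard and the helper name are assumptions here, not the patch's
actual code): with SLUB, copying the payload once into a linear buffer was
measured to be faster than stitching received segments together with
zero-copy frag lists, so the copy path can be chosen at build time.

#include <linux/skbuff.h>

#ifdef CONFIG_SLUB
#define SKIP_ZERO_COPY
#endif

static int flatten_if_needed(struct sk_buff *skb)
{
#ifdef SKIP_ZERO_COPY
        /* Copy path: pull frag_list and paged data into the linear area. */
        return skb_shinfo(skb)->frag_list ? __skb_linearize(skb) : 0;
#else
        /* Zero-copy path: keep segments chained on skb_shinfo()->frag_list. */
        return 0;
#endif
}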
>> diff --git a/datapath/linux/compat/stt.c b/datapath/linux/compat/stt.c
>> index eb397e8..ae33d5e 100644
>> --- a/datapath/linux/compat/stt.c
>> +++ b/datapath/linux/compat/stt.c
>>  static int coalesce_skb(struct sk_buff **headp)
>>  {
>> -        struct sk_buff *frag, *head, *prev;
>> +        struct sk_buff *frag, *head;
>> +#ifndef SKIP_ZERO_COPY
>> +        struct sk_buff *prev;
>> +#endif
>>          int err;
>>
>>          err = straighten_frag_list(headp);
>
> I don't think that we need to straighten the frag list in the
> SKIP_ZERO_COPY case. It's basically just undoing what the for loop
> that forms the frag list will do. __skb_linearize() will be able to
> handle it in any case.
>
I can skip straightening and work on an skb with a frag list, but I do not
want to complicate the code. This case is pretty rare after the change in
reassemble() where the complete skb is allocated on the first segment.

>> +#ifdef SKIP_ZERO_COPY
>> +        if (skb_shinfo(head)->frag_list) {
>> +                err = __skb_linearize(head);
>> +                return err;
>> +        }
>> +#endif
>
> Maybe move this to try_to_segment()? It seems like it is a little more
> consistent.
>
But it is not the same. I think we only need to linearize if there is a
frag list; AFAIK non-linear data in skb_shinfo() can be safely handled by
skb_segment().

>>  static int segment_skb(struct sk_buff **headp, bool csum_partial,
>>                         bool ipv4, bool tcp, int l4_offset)
>>  {
>> +#ifndef SKIP_ZERO_COPY
>>          int err;
>>
>>          err = coalesce_skb(headp);
>> @@ -543,6 +575,7 @@ static int segment_skb(struct sk_buff **headp, bool csum_partial,
>>          if (skb_shinfo(*headp)->frag_list)
>>                  return __try_to_segment(*headp, csum_partial,
>>                                          ipv4, tcp, l4_offset);
>> +#endif
>>          return 0;
>>  }
>
> Is this OK? It will retain a skb with a frag_list on the transmit path
> where this wasn't possible before. This used to cause kernel panics
> since the skb geometry wasn't even when the packet went to be
> segmented. I don't remember if this is still the case, but if not then
> it seems like we should be able to simplify the code regardless of this
> patch.
>
Right, I missed it. I will keep the segmentation here.

>> @@ -1093,6 +1141,15 @@ static struct sk_buff *reassemble(struct sk_buff *skb)
>>
>>          frag = lookup_frag(dev_net(skb->dev), stt_percpu, &key, hash);
>>          if (!frag->skbs) {
>> +                int err;
>> +
>> +                err = pskb_expand_head(skb, skb_headroom(skb),
>> +                                       skb->data_len + tot_len, GFP_ATOMIC);
>> +                if (likely(!err)) {
>> +                        if (unlikely(!__pskb_pull_tail(skb, skb->data_len)))
>> +                                BUG();
>> +                }
>
> This linearizes the packet even in non-SKIP_ZERO_COPY cases. I guess
> we probably don't want to do that. It's also possible that the first
> skb received isn't necessarily the first packet in the reassembled
> packet.
>
OK, I will fix it.

> This is effectively optimizing for the case where all packets are
> large and will generate a frag list after reassembly. However, it's
> possible in many situations that packets can be reassembled with zero
> copy and do not later need to be split (maybe the total packet is
> around 20k). Do you know how the performance compares vs. just
> linearizing at the end in that case? Is there a benefit to doing an
> early copy?
>
We are not trying to coalesce skbs in the SKIP_ZERO_COPY case, so any
partial skb segment will generate multiple inner segments. I have seen
better performance with this approach than with linearizing at the end.
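As a reference for the reassemble() hunk above, a minimal sketch of the
early copy (the helper name and error handling are illustrative
assumptions): on the first received fragment the head is grown so that the
whole reassembled packet, tot_len bytes from the STT header, fits linearly;
later fragments can then be copied into the tail instead of being chained
on frag_list.

#include <linux/gfp.h>
#include <linux/skbuff.h>

static int stt_reserve_linear_room(struct sk_buff *skb, unsigned int tot_len)
{
        int err;

        /* Grow tail room so the complete reassembled packet fits linearly. */
        err = pskb_expand_head(skb, skb_headroom(skb),
                               skb->data_len + tot_len, GFP_ATOMIC);
        if (unlikely(err))
                return err;

        /* Pull any paged data into the now-larger linear area. */
        if (unlikely(!__pskb_pull_tail(skb, skb->data_len)))
                return -ENOMEM;

        return 0;
}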
>> @@ -1154,8 +1216,10 @@ static struct sk_buff *reassemble(struct sk_buff *skb)
>>
>>          FRAG_CB(frag->skbs)->first.set_ecn_ce |= INET_ECN_is_ce(iph->tos);
>>          FRAG_CB(frag->skbs)->first.rcvd_len += skb->len;
>> -        FRAG_CB(frag->skbs)->first.mem_used += skb->truesize;
>> -        stt_percpu->frag_mem_used += skb->truesize;
>> +        if (!copied_skb) {
>> +                FRAG_CB(frag->skbs)->first.mem_used += skb->truesize;
>> +                stt_percpu->frag_mem_used += skb->truesize;
>> +        }
>
> pskb_expand_head() doesn't actually change the truesize so our count
> will be more inaccurate than before. However, we don't have to worry
> about the attack case of very small packets consuming a large amount
> of memory due to having many copies of struct sk_buff.
>
OK, I will fix the truesize accounting.
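On the accounting point, one possible sketch (an assumption, not the posted
fix): since pskb_expand_head() in this kernel leaves skb->truesize
untouched, the caller can charge the extra tail room explicitly after a
successful expansion, so the existing mem_used/frag_mem_used updates can
stay unconditional for the copied skb; allocator rounding is ignored here.

#include <linux/skbuff.h>

static void stt_charge_expansion(struct sk_buff *skb, unsigned int extra)
{
        /* "extra" is the ntail value that was passed to pskb_expand_head(). */
        skb->truesize += extra;
}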