On Wed, Apr 22, 2015 at 7:48 PM, Jesse Gross <je...@nicira.com> wrote: > On Wed, Apr 22, 2015 at 3:38 PM, Pravin Shelar <pshe...@nicira.com> wrote: >> On Wed, Apr 22, 2015 at 3:24 PM, Jesse Gross <je...@nicira.com> wrote: >>> On Wed, Apr 22, 2015 at 3:19 PM, Pravin Shelar <pshe...@nicira.com> wrote: >>>> On Wed, Apr 22, 2015 at 11:31 AM, Jesse Gross <je...@nicira.com> wrote: >>>>> On Wed, Apr 22, 2015 at 11:09 AM, Pravin Shelar <pshe...@nicira.com> >>>>> wrote: >>>>>> On Wed, Apr 22, 2015 at 10:52 AM, Jesse Gross <je...@nicira.com> wrote: >>>>>>> On Wed, Apr 22, 2015 at 10:03 AM, Pravin Shelar <pshe...@nicira.com> >>>>>>> wrote: >>>>>>>> On Mon, Apr 20, 2015 at 9:54 PM, Jesse Gross <je...@nicira.com> wrote: >>>>>>>>> On Mon, Apr 20, 2015 at 7:33 PM, Pravin Shelar <pshe...@nicira.com> >>>>>>>>> wrote: >>>>>>>>>> On Mon, Apr 20, 2015 at 5:56 PM, Jesse Gross <je...@nicira.com> >>>>>>>>>> wrote: >>>>>>>>>>> On Mon, Apr 20, 2015 at 3:54 PM, Pravin Shelar <pshe...@nicira.com> >>>>>>>>>>> wrote: >>>>>>>>>>>> On Mon, Apr 20, 2015 at 3:36 PM, Jesse Gross <je...@nicira.com> >>>>>>>>>>>> wrote: >>>>>>>>>>>>> On Mon, Apr 20, 2015 at 1:29 PM, Pravin B Shelar >>>>>>>>>>>>> <pshe...@nicira.com> wrote: >>>>>>>>>>>>>> diff --git a/datapath/linux/compat/stt.c >>>>>>>>>>>>>> b/datapath/linux/compat/stt.c >>>>>>>>>>>>>> new file mode 100644 >>>>>>>>>>>>>> index 0000000..209bf1a >>>>>>>>>>>>>> --- /dev/null >>>>>>>>>>>>>> +++ b/datapath/linux/compat/stt.c >>>>>>>>>>>>>> +static void update_headers(struct sk_buff *skb, bool head, >>>>>>>>>>>>>> + unsigned int l4_offset, unsigned >>>>>>>>>>>>>> int hdr_len, >>>>>>>>>>>>>> + bool ipv4, u32 tcp_seq) >>>>>>>>>>>>> [...] >>>>>>>>>>>>>> + skb->truesize = SKB_TRUESIZE(skb_end_offset(skb)) + >>>>>>>>>>>>>> skb->data_len; >>>>>>>>>>>>>> +} >>>>>>>>>>>>> >>>>>>>>>>>>> I wonder if there are any possible edge cases with resetting >>>>>>>>>>>>> truesize >>>>>>>>>>>>> where the packet is still in someone's transmit queue (such as if >>>>>>>>>>>>> we >>>>>>>>>>>>> are looping back packet). Do we need to orphan it first? >>>>>>>>>>>>> >>>>>>>>>>>> ok, I will orphan it in update_headers. >>>>>>>>>>> >>>>>>>>>>> Just to clarify - I was mostly just thinking aloud on orphaning it. >>>>>>>>>>> I'm not totally sure if that is the right thing to do or if this is >>>>>>>>>>> the right place to do it. I'm not sure what the conceptual >>>>>>>>>>> justification would be for it and it could potentially result in the >>>>>>>>>>> sender's buffers not being properly limited. Perhaps not resetting >>>>>>>>>>> the >>>>>>>>>>> truesize is the right thing too... >>>>>>>>>>> >>>>>>>>>> I have seen warning msg if we do no keep truesize update along with >>>>>>>>>> changes to skb. >>>>>>>>> >>>>>>>>> Hmm, interesting, what is the warning? I don't think that I have seen >>>>>>>>> that before. >>>>>>>> >>>>>>>> Actually skb_try_coalesce() is updating it correctly. so there no need >>>>>>>> to change truesize anymore. I will update patch accordingly. >>>>>>> >>>>>>> That's much nicer. I also checked and other receive side code (like >>>>>>> TCP input) doesn't worry about the case where a local sender may still >>>>>>> be accounting for the packet since any type of loopback device does >>>>>>> call skb_orphan() in some form. >>>>>>> >>>>>>> I hate to bring this up but what about on transmit? In cases where we >>>>>>> merge or split skbs (skb_try_coalesce() and normalize_frag_list() >>>>>>> respectively) we do track the truesize for correctly for the result >>>>>>> but the individual pieces might not have the right destructors or >>>>>>> might not have their truesize updated for the destructor they do have. >>>>>> >>>>>> How about about we update merged skb stats (len, data_len, truesize) >>>>>> according to *delta_truesize we get from skb_try_coalesce() and then >>>>>> free the skb? >>>>> >>>>> I think that would work for the skb_try_coalesce() case (although I >>>>> would only worry about truesize, not the lengths). For >>>>> normalize_frag_list() I think we would either have to add a destructor >>>>> or not update truesize. I'm not sure that the condition where we have >>>>> frag_lists and a destructor actually ever happens in practice though. >>>> >>>> I am not sure why do we need a destructor for normalize_frag_list >>>> changes. Even with the changes in the function truesize is consistent >>>> for given skb memory usage. >>> >>> If you have a packet with a frag_list, all of the memory for the >>> individual elements will be accounted for in the truesize in the top >>> level skb. This skb could also be accounted to some socket and have a >>> destructor. When we break apart the list, we remove the truesize from >>> the head because the memory is now part of individual packets. >>> However, the destructor is presumably only on the head and so only its >>> memory will be removed from the socket accounting when it is freed but >>> not each of the other skbs that came from it. >> >> ok. In that case we can not have our own destructor since there is one >> already (I am not sure if we can use skb->cb to restore original). not >> changing true size can complicate skb coalesce, since it does update >> truesize. Easy option would be orphan skb if we are going to coalesce >> its fragments. > > I'm not sure that we need our own destructor. What do you think about > just replicating the original destructor onto each of the newly > generated skbs? > In that case we assume there is no state associated with the skb, that might not be always true.
> Completely unrelated but I noticed that there is one discrepancy > between this implementation and the STT protocol spec. The spec > recommends that the TCP window be set to zero but this sets it to > 0xffff. I only vaguely remember setting the window size to something > non-zero but we should check that and update one or the other. ok. let me check it. _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev