On Wed, 30 May 2007 10:49:54 +0300 (EEST) "Ilpo Järvinen" <[EMAIL PROTECTED]> wrote:
> On Tue, 29 May 2007, Stephen Hemminger wrote: > > > On Tue, 29 May 2007 20:23:45 +0200 > > "Lior Dotan" <[EMAIL PROTECTED]> wrote: > > > > > NTP was not running. I'm not sure what do you mean by fixing the -1. > > > The trace shows that vegas_cong_avoid() is called with -1, and the > > > only way it can happen is from tcp_clean_rtx_queue() and the patch > > > should eliminate this. Another way of solving this is by checking > > > vegas_rtt_calc() and see if it gets -1 and handle it there. > > > Another thing that I don't understand is that some places like > > > tcp_ack() declare seq_rtt as signed and some vegas_cong_avoid() > > > declare it as unsigned. Shouldn't it be declared always as signed? > > > > > > On 5/29/07, Stephen Hemminger <[EMAIL PROTECTED]> wrote: > > > > On Tue, 29 May 2007 12:18:19 +0300 > > > > "Lior Dotan" <[EMAIL PROTECTED]> wrote: > > > > > > > > > Hi, > > > > > > > > > > I had a divide by zero on kernel 2.4.33 running with Vegas enabled. > > > > I don't think anyone has backported the other vegas fixes from 2.6.21 > > to 2.4. > > > > > > For 2.6.22, rtt_calc doesn't exist, instead pkts_acked is called. > > I tried to figure this one out yesterday, and it seems to me that divide > by zero should not occur with 2.6.22 (nor with 2.6.21 code), no matter > how I try to approach vegas... > > > The following should be added to avoid getting bogus timestamp values > > from retransmits. > > ...but this is unreliable timestamps problem is a real one that originates > from API merge commit 164891aadf1721fca4dce473bb0e0998181537c6 of yours > (see some though about it below). I suppose there are now similar concerns > about timestamp validity in other cc modules than vegas too. > > > --- a/net/ipv4/tcp_vegas.c 2007-05-02 12:26:35.000000000 -0700 > > +++ b/net/ipv4/tcp_vegas.c 2007-05-29 14:06:26.000000000 -0700 > > @@ -117,6 +117,10 @@ void tcp_vegas_pkts_acked(struct sock *s > > struct vegas *vegas = inet_csk_ca(sk); > > u32 vrtt; > > > > + /* Ignore retransmits */ > > + if (unlikely(cnt == 0)) > > + return; > > + > > /* Never allow zero rtt or baseRTT */ > > vrtt = ktime_to_us(net_timedelta(last)) + 1; > > ...I don't think this works, because cnt won't be zero ever because to > make the call some skbs were cleaned from the queue (isn't that what > pkts_acked stands for?). There is as if (acked&FLAG_ACKED) guard before > making the pkts_acked call to cc modules, thus delta in packets_out will > always be greater than 0. Hmm, now that I realize this, I would say that > checks for > 0 in pkts_acked are entirely bogus (in the current code), > hmm, that means I have more things to cleanup in tcp-2.6, at least, bic, > cubic and westwood do them... :-). > > I think the code did a right thing before your api merge, since it called > rtt callback only if FLAG_RETRANS_DATA_ACKED was not set (and pkts_acked > always), i.e., when TCP cannot know if it was the rexmission that > triggered the cumulative ACK or any original transmission, no new RTT > measurement should be made. Consider also the fact that, there might be a > non rexmitted skb after rexmitted one, which Lior's patch doesn't do > right way at all since the FLAG_DATA_ACKED would again get set (no > div-by-zero would have occured then but an unreliable timestamp would > have been givento vegas in 2.4). > > The number of packets that got cumulatively acked is never a right metric > to measure whether the cumulative ACK originated from rexmitted skbs or > not. ...Perhaps the FLAG_RETRANS_DATA_ACKED flag needs to be passed > somehow to cc modules? > > What about the mixed case where some retransmitted data and new data was covered by one ack. I would rather keep the merge, but think about the cases more carefully. -- Stephen Hemminger <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html