Re: [9fans] fossil+venti performance question

2015-05-09 Thread erik quanstrom
> Looking at the first few bytes in each direction of the initial TCP
> handshake (with tcpdump) I see:
> 
> 0x:  4500 0030 24da   <= from plan9 to freebsd
> 
> 0x:  4500 0030 d249 4000  <= from freebsd to plan9
> 
> Looks like FreeBSD always sets the DF (don't fragment) bit
> (0x40 in byte 6), while plan9 doesn't (byte 6 is 0x00).
> 
> Maybe plan9 should set the DF (don't fragment) bit in the IP
> header and try to do path MTU discovery? Either by default or
> under some ctl option.

easy enough until one encounters devices that don't send icmp
responses because it's not implemented, or somehow considered
"secure" that way.  

- erik



Re: [9fans] fossil+venti performance question

2015-05-09 Thread cinap_lenrek
yes, but i was not referring to the adjusting, which isn't changed here; only
the tcpmtu() call that got added.

yes, it *should* not make any difference, but maybe we're missing
something. at worst it makes the code more confusing and causes bugs in
the future, because one of the initializations of mss is a lie without
any effect.

--
cinap



Re: [9fans] fossil+venti performance question

2015-05-09 Thread erik quanstrom
On Fri May  8 20:12:57 PDT 2015, cinap_len...@felloff.net wrote:
> do we really need to initialize tcb->mss to tcpmtu() in procsyn()?
> as i see it, procsyn() is called only when tcb->state is Syn_sent,
> which only should happen for client connections doing a connect, in
> which case tcpsndsyn() would have initialized tcb->mss already no?

i think there was a subtle reason for this, but i don't recall.  a real
reason for setting it here is that it makes the code easier to reason
about, imo.

there are a couple problems with the patch as it stands.  they are
inherited from previous mistakes.

* the setting of tpriv->stats[Mss] is bogus.  it's not shared between
connections.  it is also v4 only.

* so, mss should be added to each tcp connection's status file.

* the setting of tcb->mss in tcpincoming is not correct.  tcb->mss is
set by the SYN, not by the ACK, and may not be reset.  (see snoopy below.)

* the SYN-ACK needs to send the local mss, not echo the remote mss.
asymmetry is "fine" on the other side, even if ip/tcp.c isn't smart enough to
keep tx and rx mss separate.  (scare quotes = untested; there may be
some performance niggles if the sender is sending legal packets larger than
tcb->mss.)

my patch to nix is below.  i haven't submitted it yet.

- erik

---
005319 ms
ether(s=a0369f1c3af7 d=0cc47a328da4 pr=0800 ln=62)
ip(s=10.1.1.8 d=10.1.1.9 id=ee54 frag= ttl=255 pr=6 ln=48)
tcp(s=38903 d=17766 seq=3552109414 ack=0 fl=S win=65535 ck=d68e ln=0 opt4=(mss 1460) opt3=(wscale 4) opt=NOOP)
005320 ms
ether(s=0cc47a328da4 d=a0369f1c3af7 pr=0800 ln=62)
ip(s=10.1.1.9 d=10.1.1.8 id=54d3 frag= ttl=255 pr=6 ln=48)
tcp(s=17766 d=38903 seq=441373010 ack=3552109415 fl=AS win=65535 ck=eadc ln=0 opt4=(mss 1460) opt3=(wscale 4) opt=NOOP)

---

/n/dump/2015/0509/sys/src/nix/ip/tcp.c:491,501 - /sys/src/nix/ip/tcp.c:491,502
s = (Tcpctl*)(c->ptcl);
  
return snprint(state, n,
-   "%s qin %d qout %d rq %d.%d srtt %d mdev %d sst %lud cwin %lud swin %lud>>%d rwin %lud>>%d qscale %d timer.start %d timer.count %d rerecv %d katimer.start %d katimer.count %d\n",
+   "%s qin %d qout %d rq %d.%d mss %d srtt %d mdev %d sst %lud cwin %lud swin %lud>>%d rwin %lud>>%d qscale %d timer.start %d timer.count %d rerecv %d katimer.start %d katimer.count %d\n",
tcpstates[s->state],
c->rq ? qlen(c->rq) : 0,
c->wq ? qlen(c->wq) : 0,
s->nreseq, s->reseqlen,
+   s->mss,
s->srtt, s->mdev, s->ssthresh,
s->cwind, s->snd.wnd, s->rcv.scale, s->rcv.wnd, s->snd.scale,
s->qscale,
/n/dump/2015/0509/sys/src/nix/ip/tcp.c:843,854 - /sys/src/nix/ip/tcp.c:844,857
  
  /* mtu (- TCP + IP hdr len) of 1st hop */
  static int
- tcpmtu(Proto *tcp, uchar *addr, int version, uint *scale)
+ tcpmtu(Proto *tcp, uchar *addr, int version, uint reqmss, uint *scale)
  {
+   Tcppriv *tpriv;
Ipifc *ifc;
int mtu;
  
ifc = findipifc(tcp->f, addr, 0);
+   tpriv = tcp->priv;
switch(version){
default:
case V4:
/n/dump/2015/0509/sys/src/nix/ip/tcp.c:855,865 - /sys/src/nix/ip/tcp.c:858,870
mtu = DEF_MSS;
if(ifc != nil)
mtu = ifc->maxtu - ifc->m->hsize - (TCP4_PKT + TCP4_HDRSIZE);
+   tpriv->stats[Mss] = mtu;
break;
case V6:
mtu = DEF_MSS6;
if(ifc != nil)
mtu = ifc->maxtu - ifc->m->hsize - (TCP6_PKT + TCP6_HDRSIZE);
+   tpriv->stats[Mss] = mtu + (TCP6_PKT + TCP6_HDRSIZE) - (TCP4_PKT + TCP4_HDRSIZE);
break;
}
/*
/n/dump/2015/0509/sys/src/nix/ip/tcp.c:868,873 - /sys/src/nix/ip/tcp.c:873,882
 */
*scale = Defadvscale;
  
+   /* our sending max segment size cannot be bigger than what he asked for */
+   if(reqmss != 0 && reqmss < mtu)
+   mtu = reqmss;
+ 
return mtu;
  }
  
/n/dump/2015/0509/sys/src/nix/ip/tcp.c:1300,1307 - /sys/src/nix/ip/tcp.c:1309,1314
  static void
  tcpsndsyn(Conv *s, Tcpctl *tcb)
  {
-   Tcppriv *tpriv;
- 
tcb->iss = (nrand(1<<16)<<16)|nrand(1<<16);
tcb->rttseq = tcb->iss;
tcb->snd.wl2 = tcb->iss;
/n/dump/2015/0509/sys/src/nix/ip/tcp.c:1314,1322 - /sys/src/nix/ip/tcp.c:1321,1327
tcb->sndsyntime = NOW;
  
/* set desired mss and scale */
-   tcb->mss = tcpmtu(s->p, s->laddr, s->ipversion, &tcb->scale);
-   tpriv = s->p->priv;
-   tpriv->stats[Mss] = tcb->mss;
+   tcb->mss = tcpmtu(s->p, s->laddr, s->ipversion, 0, &tcb->scale);
  }
  
  void
/n/dump/2015/0509/sys/src/nix/ip/tcp.c:1492,1498 - /sys/src/nix/ip/tcp.c:1497,1503
seg.ack = lp->irs+1;
seg.flags = SYN|ACK;
seg.urg = 0;
-   seg.mss = tcpmtu(tcp, lp->laddr, lp->version, &scale);
+   seg.mss = tcpmtu(tcp, 

Re: [9fans] fossil+venti performance question

2015-05-09 Thread erik quanstrom
On Fri May  8 20:12:57 PDT 2015, cinap_len...@felloff.net wrote:
> do we really need to initialize tcb->mss to tcpmtu() in procsyn()?
> as i see it, procsyn() is called only when tcb->state is Syn_sent,
> which only should happen for client connections doing a connect, in
> which case tcpsndsyn() would have initialized tcb->mss already no?

yes, we should.  the bug is that we confuse send mss and receive mss.
the sender's mss is the one we need to respect here.
tcpsndsyn() should not set the mss; the mss it calculates is for rx.

- erik



Re: [9fans] fossil+venti performance question

2015-05-09 Thread Lyndon Nerenberg

On May 9, 2015, at 7:43 AM, erik quanstrom  wrote:

> easy enough until one encounters devices that don't send icmp
> responses because it's not implemented, or somehow considered
> "secure" that way.

Oddly enough, I don't see this 'problem' in the real world.  And FreeBSD is far
from alone in always setting the DF bit.

The only place this bites is when you run into tiny shops with homegrown 
firewalls configured by people who don't understand networking or security.  
Me, I consider it a feature that these sites self-select themselves off the 
network.  I'm certainly no worse off for not being able to talk to them.




Re: [9fans] fossil+venti performance question

2015-05-09 Thread Devon H. O'Dell
2015-05-09 10:25 GMT-07:00 Lyndon Nerenberg :
>
>
> On May 9, 2015, at 7:43 AM, erik quanstrom  wrote:
>
> > easy enough until one encounters devices that don't send icmp
> > responses because it's not implemented, or somehow considered
> > "secure" that way.
>
> Oddly enough, I don't see this 'problem' in the real world.  And FreeBSD is 
> far from being alone in the always-set-DF bit.
>
> The only place this bites is when you run into tiny shops with homegrown 
> firewalls configured by people who don't understand networking or security.  
> Me, I consider it a feature that these sites self-select themselves off the 
> network.  I'm certainly no worse off for not being able to talk to them.

Or when your client is on a cell phone. Cell networks are the worst.



Re: [9fans] fossil+venti performance question

2015-05-09 Thread Lyndon Nerenberg

On May 9, 2015, at 10:30 AM, Devon H. O'Dell  wrote:

> Or when your client is on a cell phone. Cell networks are the worst.

Really?  Quite often I slave my laptop to my phone's LTE connection, and I 
never have problems with PMTU.  Both here (across western Canada) and in the UK.





Re: [9fans] fossil+venti performance question

2015-05-09 Thread Bakul Shah


> On May 9, 2015, at 10:25 AM, Lyndon Nerenberg  wrote:
> 
> 
>> On May 9, 2015, at 7:43 AM, erik quanstrom  wrote:
>> 
>> easy enough until one encounters devices that don't send icmp
>> responses because it's not implemented, or somehow considered
>> "secure" that way.
> 
> Oddly enough, I don't see this 'problem' in the real world.  And FreeBSD is 
> far from being alone in the always-set-DF bit.
> 
> The only place this bites is when you run into tiny shops with homegrown 
> firewalls configured by people who don't understand networking or security.  
> Me, I consider it a feature that these sites self-select themselves off the 
> network.  I'm certainly no worse off for not being able to talk to them.

Network admins not understanding ICMP was far more common 20 years ago. Now the 
game has changed. At any rate no harm in trying PMTU discovery as an option 
(other than a SMOP).


Re: [9fans] fossil+venti performance question

2015-05-09 Thread Devon H. O'Dell
2015-05-09 10:35 GMT-07:00 Lyndon Nerenberg :
>
> On May 9, 2015, at 10:30 AM, Devon H. O'Dell  wrote:
>
>> Or when your client is on a cell phone. Cell networks are the worst.
>
> Really?  Quite often I slave my laptop to my phone's LTE connection, and I 
> never have problems with PMTU.  Both here (across western Canada) and in the 
> UK.

There are lots of hacks all over the Internet to deal with various
brokenness on the carrier<->carrier side of things where one end is a
cell network. Haven't seen anything come up super recently, but had to
help debug some brokenness as recently as a year and a half ago that
turned out to be some cell network with really old hardware that
didn't do PMTU correctly, causing TLS connections to drop or die. IIRC
this particular case was in France, but I also seem to recall the same
issue in northern England and perhaps Ireland.



Re: [9fans] fossil+venti performance question

2015-05-09 Thread erik quanstrom
for what it's worth, the original newreno work's tcp does not have the mtu
bug.  on an 8-processor system i have around here i get

bwc; while() nettest -a 127.1
tcp!127.0.0.1!40357 count 10; 81920 bytes in 1.505948 s @ 519 MB/s (0ms)
tcp!127.0.0.1!47983 count 10; 81920 bytes in 1.377984 s @ 567 MB/s (0ms)
tcp!127.0.0.1!53197 count 10; 81920 bytes in 1.299967 s @ 601 MB/s (0ms)
tcp!127.0.0.1!61569 count 10; 81920 bytes in 1.418073 s @ 551 MB/s (0ms)

however, after fixing things so the initial cwind isn't hosed, i get a little
better story:

bwc; while() nettest -a 127.1
tcp!127.0.0.1!54261 count 10; 81920 bytes in .5947659 s @ 1.31e+03 MB/s (0ms)

boo yah!  not bad for trying to clean up some constants.

- erik



Re: [9fans] fossil+venti performance question

2015-05-09 Thread erik quanstrom
> however, after fixing things so the initial cwind isn't hosed, i get a little
> better story:

so, actually, i think this is the root cause.  the initial cwind is misset for
loopback.  i bet that the symptom folks will see is that /net/tcp/stats shows
fragmentation when performance sucks.  evidently there is a backoff bug in
sources' tcp, too.

i'd love confirmation of this.  

- erik