On Sun May 10 14:36:15 PDT 2015, cinap_len...@felloff.net wrote:
> how is this the opposite? your patch shows the tcb->mss init being removed
> completely from tcpincoming().
>
> - /* our sending max segment size cannot be bigger than what he asked for */
> - if(lp->mss != 0 && lp->ms
how is this the opposite? your patch shows the tcb->mss init being removed
completely from tcpincoming().
- /* our sending max segment size cannot be bigger than what he asked for */
- if(lp->mss != 0 && lp->mss < tcb->mss) {
- tcb->mss = lp->mss;
- tpriv
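For reference, the clamp in the removed lines amounts to something like the
following standalone sketch (not the actual ip/tcp.c code: mtumss stands in
for the value tcpmtu() derives from the local interface, peermss for the MSS
the peer advertised in its SYN, i.e. lp->mss above):

	/* choose our send MSS: start from the local MTU-derived value,
	 * then never exceed what the peer asked for */
	int
	sendmss(int mtumss, int peermss)
	{
		int mss;

		mss = mtumss;
		if(peermss != 0 && peermss < mss)
			mss = peermss;
		return mss;
	}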
> 2.a) tcpiput() gets an ACK packet for a Listening connection, calls
> tcpincoming().
> 2.b) tcpincoming() looks in limbo, finds lp, and makes a new connection.
> 3.c) initialize our connection's tcb->mss.
>
> > * the setting of tcb->mss in tcpincoming is not correct, tcb->mss is
> > set by SYN, not b
On Sun May 10 10:58:55 PDT 2015, 0in...@gmail.com wrote:
> >> however, after fixing things so the initial cwind isn't hosed, i get a
> >> little better story:
> >
> > so, actually, i think this is the root cause. the initial cwind is misset
> > for loopback.
> > i bet that the symptom folks will
> * the SYN-ACK needs to send the local mss, not echo the remote mss.
> asymmetry is "fine" on the other side, even if ip/tcp.c isn't smart enough to
> keep tx and rx mss separate. (scare quotes = untested, there may be
> some performance niggles if the sender is sending legal packets larger than
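A sketch of the asymmetry described above, using hypothetical field names
(ip/tcp.c keeps a single tcb->mss, so this only illustrates the idea, not the
existing data structure):

	/* the two directions need not agree: we advertise an MSS derived
	 * from our own interface MTU, and we honour the MSS the peer
	 * advertised when sizing the segments we send */
	typedef struct Segsize Segsize;
	struct Segsize {
		int	advertise;	/* what goes in our SYN/SYN-ACK: from tcpmtu() */
		int	send;		/* cap on our segments: the peer's advertised MSS */
	};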
>> however, after fixing things so the initial cwind isn't hosed, i get a
>> little better story:
>
> so, actually, i think this is the root cause. the initial cwind is misset for
> loopback.
> i bet that the symptom folks will see is that /net/tcp/stats shows
> fragmentation when
> performance
> however, after fixing things so the initial cwind isn't hosed, i get a little
> better story:
so, actually, i think this is the root cause. the initial cwind is misset for
loopback.
i bet that the symptom folks will see is that /net/tcp/stats shows
fragmentation when performance sucks. evide
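Since the MSS feeds directly into the initial window, here is a standalone
sketch of the conventional RFC 3390 initial-window computation; whether
ip/tcp.c uses exactly this formula isn't shown in these messages, but it
illustrates how a bad mss mis-sizes cwind from the start:

	/* RFC 3390 initial window: min(4*mss, max(2*mss, 4380)) bytes */
	int
	initcwind(int mss)
	{
		int iw;

		iw = 2*mss;
		if(iw < 4380)
			iw = 4380;
		if(iw > 4*mss)
			iw = 4*mss;
		return iw;
	}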
for what it's worth, the original newreno work tcp does not have the mtu
bug. on an 8 processor system i have around here i get
bwc; while() nettest -a 127.1
tcp!127.0.0.1!40357 count 10; 81920 bytes in 1.505948 s @ 519 MB/s (0ms)
tcp!127.0.0.1!47983 count 10; 81920 bytes in 1.3779
2015-05-09 10:35 GMT-07:00 Lyndon Nerenberg :
>
> On May 9, 2015, at 10:30 AM, Devon H. O'Dell wrote:
>
>> Or when your client is on a cell phone. Cell networks are the worst.
>
> Really? Quite often I slave my laptop to my phone's LTE connection, and I
> never have problems with PMTU. Both her
> On May 9, 2015, at 10:25 AM, Lyndon Nerenberg wrote:
>
>
>> On May 9, 2015, at 7:43 AM, erik quanstrom wrote:
>>
>> easy enough until one encounters devices that don't send icmp
>> responses because it's not implemented, or somehow considered
>> "secure" that way.
>
> Oddly enough, I don'
On May 9, 2015, at 10:30 AM, Devon H. O'Dell wrote:
> Or when your client is on a cell phone. Cell networks are the worst.
Really? Quite often I slave my laptop to my phone's LTE connection, and I
never have problems with PMTU. Both here (across western Canada) and in the UK.
2015-05-09 10:25 GMT-07:00 Lyndon Nerenberg :
>
>
> On May 9, 2015, at 7:43 AM, erik quanstrom wrote:
>
> > easy enough until one encounters devices that don't send icmp
> > responses because it's not implemented, or somehow considered
> > "secure" that way.
>
> Oddly enough, I don't see this 'pro
On May 9, 2015, at 7:43 AM, erik quanstrom wrote:
> easy enough until one encounters devices that don't send icmp
> responses because it's not implemented, or somehow considered
> "secure" that way.
Oddly enough, I don't see this 'problem' in the real world. And FreeBSD is far
from being alon
On Fri May 8 20:12:57 PDT 2015, cinap_len...@felloff.net wrote:
> do we really need to initialize tcb->mss to tcpmtu() in procsyn()?
> as i see it, procsyn() is called only when tcb->state is Syn_sent,
> which only should happen for client connections doing a connect, in
> which case tcpsndsyn() w
On Fri May 8 20:12:57 PDT 2015, cinap_len...@felloff.net wrote:
> do we really need to initialize tcb->mss to tcpmtu() in procsyn()?
> as i see it, procsyn() is called only when tcb->state is Syn_sent,
> which only should happen for client connections doing a connect, in
> which case tcpsndsyn() w
yes, but i was not referring to the adjusting which isn't changed here. only
the tcpmtu() call that got added.
yes, it *should* not make any difference but maybe we're missing
something. at worst it makes the code more confusing and causes bugs in
the future because one of the initializations of mss i
> Looking at the first few bytes in each dir of the initial TCP
> handshake (with tcpdump) I see:
>
> 0x0000: 4500 0030 24da <= from plan9 to freebsd
>
> 0x0000: 4500 0030 d249 4000 <= from freebsd to plan9
>
> Looks like FreeBSD always sets the DF (don't fragment) bit
>
> do we really need to initialize tcb->mss to tcpmtu() in procsyn()?
> as i see it, procsyn() is called only when tcb->state is Syn_sent,
> which only should happen for client connections doing a connect, in
> which case tcpsndsyn() would have initialized tcb->mss already no?
tcb->mss may still ne
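A side note on the tcpdump bytes quoted above: the DF bit lives in the IPv4
flags/fragment-offset field, bytes 6-7 of the header, so the FreeBSD packet
"4500 0030 d249 4000" has 0x4000 there and DF set, while the Plan 9 line is
cut off before those bytes. A minimal standalone check:

	#include <u.h>
	#include <libc.h>

	/* DF is bit 0x4000 of the 16-bit flags/fragment-offset field,
	 * i.e. bytes 6-7 of the IPv4 header */
	int
	hasdf(uchar *iphdr)
	{
		return ((iphdr[6]<<8 | iphdr[7]) & 0x4000) != 0;
	}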
do we really need to initialize tcb->mss to tcpmtu() in procsyn()?
as i see it, procsyn() is called only when tcb->state is Syn_sent,
which only should happen for client connections doing a connect, in
which case tcpsndsyn() would have initialized tcb->mss already no?
--
cinap
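To make that ordering concrete: on cinap's reading, tcpsndsyn() picks tcb->mss
before the SYN goes out and procsyn() only runs in Syn_sent, so a re-derivation
there is at most a fallback. A standalone sketch of that defensive form, with
localmss standing in for whatever tcpmtu() would return (hypothetical names,
not the kernel code):

	/* keep the value tcpsndsyn() chose; only fall back if it was
	 * somehow never set */
	int
	synsentmss(int mss, int localmss)
	{
		if(mss == 0)
			mss = localmss;
		return mss;
	}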
On Fri, 08 May 2015 21:24:13 +0200 David du Colombier <0in...@gmail.com> wrote:
> On the loopback medium, I suppose this is the opposite issue.
> Since the TCP stack didn't fix the MSS in the incoming
> connection, the programs sent multiple small 1500-byte
> IP packets instead of large 16384 IP p
I confirm - my old performance is back.
Thanks very much David.
-Steve
I've finally figured out the issue.
The slowness issue only appears on the loopback, because
it provides a 16384 MTU.
There is an old bug in the Plan 9 TCP stack, where the TCP
MSS doesn't take into account the MTU for incoming connections.
I originally fixed this issue in January 2015 for the Plan 9
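The numbers fit: with the usual 40 bytes of IPv4+TCP headers and no options,
an MSS derived from the loopback MTU would be 16384 - 40 = 16344, giving the
large packets mentioned above, while ignoring the MTU leaves the connection
sending ~1500-byte packets. A trivial sketch of the relationship (the exact
derivation in ip/tcp.c may differ):

	/* MSS implied by an interface MTU, assuming 20-byte IP and
	 * 20-byte TCP headers without options */
	int
	mssformtu(int mtu)
	{
		return mtu - 40;
	}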
> oh. possibly the queue isn't big enough, given the window size.
> it's using qpass on a Queue with Qmsg and if the queue is full,
> Blocks will be discarded.
I tried to increase the size of the queue, but no luck.
--
David du Colombier
On 8 May 2015 at 17:13, David du Colombier <0in...@gmail.com> wrote:
> Also, the issue is definitely related to the loopback.
> There is no problem when using an address on /dev/ether0.
>
oh. possibly the queue isn't big enough, given the window size. it's using
qpass on a Queue with Qmsg
and if
I've enabled tcp, tcpwin and tcprxmt logs, but there isn't
anything very interesting.
tcpincoming s 127.0.0.1!53150/127.0.0.1!53150 d 127.0.0.1!17034/127.0.0.1!17034 v 4/4
Also, the issue is definitely related to the loopback.
There is no problem when using an address on /dev/ether0.
cpu% cat /n
> cpu% cat /net/tcp/3/local
> 127.0.0.1!57796
> cpu% cat /net/tcp/3/remote
> 127.0.0.1!17034
> cpu% cat /net/tcp/3/status
> Established qin 0 qout 0 rq 0.0 srtt 80 mdev 40 sst 1048560 cwin
> 258192 swin 1048560>>4 rwin 1048560>>4 qscale 4 timer.start 10
> timer.count 10 rerecv 0 katimer.start 2400
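For reference, the swin/rwin notation in that status line appears to encode
the window scale: with qscale 4, the 16-bit window actually carried on the
wire is the printed value shifted down by four bits,

	1048560 >> 4 = 65535

so both directions offer the maximum unscaled window with a shift of 4, while
cwin (258192) is what congestion control currently allows.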
> NOW is defined as MACHP(0)->ticks, so this is a pretty coarse timer
> that can't go backwards on intel processors. this limits the timer's
> resolution to HZ, which on 9atom is 1000, and 100 on pretty much
> anything else. further limiting the resolution is the tcp retransmit
> timers which
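Put as a quick figure: one tick is 1000/HZ milliseconds, so HZ=1000 gives 1ms
ticks and HZ=100 gives 10ms ticks, and any timer rounded to whole ticks,
including the TCP retransmit timers, can be no finer than that.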
On Tue May 5 15:54:45 PDT 2015, ara...@mgk.ro wrote:
> It's pretty interesting that at least three people all got exactly
> 150kB/s on vastly different machines, both real and virtual. Maybe the
> number comes from some tick frequency?
i might suggest altering HZ and seeing if there is a throughp
On Wed May 6 14:28:03 PDT 2015, 0in...@gmail.com wrote:
> I got it!
>
> The regression was caused by the NewReno TCP
> change on 2013-01-24.
>
> https://github.com/0intro/plan9/commit/e8406a2f44
if you have proof, i'd be interested in reproduction of the issue from the
original source, or
perh
On Wed May 6 15:30:24 PDT 2015, charles.fors...@gmail.com wrote:
> On 6 May 2015 at 22:28, David du Colombier <0in...@gmail.com> wrote:
>
> > Since the problem only happens when Fossil or vacfs are running
> > on the same machine as Venti, I suppose this is somewhat related
> > to how TCP behaves
On 6 May 2015 at 23:35, Steven Stallion wrote:
> Were these the changes that erik submitted?
I don't think so. Someone else submitted a different set of tcp changes
independently much earlier.
Definitely interesting, and explains why I've never seen the regression (I
switched to a dedicated venti server a couple of years ago). Were these the
changes that erik submitted? ISTR him working on reno bits somewhere around
there...
On Wed, May 6, 2015 at 4:28 PM, David du Colombier <0in...@gma
On 6 May 2015 at 22:28, David du Colombier <0in...@gmail.com> wrote:
> Since the problem only happens when Fossil or vacfs are running
> on the same machine as Venti, I suppose this is somewhat related
> to how TCP behaves with the loopback.
>
Interesting. That would explain the clock-like delays.
Since the problem only happens when Fossil or vacfs are running
on the same machine as Venti, I suppose this is somewhat related
to how TCP behaves with the loopback.
--
David du Colombier
I got it!
The regression was caused by the NewReno TCP
change on 2013-01-24.
https://github.com/0intro/plan9/commit/e8406a2f44
--
David du Colombier
On 6 May 2015 at 21:55, David du Colombier <0in...@gmail.com> wrote:
> However, now I'm sure the issue was caused by a kernel
> change in 2013.
>
> There is no problem when running a kernel from early 2013.
>
Welly, welly, welly, well. That is interesting.
Just to be sure, I tried again, and the issue is not related
to the lock change on 2013-09-19.
However, now I'm sure the issue was caused by a kernel
change in 2013.
There is no problem when running a kernel from early 2013.
--
David du Colombier
It's pretty interesting that at least three people all got exactly
150kB/s on vastly different machines, both real and virtual. Maybe the
number comes from some tick frequency?
--
Aram Hăvărneanu
Yes, I'm pretty sure it's not related to Fossil, since it happens with
vacfs as well.
Also, Venti was pretty much unchanged during the last few years.
I suspected it was related to the lock change on 2013-09-19.
https://github.com/0intro/plan9/commit/c4d045a91e
But I remember I tried to revert t
semlocks?
anyway, should not be too hard to figure out with /n/dump
--
cinap
On 5 May 2015 at 16:38, David du Colombier <0in...@gmail.com> wrote:
> > How many times do you time it on each machine?
>
> Maybe ten times. The results are always the same within ~5%.
> Also, I restarted vacfs between each try.
It was the effect of the ram caches that prompted the question.
My experi
> I too see this, and feel, no proof, that things used to be better. I.e. the
> first time I read a file from venti it is very, very slow. subsequent reads
> from the ram cache are quick.
>
> I think venti used to be faster a few years ago. maybe another effect of this
> is the boot time seems s
I too see this, and feel, no proof, that things used to be better. I.e. the
first time I read a file from venti it is very, very slow. subsequent reads
from the ram cache are quick.
I think venti used to be faster a few years ago. maybe another effect of this
is the boot time seems slower than
>> I've just made some measurements when reading a file:
>>
>> Vacfs running on the same machine as Venti: 151 KB/s
>> Vacfs running on another machine: 5131 KB/s
>
>
> How many times do you time it on each machine?
Maybe ten times. The results are always the same within ~5%.
Also, I restarted vacfs betw
Thanks Aram.
> I have spent some time
> debugging this, but unfortunately, I couldn't find the root cause, and
> I just stopped using fossil.
I tried to measure the performance effect of replacing components:
1) mbr or GRUB
2) pbs or pbslba
3) sdata or sdvirtio (sdvirtio is imported from 9legacy
On 4 May 2015 at 19:51, David du Colombier <0in...@gmail.com> wrote:
>
> I've just made some measurements when reading a file:
>
> Vacfs running on the same machine as Venti: 151 KB/s
> Vacfs running on another machine: 5131 KB/s
How many times do you time it on each machine?
Thanks Anthony.
> I bet if you re-run the same test twice in a
> row, you’re going to see dramatically improved
> performance.
I tried re-running ‘iostats md5sum /386/9pcf’.
The repeated read is very fast:
the first read gives 152KB/s,
the second read gives 232MB/s.
> Your write performance in that test
Hello!
imho placing fossil, venti, isect, bloom and swap on a single drive is a bad
idea.
As written in http://plan9.bell-labs.com/sys/doc/venti/venti.html - "The
prototype Venti server is implemented for the Plan 9 operating system in
about 10,000 lines of C. The server runs on a dedicated dual 55
I'm experiencing the same issue as well.
When I launch vacfs on the same machine as Venti,
reading is very slow. When I launch vacfs on another
Plan 9 or Unix machine, reading is fast.
I've just made some measurements when reading a file:
Vacfs running on the same machine as Venti: 151 KB/s
Vacf
I have seen the same problem a few years back on about half of my
machines. The other half were fine. There was a 1000x difference in
performance between the good and bad machines. I have spent some time
debugging this, but unfortunately, I couldn't find the root cause, and
I just stopped using fos
The reason, in general:
In a fossil+venti setup, fossil runs (basically) as a
cache for venti. If your access just hits fossil, it’ll
be quick; if not, you hit the (significantly slower)
venti. I bet if you re-run the same test twice in a
row, you’re going to see dramatically improved
performance.
Hello, fans.
I’m running Plan 9(labs) on a public QEMU/KVM service.
My Plan 9 system has a slow read performance problem.
I ran ‘iostats md5sum /386/9pcf’, DMA is on, and the read result is 150KB/s,
but write performance is fast.
My Plan 9 system has a 200GB HDD, formatted with fossil+venti.
disk layout is