It doesn't help the OP at all, but this is why (thus far, anyway) I 
overwhelmingly prefer wavelength transport to anything switched: you can't have 
over-subscription or congestion issues on a wavelength. 




----- 
Mike Hammett 
Intelligent Computing Solutions 
http://www.ics-il.com 

Midwest-IX 
http://www.midwest-ix.com 

----- Original Message -----

From: "David Hubbard" <dhubb...@dino.hostasaurus.com> 
To: "Nanog@nanog.org" <nanog@nanog.org> 
Sent: Thursday, August 31, 2023 10:55:19 AM 
Subject: Lossy cogent p2p experiences? 



Hi all, curious if anyone who has used Cogent as a point-to-point provider has 
gone through packet loss issues with them and was able to resolve them 
successfully? I’ve got a non-rate-limited 10 gig circuit between two geographic 
locations with about 52 ms of latency. Mine is set up to support both jumbo 
frames and VLAN tagging. I do know Cogent packetizes these circuits, so they’re 
not like waves, and that the expected single-session TCP performance may be 
limited to a few gbit/sec, but I should otherwise be able to fully utilize the 
circuit given enough flows. 
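
That single-flow ceiling falls straight out of the bandwidth-delay product. A
quick sketch using the figures above (10 Gbit/s, 52 ms); the 16 MB example
window is my own illustrative assumption, roughly what a moderately tuned TCP
stack might offer:

```python
# Bandwidth-delay product math for the circuit described above:
# 10 Gbit/s at 52 ms RTT. The 16 MB window is an assumption for
# illustration, not a measured value from the circuit.
rtt_s = 0.052
link_bps = 10e9

# Window a single TCP flow needs to keep the pipe full:
bdp_bytes = link_bps * rtt_s / 8
print(f"window to fill the pipe: {bdp_bytes / 1e6:.0f} MB")  # 65 MB

# Conversely, the throughput ceiling a given window imposes:
window_bytes = 16 * 1024 * 1024
ceiling_bps = window_bytes * 8 / rtt_s
print(f"ceiling with a 16 MB window: {ceiling_bps / 1e9:.2f} Gbit/s")  # 2.58
```

So "a few gbit/sec per flow" is about what untuned stacks should see at this
RTT, and multiple flows should be able to fill the circuit.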

The circuit went live earlier this year and had zero issues at first. Testing 
with common tools like iperf would push several gbit/sec of TCP traffic over 
single flows, even without an optimized TCP stack; using parallel flows or UDP 
we could easily get close to wire speed. Starting about ten weeks ago we saw a 
significant slowdown, sometimes outright failure, of bursty data-replication 
tasks between equipment using this circuit. Rounds of testing demonstrate that 
new flows often experience significant initial packet loss, several thousand 
packets, and then ongoing lesser loss every five to ten seconds. At times we 
can’t do better than 50 Mbit/sec, and it’s rare to exceed a gigabit unless we 
run a bunch of streams with a lot of tuning. With UDP we also see the loss, but 
can still push many gigabits through with one sender, or wire speed with 
several nodes. 

For equipment that doesn’t use a tunable TCP stack, such as storage arrays or 
VMware, the retransmits completely ruin performance or result in ongoing 
failure we can’t overcome. 

Cogent support has been about as bad as you can get: everything is great, clean 
your fiber, iperf isn’t a good test, install a physical loop (oh wait, we don’t 
want that, go pull it back off), and new updates come at three-to-seven-day 
intervals. If the performance had never been good to begin with I’d have 
just attributed this to their circuits, but since it worked until late June, I 
know something has changed. I’m hoping someone else has run into this and maybe 
knows of some hints I could give them to investigate. To me it sounds like 
there’s a rate limiter / policer defined somewhere in the circuit, or an 
overloaded interface/device we’re forced to traverse, but they assure me this 
is not the case and claim to have destroyed and rebuilt the logical circuit. 
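
The policer theory does fit the symptoms: a new flow bursting above a policed
rate drains the burst bucket and then takes heavy loss, and every time a
congestion-controlled sender ramps back up it hits the limit again, which
would recur on a few-second cycle. A toy token-bucket simulation (all numbers
invented purely for illustration, nothing here is known about Cogent's
configuration):

```python
# Toy token-bucket policer with entirely hypothetical numbers, just to
# show the shape of the loss it produces against a fast sender.
def police(arrivals_s, rate_bps, bucket_bytes, pkt_bytes=9000):
    """Return indices of packets a token-bucket policer would drop."""
    tokens, last_t, dropped = float(bucket_bytes), 0.0, []
    for i, t in enumerate(arrivals_s):
        tokens = min(bucket_bytes, tokens + (t - last_t) * rate_bps / 8)
        last_t = t
        if tokens >= pkt_bytes:
            tokens -= pkt_bytes          # admitted
        else:
            dropped.append(i)            # policed
    return dropped

# Jumbo frames arriving at 10 Gbit/s line rate into an assumed 1 Gbit/s
# policer with an assumed 1 MB burst bucket:
arrivals = [i * 9000 * 8 / 10e9 for i in range(10_000)]
drops = police(arrivals, rate_bps=1e9, bucket_bytes=1_000_000)
print(f"first drop at packet {drops[0]}, total dropped {len(drops)}")
```

The first ~120 packets sail through while the bucket drains, then drops become
heavy and sustained, which is the "clean at first, then thousands of losses on
each burst" signature described above.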

Thanks! 
