On Thu, Aug 31, 2023 at 11:57 AM David Hubbard <dhubb...@dino.hostasaurus.com> wrote:

> Hi all, curious whether anyone who has used Cogent as a point-to-point
> provider has gone through packet loss issues with them and managed to
> resolve them. I’ve got a non-rate-limited 10 gig circuit between two
> geographic locations with about 52 ms of latency. Mine is set up to
> support both jumbo frames and VLAN tagging. I do know Cogent packetizes
> these circuits, so they’re not like waves, and the expected
> single-session TCP performance may be limited to a few Gbit/sec, but I
> should otherwise be able to fully utilize the circuit given enough flows.
>
> The circuit went live earlier this year and had zero issues. Testing
> with common tools like iperf would push several Gbit/sec of TCP traffic
> over single flows, even without an optimized TCP stack. Using parallel
> flows or UDP we could easily get close to wire speed. Starting about ten
> weeks ago we saw a significant slowdown, sometimes outright failure, of
> bursty data replication tasks between equipment using this circuit.
> Rounds of testing show that new flows often suffer significant initial
> packet loss of several thousand packets, followed by lesser ongoing loss
> every five to ten seconds. At times we can’t do better than 50 Mbit/sec,
> and it’s rare to reach a gigabit unless we run many streams with a lot
> of tuning. With UDP we also see the loss, but can still push many
> gigabits through with one sender, or wire speed with several nodes.
>
> For equipment that doesn’t have a tunable TCP stack, such as storage
> arrays or VMware, the retransmits completely ruin performance or cause
> ongoing failures we can’t overcome.
>
> Cogent support has been about as bad as it gets: everything is great,
> clean your fiber, iperf isn’t a good test, install a physical loop (oh
> wait, we don’t want that, go pull it back off), new updates at three- to
> seven-day intervals, and so on. If the performance had never been good
> to begin with, I’d have attributed this to their circuits, but since it
> worked until late June, I know something has changed. I’m hoping someone
> else has run into this and knows of hints I could give them to
> investigate. To me it sounds like there’s a rate limiter / policer
> defined somewhere in the circuit, or an overloaded interface/device
> we’re forced to traverse, but they assure me this is not the case and
> claim to have destroyed and rebuilt the logical circuit.
>
> Thanks!

Sure smells like port buffer issues somewhere in the middle (mismatched
deep/shallow buffers, or something configured to support jumbo frames but
with buffers not optimized for them).
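
Back-of-the-envelope, the numbers above are consistent with that: at 52 ms
a single flow needs roughly a 65 MB window to fill 10 Gbit/sec, and slow
start will emit that window as near-line-rate bursts that every buffer in
the path has to absorb. A minimal sketch of the arithmetic in Python, using
only the figures from the post (the 9000-byte jumbo MTU is my assumption):

    # Back-of-the-envelope numbers for the circuit described above:
    # 10 Gbit/s, ~52 ms RTT; the 9000-byte jumbo MTU is an assumption.
    LINK_BPS = 10e9   # circuit rate, bits/sec
    RTT_S = 0.052     # round-trip time, seconds
    MTU = 9000        # assumed jumbo frame size, bytes

    bdp = LINK_BPS / 8 * RTT_S                   # bytes in flight to fill the pipe
    print(f"BDP: {bdp / 1e6:.0f} MB")            # -> ~65 MB TCP window needed
    print(f"Frames in flight: {bdp / MTU:.0f}")  # -> ~7200 jumbo frames per RTT

    # Conversely, if loss caps the usable window at a few hundred kB:
    win_50m = 50e6 / 8 * RTT_S                   # window that yields 50 Mbit/s
    print(f"Window at 50 Mbit/s: {win_50m / 1e3:.0f} kB")  # -> ~325 kB

If something mid-path can only hold a few hundred kB before dropping, the
observed 50 Mbit/sec floor falls straight out of that math, which is why the
loss pattern matters more than any single throughput number.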
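
And if Cogent keeps dismissing iperf, a paced UDP probe with sequence
numbers is easy evidence to hand over: run it well below the circuit rate
and timestamp every gap, and both the initial burst loss and the
"loss every five to ten seconds" pattern will stand out on their own. A
rough sketch; the receiver address, port, and pacing rate are placeholders
of mine, not anything from the original post:

    # Rough paced-UDP loss probe: the sender numbers each datagram, the
    # receiver timestamps every gap, so loss at flow start and periodic
    # loss (every 5-10 s) both stand out. Address, port, and the
    # ~112 Mbit/s pacing (10k pps x 1400 bytes) are placeholder choices.
    import socket, struct, sys, time

    ADDR = ("198.51.100.10", 9000)   # placeholder receiver IP/port
    PAYLOAD = b"x" * 1392            # + 8-byte sequence header = 1400 bytes
    PPS = 10_000                     # paced far below the 10G circuit rate

    def sender():
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        seq, step, due = 0, 1.0 / PPS, time.monotonic()
        while True:
            s.sendto(struct.pack("!Q", seq) + PAYLOAD, ADDR)
            seq += 1
            due += step
            time.sleep(max(0.0, due - time.monotonic()))

    def receiver():
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.bind(("", ADDR[1]))
        expected = None
        while True:
            data, _ = s.recvfrom(2048)
            seq = struct.unpack("!Q", data[:8])[0]
            if expected is not None and seq > expected:
                print(f"{time.strftime('%H:%M:%S')} lost {seq - expected} pkts "
                      f"(seq {expected}..{seq - 1})")
            expected = seq + 1

    if __name__ == "__main__":
        receiver() if len(sys.argv) > 1 and sys.argv[1] == "recv" else sender()

Loss at ~100 Mbit/sec of paced UDP can't be blamed on TCP stacks or test
methodology, and a consistent timestamped period to the drops would point
at something stateful in the path rather than simple congestion.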