Some interesting new developments on this, independent of the divergent network 
equipment discussion. 😊

Cogent had a field engineer at the east coast location where my local loop 
(10gig wave) meets their equipment, i.e. (me – patch cable to loop provider’s 
wave equipment – wave – patch cable to Cogent equipment).  On the other end, 
the geographically distant west coast direction, it’s Cogent equipment to my 
equipment in the same facility with just patch cable.  They connected some 
model of EXFO’s NetBlazer FTBx 8880-series testing device to a port on their 
east coast network device, not disconnecting my circuit.  Originally, they were 
planning to have someone physically loop at their equipment at the other end, 
but I volunteered that my Arista gear supports a provider-facing loop at the 
transceiver level if they wanted to try that, so my loop, cabling, and 
transceiver could be part of the testing.

One direction at a time, they interrupted the point to point config to create a 
point to point between one direction of my gear, set to loopback mode, and the 
NetBlazer device.  The device was set to use five parallel streams.  In the 
close direction, where the third-party wave is involved, they ran at the full 
5 x 2 Gbps for thirty minutes with zero packets lost and no issues.  My 
monitoring confirmed this rate of port input was occurring, although oddly not 
output, but perhaps Arista doesn’t “see”/count the retransmitted packets in PHY 
loopback mode.

In the distant direction across their backbone, their equipment at the remote 
end, and the fiber patch cable to me, they tested at 9.5 Gbps for thirty 
minutes through my device in loopback mode.  The result was, of 2.6B packets 
sent, only 334 packets lost.  They configured for a 9.5 Gbps rate of testing, 
so five 1.9 Gbps streams.  Across the five streams, the report has a “frame 
loss” and an out of sequence section.  Zero out of sequence, but among the five 
streams, loss seconds / count were 3 / 26, 3 / 48, 1 / 5, 13 / 221, 1 / 34.  
I’m not familiar with this testing device, but to me that suggests it’s stating 
how many of the total seconds experienced loss, and the counted packet loss.  
So really the only one that stands out is the one with thirteen seconds where 
loss occurred, but the packet counts we’re talking about are minuscule.  Again, 
my monitoring at the interface level showed this 9.5 Gbps of testing occurring 
for the thirty minutes the report says.
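
As a quick sanity check on that reading of the report, the per-stream counts 
can be summed and compared with the headline figure.  The sketch below uses 
only the numbers quoted above; the variable names are made up for illustration, 
and the implied average frame size is rough because "2.6B" is a rounded count.  
The five counts do add up to the 334 lost packets reported overall, which is 
consistent with the second number in each pair being a lost-frame count.

```python
# Per-stream (loss_seconds, lost_frames) pairs as quoted from the report.
streams = [(3, 26), (3, 48), (1, 5), (13, 221), (1, 34)]

TOTAL_SENT = 2.6e9        # "2.6B packets sent" (rounded figure)
TEST_SECONDS = 30 * 60    # thirty-minute run
RATE_BPS = 9.5e9          # configured aggregate test rate

total_loss_seconds = sum(seconds for seconds, _ in streams)
total_lost = sum(count for _, count in streams)
loss_ratio = total_lost / TOTAL_SENT

# Rough implied average frame size; only as good as the rounded packet count.
avg_frame_bytes = RATE_BPS * TEST_SECONDS / 8 / TOTAL_SENT

print(f"lost frames summed over streams: {total_lost}")
print(f"seconds with loss, summed over streams: {total_loss_seconds}")
print(f"overall loss ratio: {loss_ratio:.3e}")
print(f"implied average frame size: ~{avg_frame_bytes:.0f} bytes")
```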

So, now I’m just completely confused.  How is this device, traversing the same 
equipment, ports, and cables, able to achieve far greater average throughput, 
and almost no loss, across a very long duration?  There are times I’ll be able 
to achieve nearly the same, but never for a test longer than ten seconds, as it 
just falls off from there.  For example, I did a five parallel stream TCP test 
with iperf just now and did achieve a net throughput of 8.16 Gbps with about 
1200 retransmits.  With the same five stream test run for a half hour like 
theirs, I got no better than 2.64 Gbps and 183,000 retransmits.
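
To put the half-hour iperf run on the same scale as the tester's loss figure, 
one rough approach is to estimate what fraction of TCP segments were 
retransmitted.  This is only a sketch: the 1448-byte segment size is an 
assumption (the post does not state MTU or MSS), iperf reports goodput rather 
than bytes on the wire, and a retransmit is not exactly the same thing as a 
lost packet.  The helper name is hypothetical.

```python
MSS_BYTES = 1448  # ASSUMPTION: typical TCP payload at a 1500-byte MTU; not stated in the post


def retransmit_fraction(throughput_bps, seconds, retransmits, mss_bytes=MSS_BYTES):
    """Approximate share of TCP segments that were retransmitted."""
    segments = throughput_bps * seconds / (mss_bytes * 8)
    return retransmits / segments


# Half-hour five-stream TCP run: 2.64 Gbps, 183,000 retransmits.
tcp_long_run = retransmit_fraction(2.64e9, 30 * 60, 183_000)

# Hardware tester over the distant direction: 334 lost of ~2.6B sent.
tester_loss = 334 / 2.6e9

print(f"iperf TCP retransmit fraction (approx): {tcp_long_run:.2e}")
print(f"tester frame loss ratio:                {tester_loss:.2e}")
print(f"ratio between the two (approx):         {tcp_long_run / tester_loss:.0f}x")
```

Under those assumptions the TCP run retransmits a few segments in every ten 
thousand, which is orders of magnitude above the loss the tester saw on the 
same path.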

iperf and UDP allow me to see loss at any rate of transmit exceeding ~140 Mbps, 
in just seconds, not a half hour.  To rule out my gear, I’m also able to 
perform the same tests from the same systems (both VM and physical) using 
public addresses and traversing the internet, as these are publicly connected 
systems.  I get far lower loss and much greater throughput on the internet 
path.  For example, a simple ten second test of a single stream at 400 Mbit 
UDP: 5 packets lost across the internet, 491 across the P2P.  Single stream TCP 
across the internet for ten seconds: 3.47 Gbps, 162 retransmits.  Across the 
P2P, this time at least, 637 Mbps, 3633 retransmits.
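
The internet-versus-P2P comparison can be reduced to a few ratios.  The sketch 
below takes the figures quoted above as-is; the 1460-byte UDP datagram size is 
an assumption (the post does not give it), so the loss percentages are 
approximate, while the ratios between the two paths do not depend on it.  All 
names are illustrative.

```python
UDP_PAYLOAD_BYTES = 1460  # ASSUMPTION: datagram size is not stated in the post


def udp_loss_percent(rate_bps, seconds, lost, payload_bytes=UDP_PAYLOAD_BYTES):
    """Approximate UDP loss percentage for a fixed-rate test."""
    sent = rate_bps * seconds / (payload_bytes * 8)
    return 100.0 * lost / sent


# Ten-second single-stream tests, figures as quoted.
internet = {"udp_lost": 5, "tcp_mbps": 3470, "tcp_retrans": 162}
p2p = {"udp_lost": 491, "tcp_mbps": 637, "tcp_retrans": 3633}

for name, r in (("internet", internet), ("P2P", p2p)):
    pct = udp_loss_percent(400e6, 10, r["udp_lost"])
    print(f"{name:8s} UDP loss ~{pct:.4f}%  "
          f"TCP {r['tcp_mbps']} Mbps, {r['tcp_retrans']} retransmits")

udp_loss_factor = p2p["udp_lost"] / internet["udp_lost"]
tcp_slowdown = internet["tcp_mbps"] / p2p["tcp_mbps"]
retrans_factor = p2p["tcp_retrans"] / internet["tcp_retrans"]

print(f"P2P UDP loss is ~{udp_loss_factor:.0f}x the internet path")
print(f"P2P TCP throughput is ~{tcp_slowdown:.1f}x lower")
print(f"P2P TCP retransmits are ~{retrans_factor:.0f}x higher")
```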

David



From: David Hubbard <dhubb...@dino.hostasaurus.com>
Date: Friday, September 1, 2023 at 10:19 AM
To: Nanog@nanog.org <nanog@nanog.org>
Subject: Re: Lossy cogent p2p experiences?
The initial and recurring packet loss occurs on any flow of more than ~140 
Mbit.  The fact that it’s loss-free under that rate is what furthers my opinion 
it’s config-based somewhere, even though they say it isn’t.

From: NANOG <nanog-bounces+dhubbard=dino.hostasaurus....@nanog.org> on behalf 
of Mark Tinka <mark@tinka.africa>
Date: Friday, September 1, 2023 at 10:13 AM
To: Mike Hammett <na...@ics-il.net>, Saku Ytti <s...@ytti.fi>
Cc: nanog@nanog.org <nanog@nanog.org>
Subject: Re: Lossy cogent p2p experiences?

On 9/1/23 15:44, Mike Hammett wrote:
and I would say the OP wasn't even about elephant flows, just about a network 
that can't deliver anything acceptable.

Unless Cogent are not trying to accept (and by extension, may not be able to 
guarantee) large Ethernet flows because they can't balance them across their 
various core links, end-to-end...

Pure conjecture...

Mark.
