We have a moderately dense deployment of 100-Gig LR4 (both DWDM lambdas and
Juniper MX) around our WAN, and we don't clock any background input errors on
our interfaces unless there is an ongoing problem.  That said, we have
experienced issues with sub-millisecond link state changes between two
endpoints that are physically cross-connected to one another with no
intermediary Layer 1 (DWDM, etc.).  There doesn't seem to be rhyme or reason to
this, and we've looked at each lane extensively; so far, everything has been
inconclusive.  We also experienced some code issues on Juniper MPC3D-NGs
running 100-Gig and on our DWDM client ports, where timing would start to slip
and eventually cause the link to fail.  Both Juniper and the DWDM vendor found
code variances, which they patched.  We haven't had any such issues on Juniper
MPC5s, 7s, or the 10003 line cards.
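
For anyone chasing similar per-lane symptoms, a rough sketch of the kind of
per-lane DOM sanity check this involves (the lane readings and thresholds
below are illustrative example values, not our production numbers):

    # Rough sketch of a per-lane DOM sanity check for a 100G LR4 optic.
    # The lane readings and thresholds are made-up example values, not
    # production data; substitute whatever your DOM polling returns
    # (e.g. "show interfaces diagnostics optics" output or an SNMP table).
    lane_rx_dbm = {0: -1.8, 1: -2.1, 2: -9.7, 3: -2.0}  # Rx power per lane, dBm

    RX_LOW_WARN_DBM = -8.6   # example low-power warning threshold
    MAX_LANE_SKEW_DB = 3.0   # example max spread between best and worst lane

    for lane, rx in sorted(lane_rx_dbm.items()):
        if rx < RX_LOW_WARN_DBM:
            print(f"lane {lane}: rx {rx} dBm is below the warning threshold")

    skew = max(lane_rx_dbm.values()) - min(lane_rx_dbm.values())
    if skew > MAX_LANE_SKEW_DB:
        print(f"lane skew {skew:.1f} dB exceeds {MAX_LANE_SKEW_DB} dB")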

TL;DR:  In my experience, 100-Gig might require some more TLC than 10-Gig to
run clean and is more sensitive to variations in transport.  Others' mileage
may vary.

Best,
JJ Stonebraker  |  Associate Director
The University of Texas System | Office of Telecommunication Services
(512) 232-0888  | j...@ots.utsystem.edu

________________________________
From: NANOG <nanog-bounces+jjs=ots.utsystem....@nanog.org> on behalf of Graham 
Johnston <johnston.grah...@gmail.com>
Sent: Monday, July 19, 2021 12:19 PM
To: Saku Ytti <s...@ytti.fi>
Cc: nanog list <nanog@nanog.org>
Subject: Re: 100G, input errors and/or transceiver issues

Saku,

I don't at this point have long-term data compiled for the issues that we've
faced. That said, we have two 100G transport links with a regular background
level of input errors, hovering between 0.00055 and 0.00383 PPS on one link,
and between none and 0.00135 PPS on the other (the latter jumped to 0.03943
PPS over the weekend). The range is often associated with a particular
direction rather than being variable behavior of a single direction. The data
comes from the last 24 hours; the two referenced links are operated by
different providers on very different paths (opposite directions). Over
shorter distances, we've definitely seen input errors affect PNI connections
within a datacenter as well. In the case of the last PNI issue, the other
party swapped their transceiver and we didn't even physically touch our side;
I note this only to express that I don't think this is just a case of the
transceivers that we are sourcing.
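
For context, a quick back-of-the-envelope conversion of those rates into
per-day counts (simple arithmetic on the PPS figures above, sketched in
Python):

    # Back-of-the-envelope: the background input-error rates quoted above
    # (packets per second) converted to errored frames per day.
    SECONDS_PER_DAY = 86400

    rates_pps = {
        "link 1 low": 0.00055,
        "link 1 high": 0.00383,
        "link 2 high": 0.00135,
        "link 2 weekend spike": 0.03943,
    }

    for label, pps in rates_pps.items():
        print(f"{label}: {pps} PPS ~= {pps * SECONDS_PER_DAY:.0f} errors/day")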

Comparatively, other than clear transport system issues, I don't recall this
sort of thing at all with the 10G "wavelength" transport that we had purchased
for years prior. I put wavelength in quotes there knowing that it may have
been a while since our transport was a literal wavelength, as opposed to being
muxed into a 100G+ wavelength.

On Mon, 19 Jul 2021 at 12:01, Saku Ytti <s...@ytti.fi> wrote:
On Mon, 19 Jul 2021 at 19:47, Graham Johnston
<johnston.grah...@gmail.com> wrote:

Hey Graham,

> How commonly do other operators experience input errors with 100G interfaces?
> How often do you find that you have to change a transceiver out? Either for 
> errors or another reason.
> Do we collectively expect this to improve as 100G becomes more common and 
> production volumes increase in the future?

New rule. Share your own data before asking others to share theirs.

In the DC and SP markets, 100GE has dominated for several years now, so
'more common' rings odd to many. 112G SerDes is shipping on the
electrical side, and from a 100GE point of view there is nowhere more
mature to go. On the optical side, QSFP112 is really the only thing
left to cost-optimise 100GE.
We've had our share of MSA ambiguity issues with 100GE, but today 100GE
looks mature to our eyes in failure rates and compatibility. 1GE is
really hard to support, and 10GE is becoming problematic, in terms of
hardware procurement.


--
  ++ytti
