Hi Luc,

Thanks for your response and consideration.
Please see zzh> below.



Juniper Business Use Only
From: Luc André Burdet <laburdet.i...@gmail.com>
Sent: Friday, April 26, 2024 3:42 PM
To: Jeffrey (Zhaohui) Zhang <zzh...@juniper.net>; 'BESS' <bess@ietf.org>
Subject: Re: DF election text in RFC7432/7432bis and 
draft-ietf-bess-evpn-fast-df-recovery

Hi Jeffrey,

#3 is the one that should be updated, only the recovering PE starts a timer.

   3.  When the timer expires, each PE builds an ordered list of the IP
       addresses of all the PE nodes connected to the Ethernet segment
       (including itself), in increasing numeric value. ...
to:
   3.  Each PE builds an ordered list of the IP
       addresses of all the PE nodes connected to the Ethernet segment
       (including itself), in increasing numeric value. ...

The correction is really just to remove the statement about timer expiry.  All 
the peers build the list; only the recovering PE has a “3s window to receive 
routes”, which is meant to prevent a rapid reshuffle when the recovering PE 
receives 1, 2, 3... peer routes in succession.
With this change, the text is aligned with the RFC8584 FSM. In a nutshell: 
failures on the DF are resolved as fast as possible. Recoveries of the former 
DF depend on the timer.
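
For illustration only, here is a minimal Python sketch of the list-building step in #3. The function names and sample addresses are hypothetical. The final modulo step is not part of the text quoted above (which elides it with "..."); it is the default RFC 7432 procedure as commonly described (the PE with ordinal i is DF for VLAN V when V mod N = i) and is included only to show how the ordered list is used.

```python
import ipaddress


def ordered_pe_list(pe_addresses):
    """Return the PE IP addresses sorted in increasing numeric value."""
    return sorted(pe_addresses, key=lambda a: int(ipaddress.ip_address(a)))


def default_df(pe_addresses, vlan_id):
    """Pick the DF for a VLAN with the default modulo rule (V mod N).

    Assumption: this mirrors the default RFC 7432 procedure; it is not
    taken from the quoted step 3 text, which is truncated.
    """
    ordered = ordered_pe_list(pe_addresses)
    return ordered[vlan_id % len(ordered)]


# Hypothetical PEs attached to the same Ethernet segment.
peers = ["192.0.2.10", "192.0.2.2", "192.0.2.1"]
print(ordered_pe_list(peers))
print(default_df(peers, vlan_id=100))
```

Sorting on the integer value rather than the string matters here: a plain string sort would place 192.0.2.10 before 192.0.2.2.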

Zzh> Yes, this does align with RFC8584. The previous “upon timer expiry” 
wording added confusion (even though it would be better for everyone to wait 
for the timer; the only remaining issue would then be the transmission delay: 
propagation and scheduling in various parts of the machinery).
Zzh> I don’t understand the “In a nutshell: failures on the DF are resolved as 
fast as possible” part though. There is no DF failure here; we’re talking 
about a new ES route being originated, not a previously valid ES route being 
lost.

I don’t think the “timer should be the same on all PEs in ES” statement is 
harmful? It’s just good practice and that ‘should’ is not normative.

Zzh> It’s not harmful, just of no real use. That statement, together with the 
“upon timer expiry” wording, leads one to believe that all PEs do the 
election at roughly the same time (barring the transmission delay). Now that 
we’re making it clear that this is not the case, we might as well remove the 
unnecessary “timer should be the same on all PEs in ES” statement.

For Fast-DF-Recovery, it is not the CALC that is delayed, it is the 
application.  The calculation itself continues to be the same as in 
RFC7432/bis.  Only applying the result to HW is delayed on “other PEs”, which 
the (updated) FSM reflects.
Section 2.2 of the fast-df-recovery draft is correct as described: 
it refers to delaying the transition from DF_CALC to DF_DONE, which is not to 
be confused with the initial/discovery timer, which is a locally configured 
timer. The delay between DF_CALC and DF_DONE is driven by the SCT received in 
the ES route, not by local config.

Zzh> My previous reasoning: while this could be viewed as an implementation 
detail, it is implementation-wise easier and specification-wise cleaner to 
delay the DF_CALC itself. Every PE starts a timer that expires at the same 
absolute time, though the timer duration is different – depending on when a PE 
receives the new ES route with the SCT.
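
A minimal Python sketch of that idea, for illustration only: each PE derives its own timer duration from the absolute SCT and the moment it received the ES route, so all timers expire at the same absolute time. The function name, the millisecond units, and the sample values are assumptions, not taken from either draft.

```python
def wait_duration_ms(sct_ms, now_ms):
    """Return how long to wait so the timer expires at absolute time sct_ms.

    Returns 0 if the SCT has already passed when the route is received.
    """
    return max(0, sct_ms - now_ms)


# Hypothetical absolute SCT and per-PE receive times (milliseconds).
sct_ms = 1_000_000
receive_times_ms = {"PE1": 997_200, "PE3": 997_500}

durations = {pe: wait_duration_ms(sct_ms, t) for pe, t in receive_times_ms.items()}
expiries = {pe: receive_times_ms[pe] + durations[pe] for pe in receive_times_ms}

print(durations)  # different duration on each PE
print(expiries)   # same absolute expiry on every PE
```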

Zzh> Then I realized that the “skew” is probably what led to the current text, 
and that makes sense. However, upon further reading of the “skew” text, I think 
the following inconsistencies need to be corrected.
Zzh> On an existing PE, the skew depends on the DF role (which could be 
different for different VLANs on the same ES) according to Section 3:


   In fact, PE1 should carve slightly before PE2 (skew) to maintain the
   preference of minimal loss over duplicate traffic.  The previously
   inserted PE2 that is recovering performs both transitions DF to NDF
   and NDF to DF per VLANs at the timer's expiry.  Since the goal is to
   prevent duplicates, the original PE1, which received the SCT will
   apply:

   *  DF to NDF transition at t=SCT minus skew, where both PEs are NDF
      for 'skew' amount of time

   *  NDF to DF transition at t=SCT

Zzh> However, the following text in section 2 does not talk about the DF role 
at all. If one follows that paragraph, then even the existing PEs would be 
better off going through the DF_WAIT state (my previous reasoning). This 
paragraph should be clarified with the DF role implications.

   Upon receipt of that new BGP Extended Community, partner PEs can
   determine the service carving time of the newly insterted PE.  The
   notion of skew is introduced to eliminate any potential duplicate
   traffic or loops.  The receiving partner PEs add a skew (default =
   -10ms) to the Service Carving Time to enforce this.  The previously
   inserted PE(s) must carve first, followed shortly (skew) by the newly
   insterted PE.

Zzh> In addition, in the section 2 text, the “previously inserted PEs” seem to 
refer to the existing/partner PEs while the “newly inserted PE” seems to refer 
to the new/recovering PE that originated the new ES route.
Zzh> However, in the section 3 text, the “previously inserted PE2” is 
“recovering”, hence the same as the “newly inserted PE”.
Zzh> One more inconsistency: in the section 2 text, we talk about adding a 
negative skew. In section 3, we talk about minus a (positive) skew. We should 
be consistent about the skew itself.
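
To make the two points above concrete, here is a small Python sketch, offered only as an illustration. It models the section 3 behaviour of a partner PE (one that received the SCT) with the skew expressed as a positive magnitude, and shows that "add a skew of -10ms" and "SCT minus a skew of 10ms" yield the same instant. The function name, the transition labels, and the millisecond units are hypothetical.

```python
def partner_apply_time_ms(sct_ms, transition, skew_ms=10):
    """Return when a partner PE applies a per-VLAN role transition.

    skew_ms is a positive magnitude (assumed convention for this sketch).
    DF -> NDF is applied at SCT - skew; NDF -> DF is applied at SCT.
    """
    if skew_ms < 0:
        raise ValueError("skew_ms is a magnitude and must be non-negative")
    if transition == "DF_TO_NDF":
        return sct_ms - skew_ms
    if transition == "NDF_TO_DF":
        return sct_ms
    raise ValueError(f"unknown transition: {transition}")


sct_ms = 1_000_000
print(partner_apply_time_ms(sct_ms, "DF_TO_NDF"))  # gives up DF early
print(partner_apply_time_ms(sct_ms, "NDF_TO_DF"))  # takes over DF at SCT

# Section 2 wording (add a negative skew) vs. section 3 wording (minus a skew):
assert sct_ms + (-10) == partner_apply_time_ms(sct_ms, "DF_TO_NDF", skew_ms=10)
```

Whichever sign convention the draft settles on, stating it once and keying the skew to the transition direction (rather than to the PE as a whole) would cover the case where one PE is DF for some VLANs and NDF for others on the same ES.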
Zzh> Jeffrey

Regards,
Luc André

Luc André Burdet |  Cisco  |  
laburdet.i...@gmail.com  |  Tel: +1 613 254 4814




From: BESS <bess-boun...@ietf.org> on behalf of 
Jeffrey (Zhaohui) Zhang <zzhang=40juniper....@dmarc.ietf.org>
Date: Thursday, April 4, 2024 at 16:25
To: 'BESS' <bess@ietf.org>
Subject: [bess] DF election text in RFC7432/7432bis and 
draft-ietf-bess-evpn-fast-df-recovery
Hi,

I discussed this offline with a few people before. I want to bring it up here 
to make sure that consistent text is used in 7432bis and the relevant drafts.

https://datatracker.ietf.org/doc/html/draft-ietf-bess-rfc7432bis-08#name-designated-forwarder-electi
 says:

   1.  When a PE discovers the ESI of the attached Ethernet segment, it
       advertises an Ethernet Segment route with the associated
       ES-Import extended community.

   2.  The PE then starts a timer (default value = 3 seconds) to allow
       the reception of Ethernet Segment routes from other PE nodes
       connected to the same Ethernet segment.  This timer value should
       be the same across all PEs connected to the same Ethernet
       segment.

   3.  When the timer expires, each PE builds an ordered list of the IP
       addresses of all the PE nodes connected to the Ethernet segment
       (including itself), in increasing numeric value. ...

#2 says "the PE" (the new PE coming up on that ES) starts a timer. It does not 
mention if other PEs start a timer or not.
#3 says "when the timer expires, each PE ..."

Based on this existing text, #2 should be updated to "each PE then starts a 
timer". However, RFC8584's FSM makes it clear that existing PEs don't wait. 
Therefore, #3 should be updated. In addition, if it is only the new PE that 
starts the timer, then "This timer value should be the same across all PEs 
connected to the same Ethernet segment" in #2 is no longer needed.

I also wonder if in 
https://datatracker.ietf.org/doc/html/draft-ietf-bess-evpn-fast-df-recovery#name-updates-to-rfc8584
 we should transition from DF_DONE to DF_WAIT instead of DF_CALC. Of course, 
the existing/peering PE's wait time is different from the new PE's: the wait 
time is determined from the received absolute SCT. This way, we have 
consistent behavior for the new and existing PEs.

Thanks.
Jeffrey




_______________________________________________
BESS mailing list
BESS@ietf.org
https://www.ietf.org/mailman/listinfo/bess