[
https://issues.apache.org/jira/browse/NIP-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17950249#comment-17950249
]
Mark Payne commented on NIP-6:
------------------------------
This vote has passed:
[https://lists.apache.org/thread/vw81wshgt7ft0pcynmlvz26bxlzos934]
> Allow Processors to understand state of incoming connections' backpressure
> --------------------------------------------------------------------------
>
> Key: NIP-6
> URL: https://issues.apache.org/jira/browse/NIP-6
> Project: NiFi Improvement Proposal
> Issue Type: New Feature
> Reporter: Mark Payne
> Priority: Major
>
> *Motivation and description*
> There are cases when we need to know whether or not backpressure is applied
> on incoming connections. For example, in MergeContent, when a Correlation
> Attribute is used, it is very easy to have a situation in which the Min
> Number of Entries or Min Bin Size is not met, but in which backpressure is
> applied upstream. In this case, we have to just wait for the Max Bin Age to
> elapse in order for MergeContent to progress. This introduces significant
> artificial delays into our dataflow. And while Max Bin Age is always
> recommended, it's not required, so the flow could potentially block
> indefinitely it not properly configured.
> Instead, it would make sense for MergeContent to determine "all upstream
> connections have backpressure applied" and as a result immediately merge the
> "most full bin" in order to make progress. This information is not currently
> available, though. We need to surface this information via the
> {{{}ProcessContext{}}}.
>
> *Scope*
> Implementing this will require that the following method be added to
> {{{}ProcessContext{}}}:
> {code:java}
> BackpressureEngagement getBackpressureEngagement(); {code}
> where {{BackpressureEngagement}} is defined as:
> {code:java}
> public enum BackpressureEngagement {
> // No upstream connections have backpressure engaged
> BACKPRESSURE_NOT_ENGAGED,
> // At least one upstream connection has backpressure engaged but not all
> BACKPRESSURE_PARTIALLY_ENGAGED,
> // All upstream connections have backpressure engaged
> BACKPRESSURE_ENGAGED;
> } {code}
> This, naturally, means that the the method must also be implemented in the
> {{StandardProcessContext}} as well as {{ConnectableProcessContext}} and
> {{{}MockProcessContext{}}}.
> In order to allow proper testing of components, we must also update the
> {{TestRunner}} interface and {{StandardProcessorTestRunner}} implementation
> to accommodate this with a setter method:
> {code:java}
> public void setBackpressureEngagement(BackpressureEngagement engagement);
> {code}
> The default value in {{StandardProcessorTestRunner}} should be
> {{{}BackpressureEngagement.BACKPRESSURE_NOT_ENGAGED{}}}.
>
> *Alternatives Considered*
> As outlined in NIFI-14366, in addition to the prescribed values for
> {{BackpressureEngagement}} the following alternatives were considered:
> {code:java}
> - SINGLE
> - IN_A_RELATIONSHIP
> - ITS_COMPLICATED
> - ENGAGED {code}
> Ultimately, I think we should avoid this nomenclature for the following
> reasons:
> * May lead to confusion. I fear we would also need to introduce a value of
> "GHOSTED" which would be confusing because it means something very different
> from when a Processor or Controller Service is Ghosted.
> * Backpressure is tied to a Connection, and Connections are already often
> confused with Processor Relationships. This may well further blur that line
> and lead to more confusion.
> * I'm concerned about the complexity of implementing `ITS_COMPLICATED`. Once
> we start factoring in the emotional state of each connection, well... it gets
> complicated.
> * I'd be concerned about "API sprawl" - for instance, will someone then want
> to introduce a `breakup()` method to set the status to SINGLE? The error
> handling there could be a nightmare.
> It was also suggested that we name the method {{amITheProblem}} with
> potential return values of {{YES_YOU_ARE_SLOW}} or {{{}NO_ITS_ME{}}}.
> My main concern here is that the value can potentially change rather quickly.
> Imagine the logs, indicating why backpressure is engaged:
> {code:java}
> Backpressure engaged? YES_YOU_ARE_SLOW
> Backpressure engaged? NO_ITS_ME
> Backpressure engaged? YES_YOU_ARE_SLOW
> Backpressure engaged? NO_ITS_ME
> Backpressure engaged? YES_YOU_ARE_SLOW
> Backpressure engaged? NO_ITS_ME
> Backpressure engaged? YES_YOU_ARE_SLOW
> Backpressure engaged? NO_ITS_ME{code}
> We don't need people thinking of our api is trapped in a loop of self-doubt
> and finger-pointing.
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)