Mark,

I am not positive it makes sense for this to be wired into the framework directly. The parallel construct I think of is sort of how Operators work in the Kubernetes world. I could see something like that having the smarts to codify/automate this logic. I could also see this logic living in a CI/CD process like a Jenkins job. All of the fundamental APIs to 'do this' exist already; it is just the 'thing with the smarts to tell NiFi how to behave' that needs to occur.
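As a rough illustration of that external 'thing with the smarts', here is a minimal Python sketch of the drain-then-update logic a CI/CD job might run. Everything named in it is hypothetical: `get_queued_count`, `stop_upstream`, and `apply_version_change` are stand-in callables that a real job would have to implement against NiFi's existing APIs, and the timeout values are placeholders, not recommendations. It uses one shared deadline for all affected connections, so the worst-case wait does not grow with the number of connections.

```python
import time
from typing import Callable, Iterable


def drain_then_update(
    connection_ids: Iterable[str],
    get_queued_count: Callable[[str], int],    # hypothetical: flowfiles queued in a connection
    stop_upstream: Callable[[], None],         # hypothetical: stop components feeding those connections
    apply_version_change: Callable[[], None],  # hypothetical: trigger the version change
    timeout_s: float = 60.0,                   # placeholder value
    poll_interval_s: float = 1.0,              # placeholder value
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Return True if the version change was applied, False if the queues never drained."""
    pending = set(connection_ids)
    stop_upstream()
    deadline = clock() + timeout_s  # one deadline shared by every connection
    while True:
        # Keep only the connections that still hold queued flowfiles.
        pending = {c for c in pending if get_queued_count(c) > 0}
        if not pending:
            apply_version_change()
            return True
        if clock() >= deadline:
            # No guarantee the queues ever empty, so give up without changing anything.
            return False
        sleep(poll_interval_s)
```

On a `False` return the job leaves the flow untouched, which matches the current behavior of refusing the update rather than losing data.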
Anyway - just my off the cuff thoughts.

Joe

On Thu, Aug 19, 2021 at 11:35 AM Mark Bean <[email protected]> wrote:
>
> Thanks for the feedback, Joe. For my scenario, I'd like to focus on
> app/framework solutions. It's not too difficult to poke holes in nearly all
> options though. My original suggestion matches your toothpaste analogy, and
> seems to me to be the best area to focus. The other options are not
> desirable.
>
> Auto delete state results in data loss. Not acceptable in many cases. The
> move option is interesting, but also becomes complicated. For example,
> suppose the version change removed a connection, and also removed the
> downstream processor. Now, if the queued data was moved - without going
> through that downstream processor - it may have become invalid; not much
> different than deleting the data.
>
> So, back to the toothpaste.. does it make sense to do this in an automated
> way? It would require a delay in version change processing if we let the
> framework do the squeezing. And, even then, there are no guarantees; the
> queue(s) might still not become empty. It would only increase the chance of
> emptying. Is it possible to agree on what an "acceptable" delay is? Is
> there support for this type of approach to be built into the framework?
>
> Thanks,
> Mark
>
> On Thu, Aug 19, 2021 at 2:13 PM Joe Witt <[email protected]> wrote:
>
> > Mark,
> >
> > Ultimately we have a versioned state machine in which we want to
> > change the definition of the state machine while there is state in the
> > machine.
> >
> > Here the case is that we have state which no longer has a home in the
> > new version of the machine.
> >
> > We detect this case as you note and prevent the user from performing
> > this action until they solve that fundamental gap which we could not
> > possibly know the need for automatically.
> >
> > Solutions are in the camp of
> > 1. Things a user/process could do.
> >   - Manual stuff. Stop components to let the state bleed out. Then do
> >     version change. OR delete the state. Then do version change.
> >   - Make sure flow version changes use existing connections
> >     meaningfully where possible. Not perfect but helpful. Gaps remain.
> >
> > 2. Things we can do from an app/framework point of view.
> >   - Stop flows to sort of squeeze the toothpaste out. Timing not
> >     reliable here but would probably work in most cases reasonably fast.
> >   - Give an option to auto delete state which no longer has a home.
> >   - Give an option to 'move' state (flowfiles) from a now orphaned
> >     connection to a now existing connection in the new version.
> >
> > This is the blessing and curse of operating a durable state machine
> > and supporting version control changes as if the flow definition
> > exists at a point in time independent of data/state flowing through
> > it.
> >
> > Joe
> >
> > On Thu, Aug 19, 2021 at 11:05 AM Mark Bean <[email protected]> wrote:
> > >
> > > Scenario:
> > > Using NiFi Registry to version control changes to the graph. A development
> > > system is used to make all changes. Once the changes are "vetted", the
> > > production system will pull down the latest version to get the changes.
> > > Further, the goal is to automate the version update, and not require an
> > > operator to manually perform the version update through the UI.
> > >
> > > Now, consider the case where the new version removes a connection. When the
> > > production system attempts to apply the new version, it will fail if there
> > > are flowfiles in the queue of that removed connection. That's good in that
> > > it prevents data loss. However, it prevents the versioned process group
> > > from updating.
> > >
> > > Are there suggested solutions to this - besides the obvious of manually
> > > stopping the upstream flow and waiting for queue(s) to empty?
> > >
> > > One option we discussed was to allow components affected by the version
> > > change to be stopped "smartly". They would be ordered such that upstream
> > > components are stopped first, then downstream, and lastly controller
> > > services. (Is that being done currently?) Additionally, if a processor has
> > > flowfiles in an upstream queue (that is being removed by the version
> > > change), it would delay stopping the processor by some period of time thus
> > > giving the opportunity for that queue to empty. Granted, this could become
> > > problematic if there are many components that fall into this category.
> > > Also, what is an appropriate "period of time"? For example, a delay of even
> > > 2 seconds could result in the overall process taking over a minute if there
> > > are 30+ such connections.
> > >
> > > Comments are welcome.
> > >
> > > Thanks,
> > > Mark
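The "smart" stop ordering proposed in the original message (upstream components first, then downstream) amounts to a topological sort of the affected components. The following Python sketch shows one way to compute such an order. It is an illustration only, not a description of what NiFi does today: `stop_order` and its `(source, destination)` input format are hypothetical, controller services are not modeled, self-loops are ignored, and any other loop in the flow raises `graphlib.CycleError`, so a real flow with cycles would need extra handling.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Iterable, List, Tuple


def stop_order(connections: Iterable[Tuple[str, str]]) -> List[str]:
    """Order components so each one appears after everything upstream of it.

    `connections` holds (source, destination) component-name pairs.
    """
    sorter = TopologicalSorter()
    for source, destination in connections:
        if source == destination:
            sorter.add(source)  # keep the component, ignore the self-loop
        else:
            sorter.add(destination, source)  # destination comes after source
    return list(sorter.static_order())
```

Stopping the components in this order means no component is stopped while something upstream of it is still producing flowfiles. On the timing concern: a fixed 2-second delay applied one connection at a time costs at least 30 x 2 = 60 seconds for 30 connections, whereas waiting on all affected queues against a single overall deadline would bound the total delay regardless of how many connections are involved.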
