Brandon DeVries created NIFI-3566:
-------------------------------------

             Summary: Node fails to pull flow.xml from NCM, purges content repo
                 Key: NIFI-3566
                 URL: https://issues.apache.org/jira/browse/NIFI-3566
             Project: Apache NiFi
          Issue Type: Bug
            Reporter: Brandon DeVries
            Priority: Minor
             Fix For: 0.7.1


We have an instance where a node was removed from a cluster to address a 
production data flow issue.  During this process, changes were made such that 
its flow.xml differed from the cluster's (different run states).  The 
general procedure we follow in this case is to remove the Node's flow.xml and 
let it pull the "correct" / consistent one from the NCM.  However, in this 
case, something prevented the NCM's flow.xml from propagating to the Node.  The 
Node ended up with an empty flow.xml... and then proceeded to purge all of the 
content repo with the warning "{} maps to unknown FlowFile Queue {}; this 
record will be discarded" [1].

In cases like this, we should see if we can be a bit more friendly.  
Specifically, in our case, it would have been preferable to shut down rather 
than delete the content repo.  It would seem to me that if an admin 
intentionally removes the flow.xml, it would not be unreasonable to make it  
their responsibility to also remove the content repo (and possibly others...).  
But cases in which a network hiccup can cause 100% data loss on a node seem bad.
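
To make the suggestion concrete, here is a rough sketch of the kind of startup 
check I have in mind.  It is not NiFi code: the class, record type, exception, 
and method names are all hypothetical, and the policy (abort when every 
recovered record points at a queue the loaded flow does not know) is only one 
assumption about what "more friendly" could mean.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

/** Hypothetical startup guard; none of these types exist in NiFi. */
final class OrphanedRecordGuard {

    /** Minimal stand-in for a record recovered from the FlowFile repository. */
    static final class RecoveredRecord {
        final String flowFileId;
        final String queueId;

        RecoveredRecord(String flowFileId, String queueId) {
            this.flowFileId = flowFileId;
            this.queueId = queueId;
        }
    }

    /** Signals that startup should stop so an admin can intervene. */
    static final class OrphanedRecordsException extends RuntimeException {
        OrphanedRecordsException(String message) {
            super(message);
        }
    }

    /** Counts recovered records whose queue is absent from the loaded flow. */
    static long countOrphans(List<RecoveredRecord> records, Set<String> knownQueueIds) {
        return records.stream()
                .filter(r -> !knownQueueIds.contains(r.queueId))
                .count();
    }

    /**
     * Assumed policy: if the repository holds records and every one of them maps
     * to an unknown queue, the flow is probably missing or empty, so abort
     * instead of discarding the data.
     */
    static void verifyBeforeRecovery(List<RecoveredRecord> records, Set<String> knownQueueIds) {
        long orphans = countOrphans(records, knownQueueIds);
        if (!records.isEmpty() && orphans == records.size()) {
            throw new OrphanedRecordsException(
                    orphans + " recovered records map to unknown FlowFile Queues");
        }
    }

    public static void main(String[] args) {
        List<RecoveredRecord> records = Arrays.asList(
                new RecoveredRecord("ff-1", "queue-a"),
                new RecoveredRecord("ff-2", "queue-b"));
        try {
            // An empty flow.xml means no known queues at all.
            verifyBeforeRecovery(records, Collections.emptySet());
        } catch (OrphanedRecordsException e) {
            System.err.println("Refusing to start: " + e.getMessage());
        }
    }
}
```

With a check along these lines, a flow that is merely missing a few queues 
would still start and discard only the affected records, while an empty flow 
paired with a populated repository would stop before anything is deleted.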

[1] 
https://github.com/apache/nifi/blob/rel/nifi-0.7.1/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/main/java/org/apache/nifi/controller/repository/WriteAheadFlowFileRepository.java#L717



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
