Brandon DeVries created NIFI-3566:
-------------------------------------
Summary: Node fails to pull flow.xml from NCM, purges content repo
Key: NIFI-3566
URL: https://issues.apache.org/jira/browse/NIFI-3566
Project: Apache NiFi
Issue Type: Bug
Reporter: Brandon DeVries
Priority: Minor
Fix For: 0.7.1
We have an instance where a node was removed from a cluster to address a
production data flow issue. During this process, changes were made such that
its flow.xml was different from the cluster's (different run states). The
general procedure we follow in this case is to remove the Node's flow.xml and
let it pull the "correct" / consistent one from the NCM. However, in this
case, something prevented the NCM's flow.xml from propagating to the Node. The
Node ended up with an empty flow.xml... and then proceeded to purge all of the
content repo with the warning "{} maps to unknown FlowFile Queue {}; this
record will be discarded" [1].
In cases like this, we should see if we can be a bit more friendly.
Specifically, in our case, it would have been preferable to shut down rather
than delete the content repo. It would seem to me that if an admin
intentionally removes the flow.xml, it would not be unreasonable to make it
their responsibility to also remove the content repo (and possibly others...).
But cases in which a network hiccup can cause 100% data loss on a node seem bad.
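To make the "shut down rather than delete" preference concrete, here is a
minimal sketch of the kind of check that could run before any records are
discarded. It is not NiFi code: the class RecoveryGuard, the method
shouldAbortStartup, and the idea of passing queue identifiers as plain strings
are all hypothetical, chosen only to illustrate the policy.

```java
import java.util.Collection;
import java.util.Set;

// Hypothetical illustration only; none of these names exist in NiFi.
final class RecoveryGuard {

    private RecoveryGuard() {
    }

    /**
     * Returns true when startup should stop instead of discarding records.
     *
     * Assumed policy: if the repository holds records but none of them maps
     * to a queue in the loaded flow, the flow is more likely missing or empty
     * than the data is stale, so refuse to purge.
     *
     * @param knownQueueIds  queue identifiers defined by the loaded flow
     * @param recordQueueIds queue identifier referenced by each recovered record
     */
    static boolean shouldAbortStartup(Set<String> knownQueueIds,
                                      Collection<String> recordQueueIds) {
        if (recordQueueIds.isEmpty()) {
            return false; // nothing to lose, so nothing to protect
        }
        long orphaned = recordQueueIds.stream()
                .filter(id -> !knownQueueIds.contains(id))
                .count();
        // Abort only when every record is orphaned; a partial mismatch is
        // left to the existing "discard unknown queue" behaviour.
        return orphaned == recordQueueIds.size();
    }
}
```

With a check like this, an empty flow.xml combined with a populated repository
would stop the node, while a flow that merely dropped one connection would
still start and discard only the records for that connection.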
[1]
https://github.com/apache/nifi/blob/rel/nifi-0.7.1/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/main/java/org/apache/nifi/controller/repository/WriteAheadFlowFileRepository.java#L717
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)