Koji Kawamura created NIFI-4752:
-----------------------------------
Summary: REPLAY events returned by WriteAheadProvenanceRepository
have child FlowFile UUID as event FlowFile UUID
Key: NIFI-4752
URL: https://issues.apache.org/jira/browse/NIFI-4752
Project: Apache NiFi
Issue Type: Bug
Components: Core Framework
Affects Versions: 1.4.0
Reporter: Koji Kawamura
The ['Provenance Events'
documentation|https://nifi.apache.org/docs/nifi-docs/html/developer-guide.html#provenance_events]
describes the REPLAY event as follows:
{quote}
Indicates a provenance event for replaying a FlowFile. The UUID of the event
indicates the UUID of the original FlowFile that is being replayed. The event
contains one Parent UUID that is also the UUID of the FlowFile that is being
replayed and one Child UUID that is the UUID of the a newly created FlowFile
that will be re-queued for processing
{quote}
The default PersistentProvenanceRepository behaves as documented, but
WriteAheadProvenanceRepository returns REPLAY events whose FlowFile UUID is the
Child UUID instead.
Here are the lines of code that set the FlowFile UUID for provenance events:
https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-provenance-repository-bundle/nifi-persistent-provenance-repository/src/main/java/org/apache/nifi/provenance/schema/LookupTableEventRecord.java#L276-L280
WriteAheadProvenanceRepository does not seem to persist the 'FlowFile UUID'
value that FlowController sets when replay events are registered.
Instead, WriteAheadProvenanceRepository fills 'FlowFile UUID' from the updated or
previous 'UUID' attribute.
I don't know the background on why it is implemented this way, but it seems to
drop 'FlowFile UUID' to reduce IO, on the assumption that the value can be
derived from attributes.
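
To illustrate the fallback described above, here is a minimal Java sketch. It is not the actual NiFi code: the class name, method name, and the "uuid" attribute key are assumptions made for this example. It only shows why deriving the FlowFile UUID from attributes would yield the child UUID for a REPLAY event, assuming the attributes attached to the event belong to the newly created FlowFile.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class ReplayUuidSketch {
    // Assumed attribute key holding the FlowFile UUID (hypothetical constant for this sketch).
    static final String UUID_KEY = "uuid";

    // Derives the FlowFile UUID from attributes only:
    // the updated value is preferred, the previous value is the fallback.
    public static String resolveFlowFileUuid(Map<String, String> updated,
                                             Map<String, String> previous) {
        String uuid = updated == null ? null : updated.get(UUID_KEY);
        if (uuid == null) {
            uuid = previous == null ? null : previous.get(UUID_KEY);
        }
        return uuid;
    }

    public static void main(String[] args) {
        // A REPLAY event registered for the parent FlowFile "parent-uuid".
        // Assumption: the event carries the attributes of the new child FlowFile.
        Map<String, String> previous = Collections.emptyMap();
        Map<String, String> updated = new HashMap<>();
        updated.put(UUID_KEY, "child-uuid");

        // The explicitly set FlowFile UUID ("parent-uuid") is never consulted here,
        // so the child UUID is returned.
        System.out.println(resolveFlowFileUuid(updated, previous)); // prints "child-uuid"
    }
}
```

If this is what happens, persisting the explicitly set FlowFile UUID, or special-casing REPLAY events when the value is restored, would make the result match the documentation.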