Github user ijokarumawak commented on a diff in the pull request:
https://github.com/apache/nifi/pull/2435#discussion_r164017503
--- Diff:
nifi-nar-bundles/nifi-atlas-bundle/nifi-atlas-reporting-task/src/main/java/org/apache/nifi/atlas/NiFiAtlasHook.java
---
@@ -255,7 +255,11 @@ public void commitMessages() {
}
return new Tuple<>(refQualifiedName,
typedQualifiedNameToRef.get(toTypedQualifiedName(typeName, refQualifiedName)));
}).filter(Objects::nonNull).filter(tuple -> tuple.getValue() != null)
- .collect(Collectors.toMap(Tuple::getKey, Tuple::getValue));
+ // If duplication happens, use new value.
+ .collect(Collectors.toMap(Tuple::getKey, Tuple::getValue, (oldValue, newValue) -> {
+ logger.warn("Duplicated qualified name was found, use the new one. oldValue={}, newValue={}", new Object[]{oldValue, newValue});
+ return newValue;
+ }));
--- End diff --
While I was testing, I got the following exception:
```
2018-01-25 05:06:41,430 ERROR [Timer-Driven Process Thread-1]
o.a.n.a.reporting.ReportLineageToAtlas
ReportLineageToAtlas[id=057986ae-0161-1000-d0b0-1b890a17f5aa] Error running
task ReportLineageToAtlas[id=057986ae-0161-1000-d0b0-1b890a17f5aa] due to
java.lang.IllegalStateException: Duplicate key {Id='(type: fs_path, id:
69be7a40-4ff8-4c4e-b714-2d394c14398d)', traits=[], values={}} NiFiAtlasHook.258
```
The exception means that an existing nifi_flow_path entity has more than one
entry in its inputs or outputs attribute pointing to the same entity, i.e. to
entities with an identical qualified name. This happened because I was using an
old test environment whose data was created before the Atlas integration
implemented de-duplication logic. However, it would be safer to handle such
duplication in case it occurs for some other reason.
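
For reference, here is a minimal standalone sketch of the behaviour behind this change. It is not the NiFi code: the class name `ToMapMergeSketch`, the sample qualified names and the `ref-N` values are hypothetical, and `Map.Entry` stands in for NiFi's `Tuple`. It only illustrates that the two-argument `Collectors.toMap` throws `IllegalStateException` on a duplicate key, while the three-argument form resolves the collision with a merge function.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ToMapMergeSketch {

    // Hypothetical stand-in for the (qualifiedName, reference) tuples
    // built in commitMessages(); the first and last entries share a key.
    static List<Map.Entry<String, String>> refs() {
        return Arrays.asList(
                new SimpleEntry<String, String>("/tmp/a@cluster", "ref-1"),
                new SimpleEntry<String, String>("/tmp/b@cluster", "ref-2"),
                new SimpleEntry<String, String>("/tmp/a@cluster", "ref-3"));
    }

    // Two-argument toMap: throws IllegalStateException on the duplicate key.
    static Map<String, String> strict() {
        return refs().stream()
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
    }

    // Three-argument toMap: the merge function keeps the newer value.
    static Map<String, String> lenient() {
        return refs().stream()
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue,
                        (oldValue, newValue) -> newValue));
    }

    public static void main(String[] args) {
        try {
            strict();
        } catch (IllegalStateException e) {
            System.out.println("strict failed: " + e.getMessage());
        }
        System.out.println("lenient: " + lenient());
    }
}
```

Keeping the newer value is a judgment call; keeping the old one would avoid the exception just as well. Logging at warn level, as the diff does, at least makes the duplicate visible.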
---