Dear Developers

We have a situation where we see corrupted files after using PutSFTP and
FetchSFTP in NiFi 1.13.2 with openjdk version "1.8.0_292", OpenJDK Runtime
Environment (build 1.8.0_292-8u292-b10-0ubuntu1~20.04-b10), OpenJDK 64-Bit
Server VM (build 25.292-b10, mixed mode), running on Ubuntu Server 20.04.

We have a flow between 2 separate systems where we use PutSFTP to export
data from one NiFi instance to a data diode and use FetchSFTP to pick up the
data on the other end. To be sure the data is not corrupted we calculate a
SHA256 on each side and transfer the flowfile metadata in a separate file.
In rare cases we have seen that the SHA256 does not match on both sides, and
we are investigating where the errors happen. We have seen 2 errors. Manually
calculating a SHA256 on both sides of the diode shows that the file is OK
there, so we have found that the errors happen between NiFi and the SFTP
servers. It can happen on both sides.
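
The manual check described above can be sketched as follows. This is only a
minimal illustration, assuming both copies of the file are available on a
local disk; the function names and any paths passed to them are hypothetical
and are not part of NiFi.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of the file at `path`.

    The file is read in chunks so that a 100MB (or larger) file does not
    have to be loaded into memory at once.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def digests_match(path_a, path_b):
    """Return True if both files have the same SHA-256 digest."""
    return sha256_of_file(path_a) == sha256_of_file(path_b)
```

Running digests_match() on the file as written by PutSFTP and on the file as
read back by FetchSFTP gives the same yes/no answer as the comparison in the
flow, but independently of NiFi.
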
So for testing I created this little flow:
GenerateFlowFile (size 100MB) (Run once) ->
CryptographicHashContent (SHA256) ->
UpdateAttribute ( hash.root = ${content_SHA-256} , iteration=1) ->
PutSFTP ->
FetchSFTP ->
CryptographicHashContent (SHA256) ->
RouteOnAttribute (compare hash.root vs. content_SHA-256)
    If unmatched ->
        Goes to a disabled processor, so the corrupted file is held in the
        queue
    If matched ->
        UpdateAttribute ( iteration = ${iteration:plus(1)} ) -> loops back
        to PutSFTP

After 8992 iterations the file was corrupted. To test whether the errors are
in the calculation of the SHA256, I have a copy of the flow without the
PutSFTP/FetchSFTP processors, which has not produced any errors yet.
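
Once a corrupted file has been caught, it may help to know where it differs
from the original. The sketch below is one possible way to find that,
assuming the original and the fetched content have both been saved as local
files; the function name is hypothetical and this is not something NiFi
provides.

```python
def first_difference(path_a, path_b, chunk_size=1024 * 1024):
    """Return the byte offset of the first difference between two files.

    Returns None if the files are identical. If one file is a prefix of the
    other, the length of the shorter file is returned.
    """
    offset = 0
    with open(path_a, "rb") as file_a, open(path_b, "rb") as file_b:
        while True:
            chunk_a = file_a.read(chunk_size)
            chunk_b = file_b.read(chunk_size)
            if chunk_a != chunk_b:
                # Locate the exact differing byte inside this chunk.
                for index, (byte_a, byte_b) in enumerate(zip(chunk_a, chunk_b)):
                    if byte_a != byte_b:
                        return offset + index
                # Common part is equal, so one file ended early.
                return offset + min(len(chunk_a), len(chunk_b))
            if not chunk_a:
                # Both files ended at the same point without a difference.
                return None
            offset += len(chunk_a)
```

The offset shows whether the damage is at the start, at the end (for example
a truncated file) or somewhere in the middle of the content.
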

It is very rare that we see these errors; millions of files go through
without any issues, but sometimes it happens, which is not good.

Can anyone please help? Maybe by setting up the same test to see whether you
also get a corrupted file after some days.

Kind regards
Jens M. Kofoed
