IMO we should start working on NiFi 2.0 going forward, and this sounds like a good opportunity to make such changes in our components.
On Tue, Oct 25, 2022 at 19:33, Mike Thomsen <[email protected]> wrote:

> The hash-based deduplication strategy used the built-in "md5"
> attribute to offload the work to the database. That functionality was
> deprecated and AFAICT gone as of Mongo 5:
>
> https://www.mongodb.com/docs/manual/core/gridfs/#files.md5
>
> I am proposing two changes:
>
> * Remove deduplication
> * Create a MongoDB DistributedMapCache client that can query on the
>   file metadata since GridFS stores metadata separately from chunks
>   making lookups that way cheap and flexible.
>
> I could easily add that to this PR which already covers Testcontainers
> integration, making it super easy to test the changed behavior:
>
> https://github.com/apache/nifi/pull/6460
>
> Thoughts?
