ChrisSamo632 commented on PR #9706: URL: https://github.com/apache/nifi/pull/9706#issuecomment-2686119583
> I'm concerned about the potential memory impact on building a `Map` of error records, as opposed to writing them directly. I haven't traced the details to consider the issues with going in that direction, did you consider an approach that avoids keeping error Records in memory?

Fair question, and it's probably possible, provided the additional Reader for the Input FlowFile is handled carefully. I'd opted to separate that handling into a try-with-resources method to ensure we don't accidentally leave any file handles open.

I accept, however, that this duplicates the Records in memory: they're already stored in the operation list that's sent to Elasticsearch and iterated through to check for Errors. That means double the memory use (temporarily, for each batch of data being processed), in exchange for greater confidence around the FlowFile reader handling.

I can certainly look again at reading each Record from the Input FlowFile individually and either using it immediately or discarding it as we loop through the responses checking for errors, while being careful to ensure all Readers are closed at the end.
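
For illustration only, here is a rough sketch of the streaming alternative mentioned above. It is not the PR's code and uses no NiFi API: records are modeled as plain lines of text, the failed positions as a `Set<Integer>` of zero-based indices, and `StreamingErrorRouter`, `routeErrors` and `errorSink` are hypothetical names. The point is only that each record is either handed on immediately or dropped, and that try-with-resources closes the reader on every path.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical helper, not part of NiFi: a "record" is modeled as one line of text.
final class StreamingErrorRouter {

    /**
     * Reads records one at a time and passes only those whose zero-based
     * index appears in errorIndices to errorSink. At most one record is
     * held in memory at a time. Returns the number of records routed.
     */
    static int routeErrors(Reader input, Set<Integer> errorIndices, Consumer<String> errorSink) {
        int routed = 0;
        // try-with-resources closes the reader (and the wrapped input)
        // on normal completion and on exceptions alike.
        try (BufferedReader reader = new BufferedReader(input)) {
            String record;
            int index = 0;
            while ((record = reader.readLine()) != null) {
                if (errorIndices.contains(index)) {
                    errorSink.accept(record); // use immediately
                    routed++;
                }
                // otherwise the record is simply discarded
                index++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return routed;
    }
}
```

Calling `StreamingErrorRouter.routeErrors(new StringReader("r0\nr1\nr2\n"), Set.of(1), System.out::println)` would hand only the second record to the sink. The trade-off is the one already noted: lower peak memory, at the cost of re-reading the input and having to keep the reader's lifecycle correct.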
