ChrisSamo632 commented on PR #9706:
URL: https://github.com/apache/nifi/pull/9706#issuecomment-2686119583

   > I'm concerned about the potential memory impact on building a `Map` of 
error records, as opposed to writing them directly. I haven't traced the 
details to consider the issues with going in that direction, did you consider 
an approach that avoids keeping error Records in memory?
   
   Fair question, and it's probably possible, provided the additional Reader 
for the Input FlowFile is handled carefully.
   
   I'd opted to separate such handling into a try-with-resources method in 
order to ensure we didn't accidentally leave any file handles open. The 
trade-off, which I accepted, is that the Records are duplicated in memory: 
they're already stored in the operation list that's sent to Elasticsearch and 
iterated through to check for errors. This means double the memory use 
(temporarily, for each batch of data being processed), but greater confidence 
around the FlowFile reader handling.
   
   I can certainly look again at reading each Record from the Input FlowFile 
individually and either using it immediately or throwing it away as we loop 
through the responses checking for errors, while being careful to ensure all 
Readers are closed at the end.
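
   As a rough illustration of that streaming alternative, here is a minimal sketch. It is not the PR's code and does not use the NiFi `RecordReader` API: the class `StreamingErrorRecords`, the method `writeErrorRecords`, and the "one record per line" input are all hypothetical stand-ins, and the set of error indices is assumed to have been collected from the Elasticsearch bulk response beforehand. Each record is either written out immediately or discarded, and try-with-resources closes the reader even if writing fails.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Set;

public class StreamingErrorRecords {

    /**
     * Re-reads the input one record at a time and writes only the records whose
     * position matches an error index. No collection of error records is built.
     * Assumption: a "record" is one line of text; a real processor would use a
     * record reader over the FlowFile content instead.
     */
    static int writeErrorRecords(String input, Set<Integer> errorIndices, Writer errorOut)
            throws IOException {
        int written = 0;
        // try-with-resources guarantees the reader is closed on every exit path
        try (BufferedReader reader = new BufferedReader(new StringReader(input))) {
            String record;
            int index = 0;
            while ((record = reader.readLine()) != null) {
                if (errorIndices.contains(index)) {
                    // use the record immediately...
                    errorOut.write(record);
                    errorOut.write('\n');
                    written++;
                }
                // ...or let it go out of scope before reading the next one
                index++;
            }
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        StringWriter errors = new StringWriter();
        // Hypothetical batch of four records where items 1 and 3 failed
        int count = writeErrorRecords("r0\nr1\nr2\nr3\n", Set.of(1, 3), errors);
        System.out.println(count + " error record(s) written:");
        System.out.print(errors);
    }
}
```

   The trade-off in this sketch is a second pass over the input in exchange for holding at most one record at a time, rather than a second in-memory copy of every failed record.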


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
