[ https://issues.apache.org/jira/browse/NIFI-9574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17476646#comment-17476646 ]
Mark Payne commented on NIFI-9574:
----------------------------------
[~mahieddine] I'm not surprised that you're seeing issues with timeouts.
We see here that you have 6 cores with a core load average of nearly 18. In
simple terms, that means you're asking the CPU to do roughly 3x as much work as
it can handle.
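
As a rough illustration of that arithmetic, here is a minimal Python sketch.
The function names are hypothetical, and the 18 and 6 are simply the figures
quoted above; a ratio above 1.0 suggests more runnable work than there are
cores to run it.

```python
import os


def load_per_core(load_average: float, cores: int) -> float:
    """Return load average divided by core count; above 1.0 means oversubscribed."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return load_average / cores


def local_load_per_core() -> float:
    """Same ratio for the local host, using the 1-minute load average (Unix only)."""
    one_minute, _five, _fifteen = os.getloadavg()
    return load_per_core(one_minute, os.cpu_count() or 1)


# Figures from this comment: a load average of nearly 18 on 6 cores.
ratio = load_per_core(18.0, 6)
print(f"load per core: {ratio:.1f}")
```

Calling `local_load_per_core()` on each NiFi node would give the live figure
for that node, assuming a Unix-like host where `os.getloadavg()` is available.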
Looking at the garbage collection stats, we also see a massive amount of
collection activity, both Full GCs and smaller GCs. In fact, it looks like more
than 50% of your time is spent performing garbage collection.
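
One way to put a number on GC overhead outside the NiFi UI is to divide total
GC time by JVM uptime. The sketch below assumes output in the shape of
`jstat -gcutil <pid>`, where the `GCT` column is total GC time in seconds. The
sample row and the uptime value are invented for illustration and are not
taken from this cluster, and the helper names are hypothetical.

```python
def parse_jstat_gcutil(header_line: str, value_line: str) -> dict:
    """Map jstat column names to values; '-' (not applicable) becomes None."""
    names = header_line.split()
    values = value_line.split()
    if len(names) != len(values):
        raise ValueError("header and value rows have different column counts")
    return {n: (None if v == "-" else float(v)) for n, v in zip(names, values)}


def gc_overhead(gc_seconds: float, uptime_seconds: float) -> float:
    """Fraction of wall-clock time spent in garbage collection."""
    if uptime_seconds <= 0:
        raise ValueError("uptime_seconds must be positive")
    return gc_seconds / uptime_seconds


# Hypothetical sample in the JDK 8 column layout; not real data from this issue.
HEADER = "S0 S1 E O M CCS YGC YGCT FGC FGCT GCT"
SAMPLE = "0.00 97.10 63.20 99.80 94.10 89.70 41250 1480.500 620 1290.250 2770.750"

stats = parse_jstat_gcutil(HEADER, SAMPLE)
overhead = gc_overhead(stats["GCT"], uptime_seconds=5400.0)  # assumed 90 min uptime
print(f"GC overhead: {overhead:.0%}")
```

A sustained value anywhere near 50% means the JVM spends about as much time
collecting garbage as doing useful work.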
So you need to either update your flow to use less heap (by avoiding lots of
attributes or large attributes, by preferring fewer, larger flowfiles over
massive numbers of small ones, and/or by not loading content into attributes),
OR add significantly more heap.
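
To see why flowfile count and attribute size matter, here is a
back-of-the-envelope Python sketch. It is only a rough lower bound: the
function name and all input numbers are hypothetical, and it ignores JVM object
overhead and any swapping of queued flowfiles out of heap.

```python
MIB = 1024 * 1024


def attribute_heap_lower_bound(flowfiles: int,
                               attributes_per_flowfile: int,
                               avg_attribute_bytes: int) -> int:
    """Raw bytes of attribute data held in heap for the given number of flowfiles."""
    return flowfiles * attributes_per_flowfile * avg_attribute_bytes


# Hypothetical flow A: one million small flowfiles, 20 attributes of ~100 bytes each.
many_small = attribute_heap_lower_bound(1_000_000, 20, 100)

# Hypothetical flow B: the same data batched into one thousand larger flowfiles.
few_large = attribute_heap_lower_bound(1_000, 20, 100)

print(f"many small flowfiles: {many_small / MIB:.1f} MiB of attribute data")
print(f"few large flowfiles:  {few_large / MIB:.1f} MiB of attribute data")
```

Under these assumptions the attribute footprint scales linearly with the number
of flowfiles, which is why batching records into fewer, larger flowfiles
relieves heap pressure.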
So the long and short of it is that for this flow, you need more memory and
more CPUs. It's quite possible that you could tune your flow to be more
efficient with your resources, but with the flow as-is, you really just need
more resources.
> Failed to decrypt data from Peer
> ---------------------------------
>
> Key: NIFI-9574
> URL: https://issues.apache.org/jira/browse/NIFI-9574
> Project: Apache NiFi
> Issue Type: Bug
> Reporter: Mahieddine Cherif
> Priority: Major
> Attachments: Screenshot 2022-01-14 at 19.21.11.png, Screenshot
> 2022-01-14 at 19.21.16.png, Screenshot 2022-01-14 at 19.21.36.png, Screenshot
> 2022-01-14 at 19.21.46.png
>
>
> After a migration to 1.15.2, we see this error on our cluster almost all the
> time:
> {code:java}
> Failed to communicate with Peer
> nifi-0.nifi-headless.apache-nifi.svc.cluster.local:8443 when load balancing
> data for Connection with ID b76e7297-e8a0-3b2b-ba30-d338db411301 due to
> java.io.IOException: Failed to decrypt data from Peer
> nifi-0.nifi-headless.apache-nifi.svc.cluster.local:8443 because Peer
> unexpectedly closed connection
> {code}
> Files get stuck in the queue mentioned. It's not always the same queue; these
> are simple round-robin queues, with or without compression.
> When we recreate the cluster, it works for some time and then the error
> occurs again and again.
> Is there a particular reason?