[ https://issues.apache.org/jira/browse/NIFI-14095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Filip Maretić updated NIFI-14095:
---------------------------------
Description:
Setting the *KeepSourceFile* property to *true* can cause GetFile to ingest the
same files continuously, because the source file stays in the input directory
and is picked up again on every run. If the file is big (e.g. 20 GB), this can
fill the content repository (e.g. 400 GB) almost instantly. This renders the
NiFi node unusable and requires a cleanup. There is no reason for this to
happen: the flow should at least have enough time to process a chunk of such a
huge file before GetFile attempts to load the same file again.
A quick solution would be to add the following annotation to GetFile:
{code:java}
@DefaultSchedule(strategy = SchedulingStrategy.TIMER_DRIVEN, period = "1 min")
{code}
This annotation is already present on the ListFile processor, so why not add it
here as well? A user who really wants to set the run schedule to 0 seconds
should be aware of the consequences.
was:
Just setting the *KeepSourceFile* property to *true* can cause continuous
ingestion of files into NiFi. If the file is big (e.g. 20 GB) this can cause
the content repository (e.g. size of 400 GB) to be filled in an instant. This
renders the NiFi node unusable and a cleanup is needed. There is no reason for
this to happen, the flow should at least have enough time to process a chunk of
such a huge file before attempting to load the same file again.
A quick solution would be just to add
{code:java}
@DefaultSchedule(strategy = SchedulingStrategy.TIMER_DRIVEN, period = "1 min")
{code}
This is anyone present on the ListFile processor, so why not to add it here
also?
> GetFile - "KeepSourceFile" set to true can fill up content repository
> ---------------------------------------------------------------------
>
> Key: NIFI-14095
> URL: https://issues.apache.org/jira/browse/NIFI-14095
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Configuration
> Affects Versions: 2.0.0, 1.28.1
> Reporter: Filip Maretić
> Priority: Major
> Labels: GetFile, ListFile
> Fix For: 2.1.0