propagate to the output to understand that processing has finished.

Again, thanks everyone for your help!
- Sergii

On Mon, May 18, 2020 at 8:45 AM Thomas Huang wrote:
Hi,

Actually, it seems like Spark dynamic allocation saves more resources in that case.
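For reference, a minimal sketch of the dynamic-allocation settings this alludes to. The property names are standard Spark configs; the values are illustrative, and `shuffleTracking` assumes Spark 3.0+ (otherwise an external shuffle service is required):

```java
import org.apache.spark.SparkConf;
import org.apache.spark.sql.SparkSession;

public class DynamicAllocationExample {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
                .setAppName("dynamic-allocation-example")
                // Let Spark grow and shrink the executor pool with load, so an
                // idle job releases resources instead of holding them.
                .set("spark.dynamicAllocation.enabled", "true")
                .set("spark.dynamicAllocation.minExecutors", "1")
                .set("spark.dynamicAllocation.maxExecutors", "10")
                // Executors idle longer than this are released.
                .set("spark.dynamicAllocation.executorIdleTimeout", "60s")
                // Needed on Spark 3.0+ unless an external shuffle service is set up.
                .set("spark.dynamicAllocation.shuffleTracking.enabled", "true");

        SparkSession spark = SparkSession.builder().config(conf).getOrCreate();
        // ... job logic ...
        spark.stop();
    }
}
```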
From: Arvid Heise
Sent: Monday, May 18, 2020 11:15:09 PM
To: Congxian Qiu
Cc: Sergii Mikhtoniuk; user <user@flink.apache.org>
Subject: Re: Process available data and stop with savepoint
Hi Sergii,

your requirements feel a bit odd. It's neither batch nor streaming.

Could you tell us why it's not possible to let the job run continuously as a streaming job? Is it just a matter of saving costs? If so, you could monitor the number of records being processed and trigger stop-with-savepoint.
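A hedged sketch of that idea, driving Flink's REST API from plain Java 11: `POST /jobs/:jobid/stop` is the stop-with-savepoint endpoint, while the job id, vertex id, metric name, and savepoint directory below are placeholders, and real code should parse the JSON rather than string-match it:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class StopWhenIdle {
    public static void main(String[] args) throws Exception {
        String base = "http://localhost:8081";   // JobManager REST endpoint
        String jobId = "<job-id>";               // placeholder
        String vertexId = "<source-vertex-id>";  // placeholder
        HttpClient http = HttpClient.newHttpClient();

        while (true) {
            // Poll a per-vertex throughput metric exposed by the REST API.
            HttpRequest poll = HttpRequest.newBuilder(URI.create(
                    base + "/jobs/" + jobId + "/vertices/" + vertexId
                         + "/metrics?get=numRecordsOutPerSecond")).GET().build();
            String metrics = http.send(poll, HttpResponse.BodyHandlers.ofString()).body();

            // Crude zero check; a real implementation would parse the JSON.
            if (metrics.contains("\"value\":\"0.0\"")) {
                // Stop-with-savepoint: drain the pipeline, write a savepoint, stop.
                HttpRequest stop = HttpRequest.newBuilder(URI.create(
                        base + "/jobs/" + jobId + "/stop"))
                        .header("Content-Type", "application/json")
                        .POST(HttpRequest.BodyPublishers.ofString(
                            "{\"targetDirectory\":\"/tmp/savepoints\",\"drain\":true}"))
                        .build();
                http.send(stop, HttpResponse.BodyHandlers.ofString());
                break;
            }
            Thread.sleep(10_000L);
        }
    }
}
```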
Hi Sergii,

If I understand correctly, you want to process all the files in some directory and do not want to process them multiple times. I'm not sure whether using `FileProcessingMode#PROCESS_CONTINUOUSLY` instead of `FileProcessingMode#PROCESS_ONCE` [1] would satisfy your needs and keep the job running.
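To make the difference concrete, a minimal sketch of the API being referenced; the directory path and scan interval are arbitrary placeholders:

```java
import org.apache.flink.api.java.io.TextInputFormat;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.source.FileProcessingMode;

public class ContinuousFileIngest {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        String inputDir = "/data/incoming";  // placeholder directory
        TextInputFormat format = new TextInputFormat(new Path(inputDir));

        // PROCESS_CONTINUOUSLY re-scans the directory (here every 10s) and
        // ingests new files as they appear, so the job never finishes on its
        // own; with PROCESS_ONCE the source reads the current contents and the
        // job ends. Caveat: in continuous mode a *modified* file is
        // re-processed in full.
        DataStream<String> lines = env.readFile(
                format, inputDir, FileProcessingMode.PROCESS_CONTINUOUSLY, 10_000L);

        lines.print();
        env.execute("continuous-file-ingest");
    }
}
```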
Hello,
I'm migrating my Spark-based stream processing application to Flink
(Calcite SQL and temporal tables look too attractive to resist).
My Spark app works as follows:
- the application is started periodically
- it reads a directory of Parquet files as a stream
- SQL transformations are applied
-