Hi Georg,

I'm not aware of those examples being available publicly.

Best regards,

Martijn

On Mon, 9 May 2022 at 23:04, Georg Heiler <georg.kf.hei...@gmail.com> wrote:

> Hi Martijn,
>
> many thanks for this clarification. Do you know of any example somewhere
> which would showcase such an approach?
>
> Best,
> Georg
>
> On Mon, 9 May 2022 at 14:45, Martijn Visser <martijnvis...@apache.org> wrote:
>
>> Hi Georg,
>>
>> No, they wouldn't. There is no out-of-the-box capability that lets you
>> start Flink in streaming mode, run everything that's available at that
>> moment, and then stop when there's no more data. You would need to
>> trigger the stop yourself.
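>>
>> As a rough illustration (the job id and savepoint path are placeholders),
>> the stop could be issued from the Flink CLI once you decide the job has
>> caught up:
>>
>>   ./bin/flink stop --savepointPath /tmp/flink-savepoints <jobId>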
>>
>> Best regards,
>>
>> Martijn
>>
>> On Fri, 6 May 2022 at 13:37, Georg Heiler <georg.kf.hei...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> I would disagree:
>>> In the case of Spark, it is a streaming application that offers full
>>> streaming semantics (but at lower cost and with higher latency), since it
>>> triggers less often. In particular, windowing and stateful semantics, as
>>> well as late-arriving data, are handled automatically by the regular
>>> streaming features.
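>>>
>>> A small sketch of that point with Spark's Java API (the source and paths
>>> are made-up placeholders): the watermark and window are defined exactly as
>>> in a continuously running query, and a "once" trigger simply evaluates them
>>> over whatever data has arrived:
>>>
>>>   import static org.apache.spark.sql.functions.col;
>>>   import static org.apache.spark.sql.functions.window;
>>>
>>>   import org.apache.spark.sql.Dataset;
>>>   import org.apache.spark.sql.Row;
>>>   import org.apache.spark.sql.SparkSession;
>>>   import org.apache.spark.sql.streaming.Trigger;
>>>
>>>   public class WindowedOnce {
>>>       public static void main(String[] args) throws Exception {
>>>           SparkSession spark = SparkSession.builder()
>>>                   .appName("windowed-once").getOrCreate();
>>>
>>>           // Built-in "rate" source, just to have a timestamped stream.
>>>           Dataset<Row> events = spark.readStream()
>>>                   .format("rate")
>>>                   .option("rowsPerSecond", "5")
>>>                   .load();
>>>
>>>           // Late data within 10 minutes is still assigned to its window.
>>>           Dataset<Row> hourlyCounts = events
>>>                   .withWatermark("timestamp", "10 minutes")
>>>                   .groupBy(window(col("timestamp"), "1 hour"))
>>>                   .count();
>>>
>>>           hourlyCounts.writeStream()
>>>                   .outputMode("update")
>>>                   .format("console")
>>>                   .option("checkpointLocation", "/tmp/checkpoints-windowed")
>>>                   .trigger(Trigger.Once())   // one micro-batch, then stop
>>>                   .start()
>>>                   .awaitTermination();
>>>       }
>>>   }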
>>>
>>> Would these features be available in a Flink batch job as well?
>>>
>>> Best,
>>> Georg
>>>
>>> On Fri, 6 May 2022 at 13:26, Martijn Visser <martijnvis...@apache.org> wrote:
>>>
>>>> Hi Georg,
>>>>
>>>> Flink batch applications run until all their input is processed. When
>>>> that's the case, the application finishes. You can read more about this in
>>>> the documentation for the DataStream [1] or Table API [2]. I think this
>>>> matches what Spark describes in its documentation.
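>>>>
>>>> As a minimal sketch (not taken from those docs, names are illustrative), a
>>>> DataStream program with a bounded source and BATCH execution mode finishes
>>>> on its own once the input is exhausted:
>>>>
>>>>   import org.apache.flink.api.common.RuntimeExecutionMode;
>>>>   import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
>>>>
>>>>   public class BoundedExample {
>>>>       public static void main(String[] args) throws Exception {
>>>>           StreamExecutionEnvironment env =
>>>>                   StreamExecutionEnvironment.getExecutionEnvironment();
>>>>           // Batch semantics: with bounded sources the job stops by itself
>>>>           // when all input has been processed.
>>>>           env.setRuntimeMode(RuntimeExecutionMode.BATCH);
>>>>
>>>>           env.fromElements("runs", "to", "completion").print();
>>>>
>>>>           env.execute("bounded-example");
>>>>       }
>>>>   }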
>>>>
>>>> Best regards,
>>>>
>>>> Martijn
>>>>
>>>> [1]
>>>> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/execution_mode/
>>>> [2]
>>>> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/common/
>>>>
>>>> On Mon, 2 May 2022 at 16:46, Georg Heiler <georg.kf.hei...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> spark
>>>>> https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#triggers
>>>>> offers a variety of triggers.
>>>>>
>>>>> In particular, it also has the "once" mode:
>>>>>
>>>>> *One-time micro-batch*: The query will execute *only one* micro-batch to
>>>>> process all the available data and then stop on its own. This is useful in
>>>>> scenarios where you want to periodically spin up a cluster, process
>>>>> everything that is available since the last period, and then shut down the
>>>>> cluster. In some cases, this may lead to significant cost savings.
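>>>>>
>>>>> A small sketch of that mode with Spark's Java API (topic, servers, and
>>>>> paths below are made-up placeholders):
>>>>>
>>>>>   import org.apache.spark.sql.Dataset;
>>>>>   import org.apache.spark.sql.Row;
>>>>>   import org.apache.spark.sql.SparkSession;
>>>>>   import org.apache.spark.sql.streaming.Trigger;
>>>>>
>>>>>   public class RunOnce {
>>>>>       public static void main(String[] args) throws Exception {
>>>>>           SparkSession spark = SparkSession.builder()
>>>>>                   .appName("run-once").getOrCreate();
>>>>>
>>>>>           // Read whatever is currently available in the Kafka topic ...
>>>>>           Dataset<Row> input = spark.readStream()
>>>>>                   .format("kafka")
>>>>>                   .option("kafka.bootstrap.servers", "localhost:9092")
>>>>>                   .option("subscribe", "events")
>>>>>                   .load();
>>>>>
>>>>>           // ... process it in a single micro-batch, then stop on its own.
>>>>>           input.writeStream()
>>>>>                   .format("parquet")
>>>>>                   .option("path", "/tmp/out")
>>>>>                   .option("checkpointLocation", "/tmp/checkpoints")
>>>>>                   .trigger(Trigger.Once())
>>>>>                   .start()
>>>>>                   .awaitTermination();
>>>>>       }
>>>>>   }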
>>>>>
>>>>> Does Flink offer something similar?
>>>>>
>>>>> Best,
>>>>> Georg
>>>>>
>>>>
