Thanks for your reply!
But doesn't Flink use streaming to perform batch calculations? As you said
above, to some extent it is the same as real batch computing, e.g. Spark.

Caizhi Weng <tsreape...@gmail.com> wrote on Tue, Sep 7, 2021 at 2:53 PM:

> My previous mail intended to explain what is needed for all subtasks in a
> batch job to run simultaneously. To merely run a batch job, the number of
> task slots can be as small as 1; in that case the parallel instances of
> each subtask will run one by one.
>
> Also, the scheduling of the subtasks depends on the shuffle mode
> (table.exec.shuffle-mode). By default, all network shuffles in batch jobs
> are blocking, which means that a downstream subtask only starts running
> after its upstream subtasks finish. To run all subtasks simultaneously,
> set it to "pipelined" (Flink <= 1.11) or "ALL_EDGES_PIPELINED" (Flink >=
> 1.12).
>
> Caizhi Weng <tsreape...@gmail.com> wrote on Tue, Sep 7, 2021 at 2:47 PM:
>
>> Hi!
>>
>> If you mean batch SQL, then you'll need to prepare enough task slots for
>> all subtasks. The number of task slots needed is the sum of the
>> parallelism of all subtasks, as there is no slot reuse in batch jobs.
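
As a hypothetical illustration of that slot arithmetic (the parallelism numbers here are assumed, not taken from the job in question): a batch job with a source subtask and a sink subtask, each at parallelism 4, needs 4 + 4 = 8 task slots for everything to run simultaneously, since slots are not reused across batch subtasks.

```sql
-- Hypothetical numbers: both subtasks run at the default parallelism.
SET parallelism.default = 4;
-- Two subtasks (source + sink) at parallelism 4 each
-- => up to 4 + 4 = 8 task slots needed for fully simultaneous execution.
INSERT INTO tab_a SELECT * FROM tab_b;
```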
>>
>> lec ssmi <shicheng31...@gmail.com> wrote on Tue, Sep 7, 2021 at 2:13 PM:
>>
>>> And my Flink version is 1.11.0.
>>>
>>> lec ssmi <shicheng31...@gmail.com> wrote on Tue, Sep 7, 2021 at 2:11 PM:
>>>
>>>> Hi:
>>>>    I'm not familiar with the batch API, and I wrote a program just like
>>>> "insert into tab_a select * from tab_b".
>>>>    From the picture, there are only two tasks: the source task, which
>>>> is in the RUNNING state, and the sink task, which is always in the
>>>> CREATED state.
>>>>    According to the logs, the source task is currently reading the file
>>>> I specified; in other words, it is working normally.
>>>>    Doesn't Flink start working only after all operators are initialized?
>>>>
>>>>
>>>> [image: image.png]
>>>>
>>>
