Re: Data & Task distribution among the available Nodes

Shammon FY Thu, 29 Jun 2023 23:41:08 -0700

Hi Mahmoud,

For the third question: Flink currently uses Fine-Grained Resource Management
to choose a TM for tasks. You can refer to the doc [1] for more information.
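
To give a rough feeling for the idea, here is a toy sketch in plain Python. It
is not Flink code: the names TaskManager and allocate_slot are made up for
illustration, and the first-fit rule (pick the first registered TM with enough
free resources) is an assumption based on my reading of the doc [1], which
applies when fine-grained resource management is enabled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TaskManager:
    """Hypothetical stand-in for a TM: tracks only free CPU cores and memory."""
    name: str
    free_cpu: float
    free_mem_mb: int


def allocate_slot(tms: List[TaskManager], cpu: float, mem_mb: int) -> Optional[str]:
    """Carve a slot of the requested size out of the first TM that has room.

    Assumption: first-fit over the registered TMs, in registration order.
    """
    for tm in tms:
        if tm.free_cpu >= cpu and tm.free_mem_mb >= mem_mb:
            # The slot is cut dynamically from the TM's remaining resources.
            tm.free_cpu -= cpu
            tm.free_mem_mb -= mem_mb
            return tm.name
    return None  # no registered TM can fit this request


# Two TMs with different amounts of free resources (made-up numbers).
tms = [TaskManager("tm-1", 1.0, 1024), TaskManager("tm-2", 2.0, 4096)]

placements = [
    allocate_slot(tms, cpu=0.5, mem_mb=512),
    allocate_slot(tms, cpu=1.0, mem_mb=1024),
    allocate_slot(tms, cpu=0.5, mem_mb=512),
    allocate_slot(tms, cpu=4.0, mem_mb=512),
]
print(placements)
```

In this toy model a request goes to whichever TM can still fit it, so the
placement depends on the requested resource profile and on what is left on
each TM, not on a fixed preference for one TM over another.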



[1]
https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/finegrained_resource/

Best,
Shammon FY


On Thu, Jun 29, 2023 at 4:17 PM Martijn Visser <martijnvis...@apache.org>
wrote:

> Hi Mahmoud,
>
> While it's not an answer to your questions, I do want to point out
> that the DataSet API is deprecated and will be removed in a future
> version of Flink. I would recommend moving to either the Table API or
> the DataStream API.
>
> Best regards,
>
> Martijn
>
> On Thu, Jun 22, 2023 at 6:14 PM Mahmoud Awad <mahmoud.a4...@hotmail.com>
> wrote:
> >
> > Hello everyone,
> >
> > I am trying to understand the mechanism by which Flink distributes the
> > data and the tasks among the nodes/task managers in the cluster, assuming
> > all TMs have equal resources. I am using the DataSet API on my own machine.
> > I will try to address the issue with the following questions:
> >
> > - When we first read the data from the source (text, CSV, etc.), how
> > does Flink ensure a fair distribution of data from the source to the
> > next subtask?
> >
> > - Are there any criteria by which Flink prefers one task manager over
> > another (assuming all task managers have equal resources)?
> >
> > - On what basis does Flink choose to deploy a specific task on a
> > specific task manager?
> >
> > I hope I was able to explain my point. Thank you in advance.
> >
> > Best regards
> > Mahmoud
> >
> >
> >
> > Sent from Mail for Windows
> >
> >
>
