Hi Rafael,

Flink does not support interaction between DataStreams belonging to two different jobs. I don't know what your scenario is, but usually, if two streams need to interact, you can bring them into the same job and combine them with the DataStream join/connect API.[1]
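For example, here is a minimal sketch of combining two streams inside one job via connect(). The socket sources, host names, and the CoMapFunction logic are only placeholders for whatever your real inputs and processing would be:

    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.co.CoMapFunction;

    public class TwoStreamsOneJob {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env =
                    StreamExecutionEnvironment.getExecutionEnvironment();

            // Two independent sources; in a real setup these could be Kafka, files, etc.
            DataStream<String> streamA = env.socketTextStream("hostA", 9000);
            DataStream<String> streamB = env.socketTextStream("hostB", 9001);

            // connect() keeps the two element types separate, but lets a single
            // operator see elements from both streams.
            DataStream<String> combined = streamA
                    .connect(streamB)
                    .map(new CoMapFunction<String, String, String>() {
                        @Override
                        public String map1(String value) { return "A: " + value; }

                        @Override
                        public String map2(String value) { return "B: " + value; }
                    });

            combined.print();
            env.execute("two-streams-one-job");
        }
    }

For a keyed, windowed combination you would use join() instead; see the operators documentation linked below.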
[1]: https://ci.apache.org/projects/flink/flink-docs-master/dev/stream/operators/

Thanks, vino.

Rafael Leira Osuna <rafael.le...@uam.es> wrote on Mon, Aug 13, 2018 at 6:35 PM:

> Thanks for your reply, Vino.
>
> I've checked your solution and it may solve my requirements. However, it
> presents a new issue: how can two different jobs interact?
> As I said, I'm new at this, and all I know is the basics of creating and
> interconnecting different DataStreams within the same job / Java class.
>
> I know a possible solution may be to create TCP sockets and interconnect
> the jobs myself, but my final objective here is to evaluate how Flink
> performs over different media and data while I measure its vertical
> scalability (for an academic study + my PhD). With a custom TCP-socket
> solution, I would be testing my own (and maybe not the best) implementation
> instead of Flink's.
>
> Is it possible to interconnect two different jobs using Flink's API? I did
> not find much on Google, but maybe that is because I don't know what I
> should really search for. If there is a way to interconnect two jobs, then
> your proposed solution should work fine; if not, should I assume Flink's
> current implementation won't allow me to pin "job steps" to "defined
> nodes"?
>
> Thanks again, Vino, and anyone who helps,
>
> Rafael.
>
> On Mon, 13-08-2018 at 10:22 +0800, vino yang wrote:
>
> Hi Rafael,
>
> For standalone clusters, it seems that Flink does not provide such a
> feature. In general, at the execution level, we don't talk about a
> DataStream, but about a Job.
> If your Flink is running on YARN, you can use YARN's Node Label feature to
> assign a label to some nodes.
> Earlier this year, I resolved an issue that allows specifying a node label
> when submitting a job for Flink on YARN.[1]
> *This feature is available in the recently released Flink 1.6.0.*
> I don't know whether it meets your requirements.
>
> [1]: https://issues.apache.org/jira/browse/FLINK-7836
>
> Thanks, vino.
>
> Rafael Leira Osuna <rafael.le...@uam.es> wrote on Mon, Aug 13, 2018 at 12:16 AM:
>
> Hi!
>
> I have been searching a lot, but I didn't find a solution for this.
>
> Let's suppose some of the steps in the streaming process must be executed
> on just a subset of the available nodes/TaskManagers, while the rest of
> the tasks are free to be computed anywhere.
>
> **How can I assign a DataStream to be executed ONLY on a subset of
> nodes?**
>
> This is required mainly for input/sink tasks, as not every node in the
> cluster has the same connectivity / security restrictions.
>
> I'm new to Flink, so please forgive me if I'm asking for something
> obvious.
>
> Thanks a lot.
>
> Rafael Leira.
>
> PS: Currently, we have a static standalone Flink cluster.
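By the way, regarding the YARN node label approach in the quoted reply above: a rough sketch of how it would be used, assuming the configuration key introduced by FLINK-7836 is yarn.application.node-label and your YARN cluster already has nodes carrying a matching label ("ingest-nodes" here is just a placeholder name):

    # flink-conf.yaml (or an equivalent dynamic property at submission time)
    # "ingest-nodes" is a placeholder for a label you have defined in YARN
    yarn.application.node-label: ingest-nodes

With that set, YARN places the Flink application's containers only on nodes carrying that label; the placement is per application/job, not per DataStream, which is why the two-jobs question came up in the first place.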