Thanks for your reply, Vino.
I've checked your solution and it may meet my requirements. However, it
raises a new issue: how can two different jobs interact with each other?
As I said, I'm new to this, and all I know is the basics of creating and
interconnecting different DataStreams within the same job / Java class.
I know a possible solution would be to create TCP sockets and interconnect
the jobs myself, but my final objective here is to evaluate how Flink
performs over different media and data while I measure its vertical
scalability (for an academic study and my PhD). With a custom TCP-socket
solution, I would be testing my own (and maybe not the best)
implementation instead of Flink's.
Is it possible to interconnect two different jobs using Flink's API? I
did not find much on Google, but maybe that is because I don't know what I
should really search for. If there is a way to interconnect two jobs, then
your proposed solution should work fine. If not, should I assume that
Flink's current implementation won't allow me to assign "job-steps" to
"defined-nodes"?
Thanks again, Vino, and anyone else who helps,
Rafael.
On Mon, 13-08-2018 at 10:22 +0800, vino yang wrote:
> Hi Rafael,
> For Standalone clusters, it seems that Flink does not provide such a
> feature.
> In general, at the execution level, we talk about Jobs rather than
> DataStreams.
> If your Flink is running on YARN, you can use YARN's Node Label
> feature to assign a Label to some Nodes.
> Earlier this year, I resolved an issue that makes it possible to
> specify a node label when submitting a Flink job on YARN. [1]
> This feature is available in the recently released Flink 1.6.0.
> I don't know whether it meets your requirements.
>
> [1]: https://issues.apache.org/jira/browse/FLINK-7836
>
> Thanks, vino.
> Rafael Leira Osuna <rafael.le...@uam.es> wrote on Mon, Aug 13, 2018 at 12:16 AM:
> > Hi!
> >
> > I have been searching a lot, but I haven't found a solution for this.
> >
> > Let's suppose some of the steps in the streaming process must be
> > executed on just a subset of the available nodes/TaskManagers, while
> > the rest of the tasks are free to be computed anywhere.
> >
> > **How can I assign a DataStream to be executed ONLY on a subset of
> > nodes?**
> >
> > This is required mainly for input/sink tasks, as not every node in the
> > cluster has the same connectivity / security restrictions.
> >
> > I'm new to Flink, so please forgive me if I'm asking about something
> > obvious.
> >
> > Thanks a lot.
> >
> > Rafael Leira.
> >
> > PS: Currently, we have a static standalone Flink cluster.