Thanks, Vino.
As an example of what we need: we want two different tasks executed on
two different nodes, and we want to choose which node executes which
task.
Given what you have taught me:
- A job can be assigned to a node in 1.6 with YARN.
- A set of tasks (a DataStream) is executed inside a job.
- Two different jobs cannot talk to each other, nor can their tasks.
I would assume that what we want is not possible in Flink (excluding
custom solutions like sockets or third parties like Kafka).
Thanks for your help!
Rafael.
On Mon, 13-08-2018 at 19:34 +0800, vino yang wrote:
> Hi Rafael,
> Flink does not support interaction between DataStreams belonging to
> two different jobs.
> I don't know what your scenario is.
> Usually, if you need two streams to interact, you can bring them into
> the same job.
> You can do this through the DataStream join/connect API.[1]
>
> [1]: https://ci.apache.org/projects/flink/flink-docs-master/dev/stream/operators/
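>
> As a concrete (and purely illustrative) sketch of that idea: two
> sources combined inside a single job via connect(). The host names,
> ports, and class name below are invented, not from this thread.
>
>   import org.apache.flink.streaming.api.datastream.DataStream;
>   import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
>   import org.apache.flink.streaming.api.functions.co.CoMapFunction;
>
>   public class TwoStreamsOneJob {
>       public static void main(String[] args) throws Exception {
>           StreamExecutionEnvironment env =
>               StreamExecutionEnvironment.getExecutionEnvironment();
>
>           // Two independent sources; in practice these could be any
>           // connectors (Kafka, files, sockets, ...).
>           DataStream<String> first  = env.socketTextStream("host-a", 9000);
>           DataStream<String> second = env.socketTextStream("host-b", 9001);
>
>           // connect() keeps both streams inside the SAME job, so no
>           // cross-job communication is needed.
>           first.connect(second)
>                .map(new CoMapFunction<String, String, String>() {
>                    @Override
>                    public String map1(String value) { return "A: " + value; }
>                    @Override
>                    public String map2(String value) { return "B: " + value; }
>                })
>                .print();
>
>           env.execute("two-streams-one-job");
>       }
>   }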
>
> Thanks, vino.
> Rafael Leira Osuna <rafael.le...@uam.es> wrote on Mon, Aug 13, 2018 at 6:35 PM:
> > Thanks for your reply, Vino.
> > I've checked your solution and it may meet my requirements.
> > However, it presents a new issue: how can two different jobs interact
> > with each other? As I said, I'm new at this, and all I know are the
> > basics of creating and interconnecting different DataStreams within
> > the same job / Java class.
> > I know a possible solution would be to create TCP sockets and
> > interconnect the jobs myself, but my final objective here is to
> > evaluate how Flink performs over different media and data while I
> > measure its vertical scalability (for an academic study + my PhD).
> > With a custom TCP-socket solution, I would be testing my own (and
> > maybe not the best) implementation instead of Flink's.
> > Is it possible to interconnect two different jobs using Flink's API?
> > I did not find much on Google, but maybe that is because I don't know
> > what I should really be searching for. If there is a way to
> > interconnect two jobs, then your proposed solution should work fine;
> > if not, should I assume Flink's current implementation won't allow me
> > to pin "job steps" to "defined nodes"?
> > Thanks again, Vino, and anyone who helps,
> > Rafael.
> > On Mon, 13-08-2018 at 10:22 +0800, vino yang wrote:
> > > Hi Rafael,
> > > For Standalone clusters, it seems that Flink does not provide
> > > such a feature.
> > > In general, at the execution level we don't talk about DataStreams;
> > > we talk about Jobs.
> > > If your Flink is running on YARN, you can use YARN's Node Label
> > > feature to assign a Label to some Nodes.
> > > Earlier this year, I resolved an issue that addresses the problem
> > > of specifying a node label when submitting a Flink job on YARN.[1]
> > > This feature is available in the recently released Flink 1.6.0.
> > > I don't know if it meets your requirements.
> > >
> > > [1]: https://issues.apache.org/jira/browse/FLINK-7836
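> > >
> > > As a rough, hedged sketch of how this might be wired up (the label
> > > name "edge" and the host name are invented; please double-check the
> > > exact Flink configuration key against the 1.6 docs, my understanding
> > > is that FLINK-7836 introduced yarn.application.nodeLabel):
> > >
> > >   # YARN side: define a node label and attach it to the chosen nodes
> > >   yarn rmadmin -addToClusterNodeLabels "edge"
> > >   yarn rmadmin -replaceLabelsOnNode "worker-01.example.com=edge"
> > >
> > >   # flink-conf.yaml: ask YARN to place the Flink application only on
> > >   # nodes carrying that label
> > >   yarn.application.nodeLabel: edge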
> > >
> > > Thanks, vino.
> > > Rafael Leira Osuna <rafael.le...@uam.es> wrote on Mon, Aug 13, 2018 at 12:16 AM:
> > > > Hi!
> > > >
> > > > I have been searching a lot, but I haven't found a solution for
> > > > this.
> > > >
> > > > Let's suppose some of the steps in the streaming process must be
> > > > executed on just a subset of the available nodes/taskmanagers,
> > > > while the rest of the tasks are free to be computed anywhere.
> > > >
> > > > **How can I assign a DataStream to be executed ONLY on a node
> > > > subset?**
> > > >
> > > > This is required mainly for input/sink tasks, as not every node in
> > > > the cluster has the same connectivity / security restrictions.
> > > >
> > > > I'm new to Flink, so please forgive me if I'm asking for something
> > > > obvious.
> > > >
> > > > Thanks a lot.
> > > >
> > > > Rafael Leira.
> > > >
> > > > PS: Currently, we have a static standalone Flink cluster.