Hi Rafael,

Flink does not support interaction between DataStreams belonging to two different jobs. I don't know what your scenario is, but usually, if two streams need to interact, you can bring them into the same job and combine them with the DataStream join/connect API.[1]
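For example, here is a minimal sketch of combining two streams inside one job via connect(). The socket sources, host names, and the CoMapFunction logic are only placeholders for whatever your real inputs and processing would be:

    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.co.CoMapFunction;

    public class TwoStreamsOneJob {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env =
                    StreamExecutionEnvironment.getExecutionEnvironment();

            // Two independent sources; in a real setup these could be Kafka, files, etc.
            DataStream<String> streamA = env.socketTextStream("hostA", 9000);
            DataStream<String> streamB = env.socketTextStream("hostB", 9001);

            // connect() keeps the two element types separate, but lets a single
            // operator see elements from both streams.
            DataStream<String> combined = streamA
                    .connect(streamB)
                    .map(new CoMapFunction<String, String, String>() {
                        @Override
                        public String map1(String value) { return "A: " + value; }

                        @Override
                        public String map2(String value) { return "B: " + value; }
                    });

            combined.print();
            env.execute("two-streams-one-job");
        }
    }

For a keyed, windowed combination you would use join() instead; see the operators documentation linked below.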
[1]: https://ci.apache.org/projects/flink/flink-docs-master/dev/stream/operators/

Thanks, vino.

Rafael Leira Osuna <rafael.le...@uam.es> wrote on Mon, Aug 13, 2018 at 6:35 PM:

> Thanks for your reply, Vino.
>
> I've checked your solution and it may solve my requirements. However, it
> presents a new issue: how can two different jobs interact?
> As I said, I'm new at this, and all I know is the basics of creating and
> interconnecting different DataStreams within the same job / Java class.
>
> I know a possible solution may be to create TCP sockets and interconnect
> the jobs myself, but my final objective here is to evaluate how Flink
> performs over different media and data while I measure its vertical
> scalability (for an academic study + my PhD). With a custom TCP-socket
> solution, I would be testing my own (and maybe not the best) implementation
> instead of Flink's.
>
> Is it possible to interconnect two different jobs using Flink's API? I did
> not find much on Google, but maybe that is because I don't know what I
> should really search for. If there is a way to interconnect two jobs, then
> your proposed solution should work fine; if not, should I assume Flink's
> current implementation won't allow me to pin "job steps" to "defined
> nodes"?
>
> Thanks again, Vino, and anyone who helps,
>
> Rafael.
>
> On Mon, 13-08-2018 at 10:22 +0800, vino yang wrote:
>
> Hi Rafael,
>
> For standalone clusters, it seems that Flink does not provide such a
> feature. In general, at the execution level, we don't talk about a
> DataStream, but about a Job.
> If your Flink is running on YARN, you can use YARN's Node Label feature to
> assign a label to some nodes.
> Earlier this year, I resolved an issue that allows specifying a node label
> when submitting a job for Flink on YARN.[1]
> *This feature is available in the recently released Flink 1.6.0.*
> I don't know whether it meets your requirements.
>
> [1]: https://issues.apache.org/jira/browse/FLINK-7836
>
> Thanks, vino.
>
> Rafael Leira Osuna <rafael.le...@uam.es> wrote on Mon, Aug 13, 2018 at 12:16 AM:
>
> Hi!
>
> I have been searching a lot, but I didn't find a solution for this.
>
> Let's suppose some of the steps in the streaming process must be executed
> on just a subset of the available nodes/TaskManagers, while the rest of
> the tasks are free to be computed anywhere.
>
> **How can I assign a DataStream to be executed ONLY on a subset of
> nodes?**
>
> This is required mainly for input/sink tasks, as not every node in the
> cluster has the same connectivity / security restrictions.
>
> I'm new to Flink, so please forgive me if I'm asking for something
> obvious.
>
> Thanks a lot.
>
> Rafael Leira.
>
> PS: Currently, we have a static standalone Flink cluster.
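By the way, regarding the YARN node label approach in the quoted reply above: a rough sketch of how it would be used, assuming the configuration key introduced by FLINK-7836 is yarn.application.node-label and your YARN cluster already has nodes carrying a matching label ("ingest-nodes" here is just a placeholder name):

    # flink-conf.yaml (or an equivalent dynamic property at submission time)
    # "ingest-nodes" is a placeholder for a label you have defined in YARN
    yarn.application.node-label: ingest-nodes

With that set, YARN places the Flink application's containers only on nodes carrying that label; the placement is per application/job, not per DataStream, which is why the two-jobs question came up in the first place.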