Hey,

Some initial feedback from my side:
I think this is a very important problem to deal with, as a lot of applications depend on it. I like the proposed runtime model, and it is probably a good way to handle this task; it is very clear what is happening.

My main concern is how to handle this from the API and UDFs. What you proposed seems like a very internal thing from the API perspective, and I would be against exposing it in the way you wrote in your example. We should make every effort to streamline this with the functional-style operators in some way (in that sense, the way broadcast sets are handled is pretty nice). Maybe we could extend ds.connect() to many streams.

But in any case, this is an awesome initiative :)

Cheers,
Gyula

Aljoscha Krettek <aljos...@apache.org> wrote (on Thu, Apr 21, 2016, 15:56):

> Hi Team,
> I'm currently thinking about how we can bring the broadcast set/broadcast
> input feature from the DataSet API to the DataStream API. I think this
> would be a valuable addition since it would enable use cases that join
> streams with constant (or slowly changing) side information.
>
> For this purpose, I think that we need to change the way we handle stream
> operators. The new model would have one unified operator that handles all
> cases and allows adding inputs after the operator was constructed, thus
> allowing the specification of broadcast inputs.
>
> I wrote up this preliminary document detailing the reason why we need such
> a new operator for broadcast inputs and also what the API of such an
> operator would be. It also quickly touches on the required changes of
> existing per-operation stream operations such as StreamMap:
>
> https://docs.google.com/document/d/1ZFzL_0xGuUEnBsFyEiHwWcmCcjhd9ArWsmhrgnt05RI/edit?usp=sharing
>
> Please have a look if you're interested. Feedback/insights are very
> welcome. :-)
>
> Cheers,
> Aljoscha