Hi Wepngong,

This is an interesting proposal. There are indeed many streaming optimisations out there, but as Gyula said, we should focus on a few and engineer them in a nice way. Perhaps for the time being it makes sense to focus on a streaming job graph optimiser that applies optimisations by statically analysing the graph before submitting it, i.e. query rewriting, operator reordering, operator sharing, intermediate result sharing, etc. A runtime optimiser that can do things like load balancing and online reconfiguration would certainly be the next step.
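
To make the static-rewriting idea a bit more concrete, here is a rough sketch of one such optimisation, operator reordering, on a toy linear pipeline. Everything in it (the `Op` class, `can_swap`, `push_filters_up`, the operator and field names) is hypothetical and is not Flink API. It also assumes that maps are stateless and one-to-one, so a filter that reads none of the fields a map writes can safely run before that map.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Op:
    """Hypothetical operator description used only for this sketch."""
    name: str
    kind: str                 # "map" or "filter"
    reads: FrozenSet[str]     # fields the operator reads
    writes: FrozenSet[str]    # fields the operator writes (empty for filters)


def can_swap(first: Op, second: Op) -> bool:
    """A filter may move ahead of a map if it reads nothing the map writes.

    Assumes the map is stateless and emits exactly one record per input.
    """
    return (
        first.kind == "map"
        and second.kind == "filter"
        and not (second.reads & first.writes)
    )


def push_filters_up(pipeline: List[Op]) -> List[Op]:
    """Repeatedly swap adjacent (map, filter) pairs until none can move."""
    ops = list(pipeline)
    changed = True
    while changed:
        changed = False
        for i in range(len(ops) - 1):
            if can_swap(ops[i], ops[i + 1]):
                ops[i], ops[i + 1] = ops[i + 1], ops[i]
                changed = True
    return ops


# Toy job: parse -> enrich -> filter on amount
pipeline = [
    Op("parse", "map", frozenset({"raw"}), frozenset({"user", "amount"})),
    Op("enrich", "map", frozenset({"user"}), frozenset({"country"})),
    Op("big_amount", "filter", frozenset({"amount"}), frozenset()),
]
optimised = push_filters_up(pipeline)
print([op.name for op in optimised])
```

In this toy job the filter moves ahead of `enrich`, so the enrichment step sees fewer records, but it stays behind `parse` because it depends on the `amount` field that `parse` produces. A real optimiser would need this kind of read/write information from the operators, which is part of what the project would have to work out.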
cheers
Paris

On 20 Mar 2015, at 12:48, Gyula Fóra <gyf...@apache.org> wrote:

Hey,

Of course the aim of the project would not be to implement all possible optimizations, because that would be impossible to do in such a short time :)
It would be nice if one could carefully select some optimizations that would make the most impact on performance and implement those.

Regards,
Gyula

On Fri, Mar 20, 2015 at 5:15 AM, Wepngong Benaiah <bwepng...@gmail.com> wrote:

Hello,

I have been doing some research on https://issues.apache.org/jira/browse/FLINK-1617 using http://hirzels.com/martin/papers/csur14-streamopt.pdf and others. I found that there are many optimization techniques available, such as:

1. OPERATOR REORDERING
2. REDUNDANCY ELIMINATION
3. OPERATOR SEPARATION
4. FUSION
5. FISSION
6. LOAD BALANCING
7. STATE SHARING

My question is: do I need to choose 1 or 2 of the algorithms and implement them for GSoC, or am I required to implement all of them, given the tight constraint of the GSoC timeline?

Need help @Gyula Fora

--
Wepngong Ngeh Benaiah
"Black holes are where God divided by zero."