For the windowing designs, we should also keep in mind what requirements we have on how the elements are kept/stored (in external stores, Flink managed memory, ...).

On Tue, Jun 23, 2015 at 9:55 AM, Aljoscha Krettek <aljos...@apache.org> wrote:

> The reason I posted this now is that we need to think about the API and
> windowing before proceeding with the PRs of Gabor (inverse reduce) and
> Gyula (removal of "aggregate" functions on DataStream).
>
> For the windowing, I think that the current model does not work for
> out-of-order processing. Therefore, the whole windowing infrastructure
> will basically have to be redone. This also means that any work on the
> pre-aggregators or optimizations that we do now becomes useless.
>
> For the API, I proposed to restructure the interactions between all the
> different *DataStream classes and grouping/windowing. (See API section of
> the doc I posted.)
>
> On Mon, 22 Jun 2015 at 21:56 Gyula Fóra <gyula.f...@gmail.com> wrote:
>
> > Hi Aljoscha,
> >
> > Thanks for the nice summary, this is a very good initiative.
> >
> > I added some comments to the respective sections (where I didn't fully
> > agree :).).
> > At some point I think it would be good to have a public hangout session
> > on this, which could make for a more dynamic discussion.
> >
> > Cheers,
> > Gyula
> >
> > Aljoscha Krettek <aljos...@apache.org> wrote (on Mon, Jun 22, 2015,
> > 21:34):
> >
> > > Hi,
> > > with people proposing changes to the streaming part I also wanted to
> > > throw my hat into the ring. :D
> > >
> > > During the last few months, while I was getting acquainted with the
> > > streaming system, I wrote down some thoughts I had about how things
> > > could be improved. Hopefully, they are in somewhat coherent shape now,
> > > so please have a look if you are interested in this:
> > >
> > > https://docs.google.com/document/d/1rSoHyhUhm2IE30o5tkR8GEetjFvMRMNxvsCfoPsW6_4/edit?usp=sharing
> > >
> > > This mostly covers:
> > > - Timestamps assigned at sources
> > > - Out-of-order processing of elements in window operators
> > > - API design
> > >
> > > Please let me know what you think. Comment in the document or here in
> > > the mailing list.
> > >
> > > I have a PR in the making that would introduce source timestamps and
> > > watermarks for keeping track of them. I also hacked a proof-of-concept
> > > of a windowing system that is able to process out-of-order elements
> > > using a FlatMap operator. (It uses panes to perform efficient
> > > pre-aggregations.)
> > >
> > > Cheers,
> > > Aljoscha
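
For readers unfamiliar with the pane idea mentioned in the quoted proposal, here is a minimal sketch of how out-of-order elements can be pre-aggregated into panes and emitted once a watermark passes the end of a window. This is not the proof-of-concept from the thread and not Flink API: the class `PaneWindowSum`, its methods, the sum aggregation, non-negative integer timestamps, and the convention that a watermark `w` promises that no element with a timestamp below `w` is still to come are all assumptions made for illustration.

```python
from collections import defaultdict
from math import gcd


class PaneWindowSum:
    """Sliding event-time window sum, pre-aggregated per pane (hypothetical sketch)."""

    def __init__(self, size, slide):
        self.size = size
        self.slide = slide
        # Largest pane width that tiles every window exactly.
        self.pane = gcd(size, slide)
        # Pane start timestamp -> partial sum of the elements in that pane.
        self.panes = defaultdict(int)
        # Start of the earliest window that has not been emitted yet.
        self.next_start = 0

    def add(self, timestamp, value):
        """Fold an element into its pane; arrival order does not matter."""
        if timestamp < self.next_start:
            # Older than every window that is still open: reject it.
            return False
        self.panes[timestamp - timestamp % self.pane] += value
        return True

    def advance_watermark(self, watermark):
        """Emit (start, end, sum) for each window [start, end) with end <= watermark."""
        results = []
        while self.next_start + self.size <= watermark:
            start, end = self.next_start, self.next_start + self.size
            # Combine the partial sums of the panes that make up this window.
            present = [p for p in range(start, end, self.pane) if p in self.panes]
            if present:
                results.append((start, end, sum(self.panes[p] for p in present)))
            self.next_start += self.slide
            # Panes before the next window start are not needed by any open window.
            for p in [p for p in self.panes if p < self.next_start]:
                del self.panes[p]
        return results
```

With `size=10` and `slide=5`, each element is added to exactly one pane of width 5, and each window is assembled from two pane sums instead of re-reading its elements. The trade-off is that only the per-pane aggregates need to be stored, which also bears on the storage question raised at the top of this mail.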