-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi,
Aljoscha originally wanted to discuss whether it is necessary <quote> to make information about the event time of an element and information about windows in which it resides accessible to the user. </quote> I guess, we all agree that this information is necessary to timestamp the results and that timestamping the result is important. Don't we? Cheers, Bruno On 13.05.2015 11:45, Stephan Ewen wrote: > Okay, I thought that this thread is about how to make timestamps > and window information of a record accessible to the user code. > This involves how represent the information in which window a > record is, whether to attach it to that records, etc > > The semantics of what happens if you have multiple windows after > another (on time and other metrics) is orthogonal to that, as far > as I can tell. This deals with the question how and if timestamps > are re-assigned, how punctuations and watermarks propagate... > > On Wed, May 13, 2015 at 11:40 AM, Gyula Fóra <gyf...@apache.org> > wrote: > >> I think this is that thread :) >> >> But as I said it is just a matter of what we want to add, and we >> can already do it. >> >> On Wed, May 13, 2015 at 11:37 AM, Stephan Ewen <se...@apache.org> >> wrote: >> >>> This is a pretty central question, actually (timestamping the >>> results of windows). Let us kick off a separate thread for >>> this... >>> >>> On Wed, May 13, 2015 at 9:20 AM, Bruno Cadonna < >>> cado...@informatik.hu-berlin.de> wrote: >>> > Hi, > > I think timestamping results of a time window operator is > essential. Without timestamps in the results, it is not possible to > execute two time window operators one after the other. > > Cheers, Bruno > > On 12.05.2015 18:30, Aljoscha Krettek wrote: >>>>>> Hi, I'll try to make it quick this time. I think we need >>>>>> to make information about the event time of an element >>>>>> and information about windows in which it resides >>>>>> accessible to the user. A simple example would be the >>>>>> aggregation of some user behaviour, for example: >>>>>> >>>>>> in = clickSource() >>>>>> >>>>>> analysedData = in .window(10 minutes).every(5 minutes) >>>>>> .groupBy("userId") .filter(is something interesting) >>>>>> .sum("something") >>>>>> >>>>>> analysedData.storeToMySystem() >>>>>> >>>>>> Now the results of this window aggregation tell me that >>>>>> at some point, there was some window and in this window >>>>>> some attribute summed up to this. This might not be very >>>>>> helpful. What might be helpful is the information that >>>>>> there occurred a spike in something at 12:45 on >>>>>> Wednesday. Therefore, I think we need to make this >>>>>> information available somehow. >>>>>> >>>>>> I only have some rough Ideas about how this might work, >>>>>> but I would first like to discuss whether others even >>>>>> think this necessary. So fire away... >>>>>> >>>>>> Cheers, Aljoscha >>>>>> > >>>> >>> >> > - -- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Dr. Bruno Cadonna Postdoctoral Researcher Databases and Information Systems Department of Computer Science Humboldt-Universität zu Berlin http://www.informatik.hu-berlin.de/~cadonnab ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQEcBAEBAgAGBQJVUzbZAAoJEKdCIJx7flKwwEAH/jrhs2h2Ii1cdRWO6E0/h86Y fHztPR7aYaNjNP9ya7TnYnCDyeAvgOITYEJ5DyjpbXu1yaqJ4ZKi8HF5EZQpamRe ZGA/3gTLf+JGtI9swTHCPIVFh6BMTJ5R1L8lx4SplqHCjgqulT+TrpCEKyP1t9BA cTpL6NjtqEXBOk9pPg3IkBwTG3IPR3IlOyxIjIDsYw8O75Csp/Gb9k7QWzlN8g9B jxzOAyi5ASDBJtvJVupzSkAT/HY9ugo6+kCJGp+BWFqxyfnl0u0tVYE22r1INWjL x+qPAvF0EePMDYFeOXpRAqPnz6/R8Z8QidhsEGjiVqjctUyiXxgJYLN3RL/DYqI= =d4Zq -----END PGP SIGNATURE-----