The test data is generated in a Flink program running in a separate JVM. The generated data is then written to a Kafka topic, from which my program reads it ...
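A minimal sketch of such a generator follows. This is an assumption rather than the actual job: a plain Kafka producer stands in for the separate Flink program, and the topic name "test-events" and the payload format are placeholders.

    // Sketch only: the real generator is a separate Flink job; a plain Kafka
    // producer stands in for it here. Topic name and payload are placeholders.
    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class TestDataGenerator {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                for (long i = 0; ; i++) {
                    // Throttle here (e.g. Thread.sleep) for the ~200 events/sec setup,
                    // or leave unthrottled for the "highest possible rate" setup.
                    producer.send(new ProducerRecord<>("test-events", Long.toString(i % 100), "event-" + i));
                }
            }
        }
    }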
> On 24.09.2015 at 14:53, Aljoscha Krettek <aljos...@apache.org> wrote:
>
> Hi Rico,
> are you generating the data directly in your Flink program or in some external
> queue, such as Kafka?
>
> Cheers,
> Aljoscha
>
>> On Thu, 24 Sep 2015 at 13:47 Rico Bergmann <i...@ricobergmann.de> wrote:
>> And as a side note:
>>
>> The problem with duplicates also seems to be solved!
>>
>> Cheers, Rico.
>>
>>
>>> On 24.09.2015 at 12:21, Rico Bergmann <i...@ricobergmann.de> wrote:
>>>
>>> I took a first glance.
>>>
>>> I ran 2 test setups. One used a limited test data generator that outputs
>>> around 200 events per second. In this setting the new implementation keeps
>>> up with the incoming message rate.
>>>
>>> The other setup generated data without a limit (at the highest possible rate).
>>> There the same problem as before can be observed: after 2 minutes of runtime
>>> the output of my program is more than a minute behind ... and the lag keeps
>>> increasing over time. But I don't know whether this could be a setup problem.
>>> I noticed the OS load of my test system was around 90%, so it might be more
>>> of a setup problem ...
>>>
>>> Thanks for your support so far.
>>>
>>> Cheers. Rico.
>>>
>>>
>>>> On 24.09.2015 at 09:33, Aljoscha Krettek <aljos...@apache.org> wrote:
>>>>
>>>> Hi Rico,
>>>> you should be able to get it with these steps:
>>>>
>>>> git clone https://github.com/StephanEwen/incubator-flink.git flink
>>>> cd flink
>>>> git checkout -t origin/windows
>>>>
>>>> This will put you on Stephan's windowing branch. Then you can run
>>>>
>>>> mvn clean install -DskipTests
>>>>
>>>> to build it.
>>>>
>>>> I will merge his changes later today; then you should also be able to use them
>>>> by running the 0.10-SNAPSHOT version.
>>>>
>>>> Cheers,
>>>> Aljoscha
>>>>
>>>>
>>>>> On Thu, 24 Sep 2015 at 09:11 Rico Bergmann <i...@ricobergmann.de> wrote:
>>>>> Hi!
>>>>>
>>>>> Sounds great. How can I get the source code before it's merged into the
>>>>> master branch? Unfortunately I only have 2 days left to try this out ...
>>>>>
>>>>> Greets. Rico.
>>>>>
>>>>>
>>>>>> On 24.09.2015 at 00:57, Stephan Ewen <se...@apache.org> wrote:
>>>>>>
>>>>>> Hi Rico!
>>>>>>
>>>>>> We have finished the first part of the Window API rework. You can find
>>>>>> the code here: https://github.com/apache/flink/pull/1175
>>>>>>
>>>>>> It should fix the issues and offer vastly improved performance (up to
>>>>>> 50x faster). For now it supports time windows, but we will support the
>>>>>> other cases in the coming days.
>>>>>>
>>>>>> I'll ping you once it is merged; I'd be curious whether it fixes your issue.
>>>>>> Sorry that you ran into this problem ...
>>>>>>
>>>>>> Greetings,
>>>>>> Stephan
>>>>>>
>>>>>>
>>>>>>> On Mon, Sep 7, 2015 at 12:00 PM, Rico Bergmann <i...@ricobergmann.de> wrote:
>>>>>>> Hi!
>>>>>>>
>>>>>>> While working with grouping and windowing I encountered a strange
>>>>>>> behavior. I'm doing:
>>>>>>>
>>>>>>>> dataStream.groupBy(KeySelector).window(Time.of(x,
>>>>>>>> TimeUnit.SECONDS)).mapWindow(toString).flatten()
>>>>>>>
>>>>>>> When I run the program containing this snippet, it initially outputs
>>>>>>> data at a rate of around 150 events per second (that is roughly the input
>>>>>>> rate of the program). After about 10-30 minutes the rate drops to below
>>>>>>> 5 events per second. This leads to the event delivery lag getting
>>>>>>> bigger and bigger ...
>>>>>>>
>>>>>>> Any explanation for this? I know you are reworking the streaming API,
>>>>>>> but it would be useful to know why this happens ...
>>>>>>>
>>>>>>> Cheers. Rico.
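For reference, a roughly equivalent pipeline on the reworked window API discussed above (keyBy / timeWindow / apply) might look like the sketch below. This is only an illustration, not the original program: the source, the key selector, and the window length of 5 seconds are placeholders.

    // Sketch of the groupBy/window/mapWindow pipeline on the reworked window API.
    // Source, key selection, and window size are placeholders.
    import org.apache.flink.api.java.functions.KeySelector;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.windowing.WindowFunction;
    import org.apache.flink.streaming.api.windowing.time.Time;
    import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
    import org.apache.flink.util.Collector;

    public class WindowedToStringSketch {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            // Placeholder source; in the real job the events come from Kafka.
            DataStream<String> events = env.socketTextStream("localhost", 9999);

            events
                .keyBy(new KeySelector<String, String>() {
                    @Override
                    public String getKey(String event) {
                        return event;  // placeholder key extraction
                    }
                })
                .timeWindow(Time.seconds(5))  // "x" seconds in the original snippet
                .apply(new WindowFunction<String, String, String, TimeWindow>() {
                    @Override
                    public void apply(String key, TimeWindow window, Iterable<String> values, Collector<String> out) {
                        // roughly the mapWindow(toString) step: emit one string per key and window
                        long count = 0;
                        for (String ignored : values) {
                            count++;
                        }
                        out.collect(key + ": " + count + " events in " + window);
                    }
                })
                .print();

            env.execute("windowed toString sketch");
        }
    }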