Re: stream clustering in flink

2017-02-10 Thread Aljoscha Krettek
Hi, I think distributed stream clustering is still a somewhat open field. I'm not aware of any popular open-source systems that implement it (except maybe Apache SAMOA). You may have some luck searching for papers on "distributed stream clustering". Cheers, Aljoscha

stream clustering in flink

2017-02-06 Thread Jan Nehring
Hi, we want to cluster a stream of tweets using Flink. Every incoming tweet is compared to the last 100 tweets, and after this comparison a cluster ID is assigned to the tweet. We are trying to find the best approach to this: 1. Using a stream window of the last tweets seems to be diffic
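
For illustration only, the comparison step described in this post (compare each incoming tweet to the last 100, then assign a cluster ID) could be sketched in plain Python as below. This is a single-process sketch, not Flink code and not the poster's implementation. The class name SlidingClusterer, the Jaccard similarity on whitespace tokens, and the 0.5 threshold are all assumptions made for the example; the thread does not say which similarity measure or threshold is intended.

```python
from collections import deque
from itertools import count


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


class SlidingClusterer:
    """Hypothetical helper: assigns cluster IDs using only the most recent tweets."""

    def __init__(self, history=100, threshold=0.5):
        # Bounded buffer of (token_set, cluster_id); the oldest entry is
        # dropped automatically once `history` entries are stored.
        self.recent = deque(maxlen=history)
        self.threshold = threshold  # assumed cut-off, not from the thread
        self._next_id = count()

    def assign(self, text):
        """Return the cluster ID for `text` and remember it for later tweets."""
        tokens = set(text.lower().split())
        best_id, best_sim = None, 0.0
        # Compare the incoming tweet against every tweet still in the buffer.
        for other_tokens, cluster_id in self.recent:
            sim = jaccard(tokens, other_tokens)
            if sim > best_sim:
                best_id, best_sim = cluster_id, sim
        # Open a new cluster when nothing in the buffer is similar enough.
        if best_id is None or best_sim < self.threshold:
            best_id = next(self._next_id)
        self.recent.append((tokens, best_id))
        return best_id
```

Calling `assign` once per tweet, in arrival order, yields one cluster ID per tweet. The sketch keeps all state in one place, so it sidesteps the question of how to distribute the buffer across parallel workers, which is the part the reply in this thread describes as still open.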