Forwarding Shimin's mail to Aitozi.

Aitozi,

We are using HyperLogLog to count daily UV, but it only provides an approximate value. I also tried count distinct in Flink Table without a window, but that requires setting a retention time. However, the time resolution of this operator is 1 millisecond, so it ends up with too many timers on the Java heap, which might lead to an OOM.

Cheers
Shimin

> On June 27, 2018, at 5:34 PM, zhangminglei <18717838...@163.com> wrote:
>
> Aitozi
>
> From my side, I do not think distinct is very easy to deal with, even when
> working together with Kafka's exactly-once support.
>
> For UV, we can use a Bloom filter to filter PV and get UV in the end.
>
> Windows are usually used in aggregate operations, so I think all of these
> should be realized with windows.
>
> I am not familiar with this field, so I would still like to know how others
> respond to this question.
>
> Cheers
> Minglei
>
>> On June 27, 2018, at 5:12 PM, aitozi <gjying1...@gmail.com> wrote:
>>
>> Hi, community
>>
>> I am using Flink to deal with some situations.
>>
>> 1. "distinct count" to calculate the UV/PV.
>> 2. Calculate the topN of the past 1 hour or 1 day.
>>
>> Are these all realized with windows? Or is there a best practice for doing this?
>>
>> 3. And when dealing with distinct, if there is no need to do a keyBy
>> beforehand, how does the window deal with this?
>>
>> Thanks
>> Aitozi.
>>
>> --
>> Sent from:
>> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
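
For illustration, the Bloom filter idea mentioned in the thread (filtering PV events to obtain UV) could look roughly like the sketch below. It is plain Python rather than Flink code, and the names BloomFilter, add_if_absent and count_uv, as well as the default sizes, are hypothetical choices for this example only. Like HyperLogLog it is approximate: a false positive makes a new user look already seen, so the result can undercount but never overcount.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter backed by a bytearray used as a bit set."""

    def __init__(self, num_bits=1 << 20, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions from one SHA-256 digest
        # using double hashing: h1 + i * h2 (mod num_bits).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add_if_absent(self, item):
        """Set the item's bits; return True if at least one bit was unset,
        i.e. the item has definitely not been added before."""
        is_new = False
        for pos in self._positions(item):
            byte, mask = pos >> 3, 1 << (pos & 7)
            if not self.bits[byte] & mask:
                is_new = True
                self.bits[byte] |= mask
        return is_new


def count_uv(page_views, num_bits=1 << 20, num_hashes=4):
    """Count (approximately) distinct user ids in a stream of PV events."""
    seen = BloomFilter(num_bits, num_hashes)
    uv = 0
    for user_id in page_views:
        if seen.add_if_absent(user_id):
            uv += 1
    return uv
```

For a small input such as count_uv(["u1", "u2", "u1", "u3", "u2"]) the filter is nearly empty, so the result should equal the exact distinct count of 3. For daily UV one filter per day would be needed, sized for the expected number of users; that sizing is left open here.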