Re: FrequentItems in spark-sql-execution-stat

2015-08-01 Thread Burak Yavuz
Hi Yucheng, thanks for pointing out the issue. You are correct: in the case where the final map is completely empty after the merge, we do need to add the final element to the map with the correct count (its count decremented by the max count that was already in the map). I'll submit a fix for it.
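
A rough Scala sketch of the corner case being discussed (the class and method names here are made up for illustration; this is not the actual FrequentItems code or the patch that was submitted): when decrementing by the new item's count empties the map, the new item is still inserted, with its count reduced by the largest count it displaced.

import scala.collection.mutable

// Illustrative only: a simplified counter in the spirit of the FrequentItems
// helper, not the code that shipped in Spark.
class FreqCounter(maxSize: Int) {
  val baseMap = mutable.Map.empty[Any, Long]

  def add(key: Any, count: Long = 1L): this.type = {
    if (baseMap.contains(key)) {
      baseMap(key) += count
    } else if (baseMap.size < maxSize) {
      baseMap(key) = count
    } else {
      val maxExisting = baseMap.values.max
      // Decrement every existing counter by the new count, evicting any
      // counter that is exhausted.
      for (k <- baseMap.keys.toList) {
        val v = baseMap(k) - count
        if (v > 0) baseMap(k) = v else baseMap.remove(k)
      }
      // The case from the thread: if everything was evicted, the new element
      // must still be added, with its count reduced by the largest count that
      // was previously in the map.
      if (baseMap.isEmpty && count > maxExisting) {
        baseMap(key) = count - maxExisting
      }
    }
    this
  }
}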

Re: FrequentItems in spark-sql-execution-stat

2015-07-31 Thread Koert Kuipers
This looks like a mistake in FrequentItems to me. If the map is full (map.size == size), it should still add the new item (after removing items from the map and decrementing counts). If it's not a mistake, then at least it looks to me like the algorithm is different from the one described in the paper. Is this
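
For comparison, a rough Scala sketch of the add step as it is usually described for the weighted Misra-Gries-style frequent-items algorithm (again with made-up names, and not the Spark implementation): the new item is always inserted first, then every counter is reduced by the smallest one and exhausted counters are evicted, so the map never keeps more than maxSize entries.

import scala.collection.mutable

// Sketch of the textbook weighted variant: insert the new item, then, if the
// map has grown past maxSize, subtract the minimum count from every entry and
// drop the entries that reach zero (at least one always does).
class PaperStyleCounter(maxSize: Int) {
  val counts = mutable.Map.empty[Any, Long]

  def add(key: Any, count: Long = 1L): this.type = {
    counts(key) = counts.getOrElse(key, 0L) + count
    if (counts.size > maxSize) {
      val minCount = counts.values.min
      for (k <- counts.keys.toList) {
        val v = counts(k) - minCount
        if (v > 0) counts(k) = v else counts.remove(k)
      }
    }
    this
  }
}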