Re: Use bloom filter to improve hybrid hash join performance

2015-06-18 Thread Stephan Ewen
Hi! That is a very nice idea and a well-proven optimization for the hybrid hash join. It would be great if you could contribute that. The memory allocated for the hash buckets (holding hash codes and pointers) is currently wasted for those buckets whose partition has been spilled. Pu

Use bloom filter to improve hybrid hash join performance

2015-06-18 Thread Li, Chengxiang
Hi Flink developers, I read the Flink hybrid hash join documentation and implementation; very nice job. For the case where the small table does not fit entirely into memory, I think we may be able to improve the performance further. Currently in the hybrid hash join, when the small table does not fit into memory, part o
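
To make the idea in the subject line concrete, here is a minimal Java sketch of a Bloom filter over record hash codes. It is an illustration only and not Flink code: the class name `SpillBloomFilter`, its constructor parameters, and the hash mixing function are all assumptions chosen for the example. The assumed usage is that hash codes of build-side (small table) records that land in a spilled partition are added to the filter, and a probe-side record is written to disk only if the filter says its hash code might be present.

```java
import java.util.BitSet;

/**
 * Hypothetical helper (not a Flink class): a small Bloom filter over
 * int hash codes, as could be kept for a spilled build-side partition.
 */
final class SpillBloomFilter {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    SpillBloomFilter(int numBits, int numHashes) {
        if (numBits <= 0 || numHashes <= 0) {
            throw new IllegalArgumentException("numBits and numHashes must be positive");
        }
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    /** Records the hash code of a build-side record whose partition was spilled. */
    void add(int hashCode) {
        int h1 = mix(hashCode);
        int h2 = mix(h1 ^ 0x9E3779B9) | 1; // odd step for double hashing
        for (int i = 0; i < numHashes; i++) {
            bits.set(Math.floorMod(h1 + i * h2, numBits));
        }
    }

    /**
     * Returns false only if no record with this hash code was added.
     * A true result may be a false positive.
     */
    boolean mightContain(int hashCode) {
        int h1 = mix(hashCode);
        int h2 = mix(h1 ^ 0x9E3779B9) | 1;
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(Math.floorMod(h1 + i * h2, numBits))) {
                return false;
            }
        }
        return true;
    }

    /** Integer bit mixer (the finalizer step used by MurmurHash3). */
    private static int mix(int h) {
        h ^= h >>> 16;
        h *= 0x85EBCA6B;
        h ^= h >>> 13;
        h *= 0xC2B2AE35;
        h ^= h >>> 16;
        return h;
    }
}
```

On the probe side, a record whose hash code fails `mightContain` cannot match any spilled build-side record, so it can be dropped instead of being spilled. A false positive only costs an unnecessary spill and never a missed join result.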