Think of it as a more dynamic Bloom filter. It works well when cardinality is
low, but its size grows quickly and outcosts a Bloom filter as cardinality grows.
This data structure supports existence queries, but your email sounds like
you want a count. If so, it is not really the best fit.
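
To make the existence-vs-count distinction concrete, here is a minimal sketch
of a plain Bloom filter in Python. The class, its default sizes, and the sample
IDs are all made up for illustration and are not taken from any library. It
answers "possibly present / definitely absent" but does not track how many
distinct items were added.

```python
import hashlib


class BloomFilter:
    """Toy fixed-size Bloom filter (hypothetical, for illustration only)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        # Sizes are arbitrary assumptions; real filters size these from the
        # expected item count and the acceptable false-positive rate.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with the index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True means "possibly present" (false positives can happen);
        # False means "definitely absent".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


events = ["alice", "bob", "alice", "carol", "bob"]

bf = BloomFilter()
for user_id in events:
    bf.add(user_id)

# Existence query: anything that was added is always reported as present.
seen_alice = "alice" in bf

# The filter keeps no record of how many distinct items went in, so an exact
# distinct count needs a different structure (a plain set here).
exact_distinct = len(set(events))
```

Duplicates set the same bits again, so the filter's state says nothing direct
about how many distinct IDs it has seen.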
On Dec 8, 2017 5:00 PM, "Niti
Hi all,
I'm working on speeding up distinct count calculations, and it looks like
roaring bitmaps (RB) are the newest and meanest way to do set operations. Does
anyone here have experience with them? How was the performance compared to
HyperLogLog and EWAH? A quick Google search showed me that it's easier
Hi,
I’ve been struggling with this for a few hours; hopefully somebody here can
help me out.
We have a lot of data in Parquet format on S3 and we want to use Hive to query
it. I’m running on Ubuntu and we have a MySQL metadata store on AWS RDS.
The command in the hive client I’m trying to run