Re: Count(distinct) not working in beam sql

2023-11-11 Thread Talat Uyarer via user
Hi, I saw this a little late. I implemented a custom count distinct for our streaming use case. If you are looking for something close but not exact, you can use my UDF. It uses the HyperLogLogPlus algorithm, which is an efficient and scalable way to estimate cardinality with a controlled
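
For readers unfamiliar with the approach named in the message above, here is a minimal sketch of approximate distinct counting. It is not the UDF from this thread, and it implements plain HyperLogLog rather than HyperLogLogPlus (which adds a sparse representation and extra bias correction). The class name `HyperLogLog`, the precision parameter `p`, and the choice of SHA-1 as the hash are assumptions made for illustration only.

```python
import hashlib
import math


class HyperLogLog:
    """Illustrative HyperLogLog estimator (hypothetical helper, not a Beam API)."""

    def __init__(self, p=12):
        self.p = p                       # number of bits used to pick a register
        self.m = 1 << p                  # number of registers
        self.registers = [0] * self.m
        # Bias-correction constant from the original HyperLogLog paper (m >= 128).
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, value):
        # Take a 64-bit hash from the first 8 bytes of SHA-1 (assumed hash choice).
        digest = hashlib.sha1(str(value).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - self.p)                   # top p bits select the register
        rest = h & ((1 << (64 - self.p)) - 1)      # remaining 64 - p bits
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def merge(self, other):
        # Two sketches with the same p combine by taking the register-wise maximum.
        self.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]

    def estimate(self):
        raw = self.alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            return self.m * math.log(self.m / zeros)
        return raw


if __name__ == "__main__":
    hll = HyperLogLog()
    for i in range(10000):
        hll.add(f"user-{i}")
        hll.add(f"user-{i}")   # duplicates do not change the sketch
    print(round(hll.estimate()))
```

With `p=12` the sketch keeps 4096 small registers, and the expected relative error is roughly 1.04 / sqrt(4096), or about 1.6%. Because `merge` is associative and commutative, a sketch like this fits the accumulate-and-merge shape that distributed aggregations need, which is presumably why this family of algorithms suits streaming use cases.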

Re: Count(distinct) not working in beam sql

2023-11-03 Thread Alexey Romanenko
Unfortunately, Beam SQL doesn’t support COUNT(DISTINCT) aggregation. The reasons are explained in this discussion [1], and the related open issue is here [2]. -- Alexey [1] https://lists.apache.org/thread/hvmy6d5dls3m8xcnf74hfmy1xxfgj2xh [2] https://github.com/apache/beam/issues/19398