Re: Counting distinct values for a key?

Jerry Lam Sun, 19 Jul 2015 14:29:01 -0700

You mean this does not work?

SELECT key, count(value) from table group by key
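
One thing worth checking first: count(value) and count(distinct value) give different answers on the sample data quoted below. The sketch here loads those rows into an in-memory SQLite table and runs both queries. SQLite and the table name t are assumptions made only for illustration, since the thread itself is about Spark Streaming.

```python
import sqlite3

# Sample rows copied from the table in the quoted message.
rows = [
    ("k1", "v1"), ("k1", "v1"), ("k1", "v2"), ("k1", "v3"), ("k1", "v3"),
    ("k2", "vv1"), ("k2", "vv1"), ("k2", "vv2"), ("k2", "vv2"), ("k2", "vv2"),
    ("k3", "vvv1"), ("k3", "vvv1"),
]

# "t" is a hypothetical table name; SQLite stands in for any SQL engine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (key TEXT, value TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", rows)

# Counts every non-null value per key, duplicates included.
plain = conn.execute(
    "SELECT key, count(value) FROM t GROUP BY key ORDER BY key"
).fetchall()

# Counts each distinct value once per key.
distinct = conn.execute(
    "SELECT key, count(DISTINCT value) FROM t GROUP BY key ORDER BY key"
).fetchall()

print(plain)     # [('k1', 5), ('k2', 5), ('k3', 2)]
print(distinct)  # [('k1', 3), ('k2', 2), ('k3', 1)]
```

Only the second query reproduces the result table in the original question.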




On Sun, Jul 19, 2015 at 2:28 PM, N B <nb.nos...@gmail.com> wrote:

> Hello,
>
> How do I go about performing the equivalent of the following SQL clause in
> Spark Streaming? I will be using this on a Windowed DStream.
>
> SELECT key, count(distinct(value)) from table group by key;
>
> so for example, given the following dataset in the table:
>
>  key | value
> -----+-------
>  k1  | v1
>  k1  | v1
>  k1  | v2
>  k1  | v3
>  k1  | v3
>  k2  | vv1
>  k2  | vv1
>  k2  | vv2
>  k2  | vv2
>  k2  | vv2
>  k3  | vvv1
>  k3  | vvv1
>
> the result will be:
>
>  key | count
> -----+-------
>  k1  |     3
>  k2  |     2
>  k3  |     1
>
> Thanks
> Nikunj
>
>
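
For the streaming side, one possible approach is to de-duplicate the (key, value) pairs in each window and then count what is left per key. The sketch below shows that logic in plain Python so it runs without a cluster. The function name count_distinct_per_key and the variable names are hypothetical, and the PySpark line in the comment is an untested assumption about how the same steps might be chained on a windowed DStream.

```python
from collections import Counter

def count_distinct_per_key(pairs):
    """Return {key: number of distinct values} for (key, value) pairs."""
    # Step 1: drop duplicate (key, value) pairs, similar to rdd.distinct().
    unique_pairs = set(pairs)
    # Step 2: count the remaining pairs per key, similar to mapping each
    # pair to (key, 1) and summing with reduceByKey.
    return dict(Counter(key for key, _ in unique_pairs))

# Possible PySpark analog (assumption, not tested here), where "windowed"
# is a hypothetical DStream of (key, value) tuples:
#   windowed.transform(
#       lambda rdd: rdd.distinct()
#                      .map(lambda kv: (kv[0], 1))
#                      .reduceByKey(lambda a, b: a + b))

sample = [
    ("k1", "v1"), ("k1", "v1"), ("k1", "v2"), ("k1", "v3"), ("k1", "v3"),
    ("k2", "vv1"), ("k2", "vv1"), ("k2", "vv2"), ("k2", "vv2"), ("k2", "vv2"),
    ("k3", "vvv1"), ("k3", "vvv1"),
]
result = count_distinct_per_key(sample)
print(sorted(result.items()))  # [('k1', 3), ('k2', 2), ('k3', 1)]
```

De-duplicating whole pairs first keeps the per-key state small, at the cost of a shuffle for the distinct step in a distributed setting.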
