You mean this does not work?

SELECT key, count(value) from table group by key

On Sun, Jul 19, 2015 at 2:28 PM, N B <nb.nos...@gmail.com> wrote:
> Hello,
>
> How do I go about performing the equivalent of the following SQL clause in
> Spark Streaming? I will be using this on a Windowed DStream.
>
> SELECT key, count(distinct(value)) from table group by key;
>
> so for example, given the following dataset in the table:
>
>  key | value
> -----+-------
>  k1  | v1
>  k1  | v1
>  k1  | v2
>  k1  | v3
>  k1  | v3
>  k2  | vv1
>  k2  | vv1
>  k2  | vv2
>  k2  | vv2
>  k2  | vv2
>  k3  | vvv1
>  k3  | vvv1
>
> the result will be:
>
>  key | count
> -----+-------
>  k1  | 3
>  k2  | 2
>  k3  | 1
>
> Thanks
> Nikunj
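
For reference, a plain count(value) also counts the duplicates, so on the sample data above it would give 5, 5 and 2 rather than 3, 2 and 1. One way to think about count(distinct(value)) is as two steps: first drop duplicate (key, value) pairs, then count what is left per key. Below is a minimal sketch of that logic in plain Python, without Spark, just to show the semantics. The function name count_distinct_per_key and the rows list are made up for illustration. On a windowed DStream the same idea would roughly be a distinct() on each window's RDD (for example inside transform), followed by mapping each pair to (key, 1) and a reduceByKey, but treat that as an untested outline rather than a verified recipe.

```python
from collections import Counter


def count_distinct_per_key(pairs):
    """Return {key: number of distinct values} for (key, value) pairs.

    Hypothetical helper, for illustration only.
    """
    # Step 1: drop duplicate (key, value) pairs.
    unique_pairs = set(pairs)
    # Step 2: count the remaining pairs per key.
    return dict(Counter(key for key, _ in unique_pairs))


# Sample data copied from the question.
rows = [
    ("k1", "v1"), ("k1", "v1"), ("k1", "v2"), ("k1", "v3"), ("k1", "v3"),
    ("k2", "vv1"), ("k2", "vv1"), ("k2", "vv2"), ("k2", "vv2"), ("k2", "vv2"),
    ("k3", "vvv1"), ("k3", "vvv1"),
]

result = count_distinct_per_key(rows)
print(sorted(result.items()))  # [('k1', 3), ('k2', 2), ('k3', 1)]
```

The deduplication has to happen on the whole (key, value) pair, not on the value alone, otherwise the same value appearing under two different keys would be counted only once.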