Thank you for your reply. I actually found your blog post regarding this
topic and browsed through it, but it did not yield the answer I was looking
for. In fact, it seems impossible to do what I wish to do without defining
a UDA for this specific use case -- something that is not practical to do
when all of my queries use 'group by'.

For example, I ave a query like this:

select sum(a), avg(a), min(a), max(a), MY_UDF(my_set_column) from my_table
group by a;

I would hope that using a UDF for my_set_column would allow me to combine
all of the my_set_columns passed in via group by, but I cannot pass state
to the UDF. A UDA can accept state, but that would require me rewriting the
whole query to be:

select MY_UDA(a, my_set_column) from my_table;

Additionally, I would need a separate UDA for each of the different group
by clauses. Is there no way around this? I would really like to be able to
simply add a data column of type set<bigint> and then get all of the unique
members in this set across an aggregation.

On Tue, Oct 23, 2018 at 1:44 PM DuyHai Doan <doanduy...@gmail.com> wrote:

> You will need to use user defined aggregates for this
>
> Le 23 oct. 2018 16:46, "Joseph Wonesh" <joseph.won...@sticknfind.com> a
> écrit :
>
>> Hello all,
>>
>>  I am trying to aggregate rows which each contain a column of
>> Set<BIGINT>. I would like the result to contain the sum of all sets, where
>> null would be equivalent to the empty set. I expected a query like: "select
>> sum(my_set_column) from my_table group by my_key_column" to do this, but
>> the set type is not supported by this aggregate. Does anyone know of a way
>> to aggregate this using existing cassandra built-ins? Thanks!
>>
>> This message is private and confidential. If you have received message in
>> error, please notify us and remove from your system.
>
>

-- 
This message is private and confidential. If you have received message in 
error, please notify us and remove from your system. 

Reply via email to