The SQL UNION is the cause of both problems here: (a) the table is not
append-only, and (b) there is an inner GroupBy.

The description of the UNION operator [1] says: "Any duplicate
records are automatically removed unless UNION ALL is used".
So: (1) it is definitely not an append-only operation, because results
that were already emitted need to be revised when duplicate records are
generated, and
(2) I think the Calcite optimizer is translating the entire execution into
two individual projection operations followed by an all-column GroupBy to
dedup the messages. That dedup GroupBy is non-windowed and sits in front of
your windowed GroupBy, which would explain the error message you are seeing.
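
To illustrate the dedup point, here is a minimal sketch. It uses Python's
built-in sqlite3 module rather than Flink, and the table names, columns and
rows are made up for the example (a cut-down version of your two sources,
without rowtime). It only shows the general SQL semantics: UNION returns the
same rows as UNION ALL followed by a GroupBy over all columns, while
UNION ALL on its own keeps the duplicates. It says nothing about the plan
Flink/Calcite actually produces.

```python
import sqlite3

# In-memory database with two hypothetical source tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE client_action (user_id INTEGER, msg TEXT);
    CREATE TABLE server_action (user_id INTEGER, msg TEXT);
    INSERT INTO client_action VALUES (1, 'click'), (1, 'click'), (2, 'view');
    INSERT INTO server_action VALUES (1, 'click'), (3, 'login');
""")


def rows(sql):
    """Run a query and return its rows in a stable (sorted) order."""
    return sorted(conn.execute(sql).fetchall())


# UNION: duplicate rows are removed.
union_rows = rows("""
    SELECT user_id, msg FROM client_action
    UNION
    SELECT user_id, msg FROM server_action
""")

# UNION ALL: every input row is kept, duplicates included.
union_all_rows = rows("""
    SELECT user_id, msg FROM client_action
    UNION ALL
    SELECT user_id, msg FROM server_action
""")

# UNION ALL followed by a GroupBy on all columns: same rows as UNION.
grouped_rows = rows("""
    SELECT user_id, msg FROM (
        SELECT user_id, msg FROM client_action
        UNION ALL
        SELECT user_id, msg FROM server_action
    )
    GROUP BY user_id, msg
""")

print("UNION:               ", union_rows)
print("UNION ALL:           ", union_all_rows)
print("UNION ALL + GROUP BY:", grouped_rows)
```

Whether UNION ALL is acceptable for your query depends on whether you can
live with duplicate (rowtime, user_id, msg) rows being counted more than
once, since it changes the result of count(msg).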

Thanks,
Rong

Reference:
[1] https://en.wikipedia.org/wiki/Set_operations_(SQL)#UNION_operator

On Tue, May 22, 2018 at 2:52 PM, Gregory Fee <g...@lyft.com> wrote:

> I'm trying to get a stream of data from a Table I've formed with roughly
> this SQL:
>
> SELECT
>     user_id,
>     count(msg),
>     HOP_END(rowtime, INTERVAL '1' second, INTERVAL '1' minute)
> FROM (SELECT rowtime, user_id, action_name AS msg FROM
>           event_client_action
>         WHERE /* various clauses */
>         UNION SELECT rowtime, user_id, action_type AS msg FROM
>            event_server_action
>            WHERE /* various clauses */
>       )
> GROUP BY
> HOP(rowtime, INTERVAL '1' second, INTERVAL '1' minute), user_id
>
> If I try to get an append stream it tells me the table is not append only.
> If I try to get a retract stream it tells me:
>
> Retraction on windowed GroupBy aggregation is not supported yet. Note:
> Windowed GroupBy aggregation should not follow a non-windowed GroupBy
> aggregation.
>
> Note: The same query without the union clause works just fine.
>
> The error message doesn't make sense to me because I do not think I'm
> doing a non-windowed GroupBy anywhere. Can anyone help me?
>
> --
> Gregory Fee
> Engineer
>
