Hi Dan,

Which Flink version are you using? As far as I know, deduplication was
optimized quite a bit in 1.12, which should reduce the required state
considerably.
I'm pulling in Jark, who knows more.
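
For reference, the deduplication pattern from the linked docs looks roughly
like the sketch below. The table name `events` is hypothetical; the column
names and the partition/order keys are taken from the error message quoted
further down, so the actual query may differ.

```sql
-- Sketch only: `events` is a hypothetical table name; columns are taken
-- from the error message in the quoted mail.
SELECT platform_id, user_id, log_user_id, client_log_ts, event_api_ts, ts
FROM (
  SELECT *,
    ROW_NUMBER() OVER (
      PARTITION BY log_user_id  -- deduplication key
      ORDER BY ts ASC           -- ASC keeps the first row, DESC the last
    ) AS row_num
  FROM events
)
WHERE row_num = 1;
```

The result of this query would then be the input to the interval join.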

On Thu, Dec 31, 2020 at 6:54 AM Dan Hill <quietgol...@gmail.com> wrote:

> Hi!
>
> I'm using Flink SQL to do an interval join.  Rows in one of the tables are
> not unique.  I'm fine using either the first or last row.  When I try to
> deduplicate
> <https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/table/sql/queries.html#deduplication>
> and then interval join, I get the following error.
>
> IntervalJoin doesn't support consuming update and delete changes which is
> produced by node Rank(strategy=[UndefinedStrategy], rankType=[ROW_NUMBER],
> rankRange=[rankStart=1, rankEnd=1], partitionBy=[log_user_id], orderBy=[ts
> ASC], select=[platform_id, user_id, log_user_id, client_log_ts,
> event_api_ts, ts])
>
> Is there a way to combine these in this order?  I could do the
> deduplication afterwards but this will result in more state.
>
> - Dan
>


-- 

Arvid Heise | Senior Java Developer

