Hi Dan,

Which Flink version are you using? I know that there has been quite a bit of optimization of deduplication in 1.12, which would reduce the required state tremendously.

I'm pulling in Jark who knows more.
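For reference, the pattern described in the quoted message (deduplicate first, then interval join) might look roughly like the sketch below. This is only an illustration: the table names `user_log` and `events`, the view name `dedup_user`, the column `event_id`, and the one-hour window are all made up, and only `log_user_id`, `user_id`, `platform_id`, and `ts` come from the error message. It also assumes `ts` is declared as a time attribute on both tables. Whether the planner accepts this depends on the Flink version and on whether the deduplicated result is append-only, so treat it as a starting point rather than a confirmed fix.

```sql
-- Hypothetical source tables: user_log (the side with duplicate rows) and events.
-- Assumption: ts is a time attribute in both tables. If it is not, the planner
-- treats the ROW_NUMBER() query as a general Top-N (a Rank node) rather than
-- as deduplication.

-- Keep the first row per log_user_id, following the documented
-- deduplication pattern (ROW_NUMBER() ... WHERE row_num = 1).
CREATE TEMPORARY VIEW dedup_user AS
SELECT platform_id, user_id, log_user_id, ts
FROM (
  SELECT platform_id, user_id, log_user_id, ts,
    ROW_NUMBER() OVER (PARTITION BY log_user_id ORDER BY ts ASC) AS row_num
  FROM user_log
)
WHERE row_num = 1;

-- Interval join against the deduplicated view.
-- The one-hour bound is an arbitrary example value.
SELECT e.event_id, u.user_id, u.platform_id
FROM events AS e
JOIN dedup_user AS u
  ON e.log_user_id = u.log_user_id
 AND e.ts BETWEEN u.ts AND u.ts + INTERVAL '1' HOUR;
```

The interval join only consumes append-only input, so the key question is whether the deduplication step emits updates. Ordering `ASC` (keep first row) is the variant more likely to stay append-only, since a first row never needs to be retracted once emitted; ordering `DESC` (keep last row) has to update its result whenever a newer row arrives.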
On Thu, Dec 31, 2020 at 6:54 AM Dan Hill <quietgol...@gmail.com> wrote:

> Hi!
>
> I'm using Flink SQL to do an interval join. Rows in one of the tables are
> not unique. I'm fine using either the first or last row. When I try to
> deduplicate
> <https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/table/sql/queries.html#deduplication>
> and then interval join, I get the following error.
>
> IntervalJoin doesn't support consuming update and delete changes which is
> produced by node Rank(strategy=[UndefinedStrategy], rankType=[ROW_NUMBER],
> rankRange=[rankStart=1, rankEnd=1], partitionBy=[log_user_id], orderBy=[ts
> ASC], select=[platform_id, user_id, log_user_id, client_log_ts,
> event_api_ts, ts])
>
> Is there a way to combine these in this order? I could do the
> deduplication afterwards but this will result in more state.
>
> - Dan

--
Arvid Heise | Senior Java Developer