Hello,

I'd like to write a job that will basically use EXCEPT [1] between 2 tables
that will have roughly the same size. I see in the documentation that there
are special optimizations like mini-batches for operations like joins, but
it doesn't specify if EXCEPT is basically a join that discards matches
between the 2 tables and then emits what remains (which I guess could be
optimized), or if it performs something like lookups. Could anyone clarify
what needs to be taken into account here, in particular regarding memory
and whether the data might be spilled to disk in certain cases?

[1]
https://nightlies.apache.org/flink/flink-docs-release-2.1/docs/dev/table/sql/queries/set-ops/#except

Regards,
Alexis.

Reply via email to