Hey everyone, I'd like to start a discussion on adding support for executing row-level operations such as DELETE, UPDATE, and MERGE for v2 tables (SPARK-35801). The execution should be the same across data sources, and the best way to achieve that is to implement it in Spark.
Right now, Spark can only parse and, to some extent, analyze DELETE, UPDATE, and MERGE commands. Data sources that support row-level changes have to build custom Spark extensions to execute such statements. The goal of this effort is to come up with a flexible and easy-to-use API that will work across data sources.

Design doc: https://docs.google.com/document/d/12Ywmc47j3l2WF4anG5vL4qlrhT2OKigb7_EbIKhxg60/

PR for handling DELETE statements: https://github.com/apache/spark/pull/33008

Any feedback is more than welcome. Liang-Chi was kind enough to shepherd this effort. Thanks!

- Anton