I took an initial look at the PRs this morning, and I’ll go through the design doc in more detail, but I think these features look great. With the CA regulation changes, it’s especially important to make this easier for folks to implement.

On Thu, Jun 24, 2021 at 4:54 PM Anton Okolnychyi <aokolnyc...@gmail.com> wrote:

> Hey everyone,
>
> I'd like to start a discussion on adding support for executing row-level
> operations such as DELETE, UPDATE, MERGE for v2 tables (SPARK-35801). The
> execution should be the same across data sources and the best way to do
> that is to implement it in Spark.
>
> Right now, Spark can only parse and to some extent analyze DELETE, UPDATE,
> MERGE commands. Data sources that support row-level changes have to build
> custom Spark extensions to execute such statements. The goal of this effort
> is to come up with a flexible and easy-to-use API that will work across
> data sources.
>
> Design doc:
> https://docs.google.com/document/d/12Ywmc47j3l2WF4anG5vL4qlrhT2OKigb7_EbIKhxg60/
>
> PR for handling DELETE statements:
> https://github.com/apache/spark/pull/33008
>
> Any feedback is more than welcome.
>
> Liang-Chi was kind enough to shepherd this effort. Thanks!
>
> - Anton

--
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9
YouTube Live Streams: https://www.youtube.com/user/holdenkarau