+1 (non-binding)

Thanks for working on this, Anton! Some links to other engines that also did something similar:
HIVE-13076 - https://issues.apache.org/jira/browse/HIVE-13076
IMPALA-3531 - https://issues.apache.org/jira/browse/IMPALA-3531

In fact, Spark has a very old JIRA for this: SPARK-19842 - https://issues.apache.org/jira/browse/SPARK-19842

Thanks,
Anurag Mantripragada

> On Mar 22, 2025, at 5:36 AM, Yuming Wang <yumw...@apache.org> wrote:
>
> +1
>
> On Sat, Mar 22, 2025 at 7:01 PM Peter Toth <peter.t...@gmail.com> wrote:
>> +1
>>
>> On Fri, Mar 21, 2025 at 10:24 PM Szehon Ho <szehon.apa...@gmail.com> wrote:
>>> +1 (non-binding)
>>>
>>> Agree with Anton: data sources like the open table formats define the requirement, and they definitely need engines to write to them accordingly.
>>>
>>> Thanks,
>>> Szehon
>>>
>>> On Fri, Mar 21, 2025 at 1:31 PM Anton Okolnychyi <aokolnyc...@gmail.com> wrote:
>>>>> -1 (non-binding): Breaks the Chain of Responsibility. Constraints should be defined and enforced by the data sources themselves, not Spark. Spark is a processing engine, and enforcing constraints at this level blurs architectural boundaries, making Spark responsible for something it does not control.
>>>>
>>>> I disagree that this breaks the chain of responsibility. It may be quite the opposite, in fact. Spark is already responsible for enforcing NOT NULL constraints today, by adding AssertNotNull for required columns. Connectors like Iceberg and Delta store constraint definitions but rely on engines like Spark to enforce them during INSERT, DELETE, UPDATE, and MERGE operations. Without this API, each connector would need to reimplement the same logic, creating duplication.
>>>>
>>>> The proposal is aligned with the SQL standard and other relational databases. In my view, it simply makes Spark a better engine, facilitates data accuracy and consistency, and enables performance optimizations.
>>>>
>>>> - Anton
>>>>
>>>> On Fri, Mar 21, 2025 at 12:59 Ángel Álvarez Pascua <angel.alvarez.pas...@gmail.com> wrote:
>>>>> -1 (non-binding): Breaks the Chain of Responsibility. Constraints should be defined and enforced by the data sources themselves, not Spark. Spark is a processing engine, and enforcing constraints at this level blurs architectural boundaries, making Spark responsible for something it does not control.
>>>>>
>>>>> On Fri, Mar 21, 2025 at 20:18, L. C. Hsieh <vii...@gmail.com> wrote:
>>>>>> +1
>>>>>>
>>>>>> On Fri, Mar 21, 2025 at 12:13 PM huaxin gao <huaxin.ga...@gmail.com> wrote:
>>>>>> >
>>>>>> > +1
>>>>>> >
>>>>>> > On Fri, Mar 21, 2025 at 12:08 PM Denny Lee <denny.g....@gmail.com> wrote:
>>>>>> >>
>>>>>> >> +1 (non-binding)
>>>>>> >>
>>>>>> >> On Fri, Mar 21, 2025 at 11:52 Gengliang Wang <ltn...@gmail.com> wrote:
>>>>>> >>>
>>>>>> >>> +1
>>>>>> >>>
>>>>>> >>> On Fri, Mar 21, 2025 at 11:46 AM Anton Okolnychyi <aokolnyc...@gmail.com> wrote:
>>>>>> >>>>
>>>>>> >>>> Hi all,
>>>>>> >>>>
>>>>>> >>>> I would like to start a vote on adding support for constraints to DSv2.
>>>>>> >>>>
>>>>>> >>>> Discussion thread: https://lists.apache.org/thread/njqjcryq0lot9rkbf10mtvf7d1t602bj
>>>>>> >>>> SPIP: https://docs.google.com/document/d/1EHjB4W1LjiXxsK_G7067j9pPX0y15LUF1Z5DlUPoPIo
>>>>>> >>>> PR with the API changes: https://github.com/apache/spark/pull/50253
>>>>>> >>>> JIRA: https://issues.apache.org/jira/browse/SPARK-51207
>>>>>> >>>>
>>>>>> >>>> Please vote on the SPIP for the next 72 hours:
>>>>>> >>>>
>>>>>> >>>> [ ] +1: Accept the proposal as an official SPIP
>>>>>> >>>> [ ] +0
>>>>>> >>>> [ ] -1: I don’t think this is a good idea because …
>>>>>> >>>>
>>>>>> >>>> - Anton
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
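
[Editor's note] The division of responsibility Anton describes in the thread — the table format stores constraint definitions, while the engine validates rows on write, much as Spark already does with AssertNotNull for required columns — can be sketched as follows. This is a minimal, hypothetical illustration in Python; the class and function names are invented for this sketch and are not the proposed DSv2 API.

```python
# Hypothetical sketch, not the actual DSv2 API: the connector declares
# constraints, the engine enforces them once for every connector.
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Constraint:
    name: str
    predicate: Callable[[Dict], bool]  # returns True if the row satisfies it


def enforce(constraints: List[Constraint], rows: List[Dict]) -> List[Dict]:
    """Engine-side enforcement: fail the write on the first violation."""
    for row in rows:
        for c in constraints:
            if not c.predicate(row):
                raise ValueError(f"constraint '{c.name}' violated by {row}")
    return rows


# Definitions live with the table (e.g. in Iceberg or Delta metadata) ...
table_constraints = [
    Constraint("id_not_null", lambda r: r["id"] is not None),
    Constraint("qty_positive", lambda r: r["qty"] > 0),
]

# ... but enforcement happens in the engine, avoiding per-connector duplication.
valid_rows = enforce(table_constraints, [{"id": 1, "qty": 5}])
```

A write of `{"id": None, "qty": 5}` would raise a `ValueError` from `enforce`, which is the sketch's stand-in for the engine rejecting a row that violates a stored constraint.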