+1 (non-binding)

Thanks for working on this, Anton! Here are links to other engines that did 
something similar:

HIVE-13076 - https://issues.apache.org/jira/browse/HIVE-13076
IMPALA-3531 - https://issues.apache.org/jira/browse/IMPALA-3531

In fact, Spark itself had a very old Jira for this:
SPARK-19842 - https://issues.apache.org/jira/browse/SPARK-19842

Thanks
Anurag Mantripragada


> On Mar 22, 2025, at 5:36 AM, Yuming Wang <yumw...@apache.org> wrote:
> 
> +1
> 
> On Sat, Mar 22, 2025 at 7:01 PM Peter Toth <peter.t...@gmail.com> wrote:
>> +1
>> 
>> On Fri, Mar 21, 2025 at 10:24 PM Szehon Ho <szehon.apa...@gmail.com> wrote:
>>> +1 (non-binding)
>>> 
>>> Agree with Anton: data sources like the open table formats define the 
>>> requirement and definitely need engines to write to them accordingly.
>>> 
>>> Thanks,
>>> Szehon
>>> 
>>> On Fri, Mar 21, 2025 at 1:31 PM Anton Okolnychyi <aokolnyc...@gmail.com> wrote:
>>>>> -1 (non-binding): Breaks the Chain of Responsibility. Constraints should 
>>>>> be defined and enforced by the data sources themselves, not Spark. Spark 
>>>>> is a processing engine, and enforcing constraints at this level blurs 
>>>>> architectural boundaries, making Spark responsible for something it does 
>>>>> not control.
>>>> 
>>>> I disagree that this breaks the chain of responsibility. It may be quite 
>>>> the opposite, in fact. Spark is already responsible for enforcing NOT NULL 
>>>> constraints by adding AssertNotNull for required columns today. Connectors 
>>>> like Iceberg and Delta store constraint definitions but rely on engines 
>>>> like Spark to enforce them during INSERT, DELETE, UPDATE, and MERGE 
>>>> operations. Without this API, each connector would need to reimplement the 
>>>> same logic, creating duplication.
>>>> 
>>>> The proposal is aligned with the SQL standard and other relational 
>>>> databases. In my view, it simply makes Spark a better engine, facilitates 
>>>> data accuracy and consistency, and enables performance optimizations.
>>>> 
>>>> - Anton
>>>> 
>>>> On Fri, Mar 21, 2025 at 12:59, Ángel Álvarez Pascua <angel.alvarez.pas...@gmail.com> wrote:
>>>>> -1 (non-binding): Breaks the Chain of Responsibility. Constraints should 
>>>>> be defined and enforced by the data sources themselves, not Spark. Spark 
>>>>> is a processing engine, and enforcing constraints at this level blurs 
>>>>> architectural boundaries, making Spark responsible for something it does 
>>>>> not control.
>>>>> 
>>>>> On Fri, Mar 21, 2025 at 20:18, L. C. Hsieh <vii...@gmail.com> wrote:
>>>>>> +1
>>>>>> 
>>>>>> On Fri, Mar 21, 2025 at 12:13 PM huaxin gao <huaxin.ga...@gmail.com> wrote:
>>>>>> >
>>>>>> > +1
>>>>>> >
>>>>>> > On Fri, Mar 21, 2025 at 12:08 PM Denny Lee <denny.g....@gmail.com> wrote:
>>>>>> >>
>>>>>> >> +1 (non-binding)
>>>>>> >>
>>>>>> >> On Fri, Mar 21, 2025 at 11:52 Gengliang Wang <ltn...@gmail.com> wrote:
>>>>>> >>>
>>>>>> >>> +1
>>>>>> >>>
>>>>>> >>> On Fri, Mar 21, 2025 at 11:46 AM Anton Okolnychyi <aokolnyc...@gmail.com> wrote:
>>>>>> >>>>
>>>>>> >>>> Hi all,
>>>>>> >>>>
>>>>>> >>>> I would like to start a vote on adding support for constraints to 
>>>>>> >>>> DSv2.
>>>>>> >>>>
>>>>>> >>>> Discussion thread: 
>>>>>> >>>> https://lists.apache.org/thread/njqjcryq0lot9rkbf10mtvf7d1t602bj
>>>>>> >>>> SPIP: 
>>>>>> >>>> https://docs.google.com/document/d/1EHjB4W1LjiXxsK_G7067j9pPX0y15LUF1Z5DlUPoPIo
>>>>>> >>>> PR with the API changes: https://github.com/apache/spark/pull/50253
>>>>>> >>>> JIRA: https://issues.apache.org/jira/browse/SPARK-51207
>>>>>> >>>>
>>>>>> >>>> Please vote on the SPIP for the next 72 hours:
>>>>>> >>>>
>>>>>> >>>> [ ] +1: Accept the proposal as an official SPIP
>>>>>> >>>> [ ] +0
>>>>>> >>>> [ ] -1: I don’t think this is a good idea because …
>>>>>> >>>>
>>>>>> >>>> - Anton
