Re: Constraint's "not null" alignment with transactions and their simplification

Bernardo Botella Fri, 11 Apr 2025 08:49:35 -0700

Benedict:

An alternative for that, keeping the CHECK word, would be to change the 
constraint name to IS_JSON. CHECK IS_JSON would read as you intend without the 
need to jump to REQUIRE. I think that’s true for the rest of provided 
constraints as well.


Bernardo


> On Apr 11, 2025, at 6:02 AM, Benedict <bened...@apache.org> wrote:
> 
> We have taken a different approach though, as we do not actually take a 
> predicate on the RHS and do not supply the column name. In our examples we 
> had eg CHECK JSON, which doesn’t parse unambiguously to a human. The 
> equivalent to Postgres would seem to be CHECK is_json(field).
> 
> I’m all for following an existing example, but once we decide to diverge the 
> justification is gone and we should decide holistically what we think is 
> best. So if we want to elide the column entirely and have a list of built in 
> restrictions, I’d prefer eg REQUIRE JSON since this parses unambiguously to a 
> human, whereas if we want to follow Postgres let’s do that but do it but that 
> means eg CHECK is_json(field).
> 
>> On 11 Apr 2025, at 10:57, Štefan Miklošovič <smikloso...@apache.org> wrote:
>> 
>> 
>> While modelling that, we followed how it is done in SQL world, PostgreSQL as 
>> well as MySQL both use CHECK.
>> 
>> https://www.postgresql.org/docs/current/ddl-constraints.html#DDL-CONSTRAINTS-CHECK-CONSTRAINTS
>> https://dev.mysql.com/doc/refman/8.4/en/create-table-check-constraints.html
>> 
>> On Fri, Apr 11, 2025 at 10:43 AM Benedict <bened...@apache.org 
>> <mailto:bened...@apache.org>> wrote:
>>> I would prefer require/expect/is over check
>>> 
>>>> On 11 Apr 2025, at 08:05, Štefan Miklošovič <smikloso...@apache.org 
>>>> <mailto:smikloso...@apache.org>> wrote:
>>>> 
>>>> 
>>>> Yes, you will have it like that :) Thank you for this idea. Great example 
>>>> of cooperation over diverse domains.
>>>> 
>>>> On Fri, Apr 11, 2025 at 12:29 AM David Capwell <dcapw...@apple.com 
>>>> <mailto:dcapw...@apple.com>> wrote:
>>>>> I am biased but I do prefer
>>>>> 
>>>>> val3 text CHECK NOT NULL AND JSON AND LENGTH() < 1024
>>>>> 
>>>>> Here is a similar accord CQL
>>>>> 
>>>>> BEGIN TRANSACTION
>>>>>   LET a = (…);
>>>>>   IF a IS NOT NULL 
>>>>>       AND a.b IS NOT NULL 
>>>>>       AND a.c IS NULL; THEN
>>>>>     — profit
>>>>>   END IF
>>>>> COMMIT TRANSACTION
>>>>> 
>>>>>> On Apr 10, 2025, at 8:46 AM, Yifan Cai <yc25c...@gmail.com 
>>>>>> <mailto:yc25c...@gmail.com>> wrote:
>>>>>> 
>>>>>> Re: reserved keywords, “check” is currently not, and I don’t think it 
>>>>>> needs to be a reserved keyword with the proposal.
>>>>>> 
>>>>>> From: C. Scott Andreas <sc...@paradoxica.net 
>>>>>> <mailto:sc...@paradoxica.net>>
>>>>>> Sent: Thursday, April 10, 2025 7:59:35 AM
>>>>>> To: dev@cassandra.apache.org <mailto:dev@cassandra.apache.org> 
>>>>>> <dev@cassandra.apache.org <mailto:dev@cassandra.apache.org>>
>>>>>> Cc: dev@cassandra.apache.org <mailto:dev@cassandra.apache.org> 
>>>>>> <dev@cassandra.apache.org <mailto:dev@cassandra.apache.org>>
>>>>>> Subject: Re: Constraint's "not null" alignment with transactions and 
>>>>>> their simplification
>>>>>>  
>>>>>> If the proposal does not introduce “check” as a reserved keyword that 
>>>>>> would require quoting in existing DDL/DML, this concern doesn’t apply 
>>>>>> and the email below can be ignored. This might be the case if “CHECK NOT 
>>>>>> NULL” is the full token introduced rather than “CHECK” separately from 
>>>>>> constraints that are checked.
>>>>>> 
>>>>>> If “check” is introduced as a standalone reserved keyword: my primary 
>>>>>> feedback is on the introduction of reserved words in the CQL grammar 
>>>>>> that may affect compatibility of existing schemas.
>>>>>> 
>>>>>> In the Cassandra 3.x series, several new CQL reserved words were added 
>>>>>> (more than necessary) and subsequently backed out, because it required 
>>>>>> users to begin quoting schemas and introduced incompatibility between 
>>>>>> 3.x and 4.x for queries and DDL that “just worked” before.
>>>>>> 
>>>>>> The word “check” is used in many domains (test/evaluation engineering, 
>>>>>> finance, business processes, etc) and is likely to be used in user 
>>>>>> schemas. If the proposal introduces this as a reserved word that would 
>>>>>> require it to be quoted if used in table or column names, this will 
>>>>>> create incompatibility for existing user queries on upgrade.
>>>>>> 
>>>>>> Otherwise, ignore me. :)
>>>>>> 
>>>>>> Thanks,
>>>>>> 
>>>>>> – Scott
>>>>>> 
>>>>>> –––
>>>>>> Mobile
>>>>>> 
>>>>>>> On Apr 10, 2025, at 7:47 AM, Jon Haddad <j...@rustyrazorblade.com 
>>>>>>> <mailto:j...@rustyrazorblade.com>> wrote:
>>>>>>> 
>>>>>>> 
>>>>>>> This looks like a really nice improvement to me. 
>>>>>>> 
>>>>>>> 
>>>>>>> On Thu, Apr 10, 2025 at 7:27 AM Štefan Miklošovič 
>>>>>>> <smikloso...@apache.org <mailto:smikloso...@apache.org>> wrote:
>>>>>>> Recently, David Capwell was commenting on constraints in one of Slack 
>>>>>>> threads (1) in dev channel and he suggested that the current form of 
>>>>>>> "not null" constraint we have right now in place, e.g like this
>>>>>>> 
>>>>>>> create table ks.tb (id int primary key, val int check not_null(val));
>>>>>>> 
>>>>>>> could be instead of that form used like this:
>>>>>>> 
>>>>>>> create table ks.tb (id int primary key, val int check not null);
>>>>>>> 
>>>>>>> That is - without the name of a column in the constraint's argument. 
>>>>>>> The reasoning behind that was that it is not only easier to read but 
>>>>>>> there is also this concept in transactions (cep-15) where there is also 
>>>>>>> "not null" used in some fashion and it would be nice if this was 
>>>>>>> aligned so a user does not encounter two usages of "not null"-s which 
>>>>>>> are written down differently, syntax-wise.
>>>>>>> 
>>>>>>> Could the usage of "not null" in transactions be confirmed?
>>>>>>> 
>>>>>>> This rather innocent suggestion brought an idea to us that constraints 
>>>>>>> could be quite simplified when it comes to their syntax, consider this:
>>>>>>> 
>>>>>>> val int check not_null(val)
>>>>>>> val text check json(val)
>>>>>>> val text check lenght(val) < 1000
>>>>>>> 
>>>>>>> to be used like this:
>>>>>>> 
>>>>>>> val int check not null
>>>>>>> val text check json
>>>>>>> val text check length() < 1000
>>>>>>> 
>>>>>>> more involved checks like this:
>>>>>>> 
>>>>>>> val text check not_null(val) and json(val) and length(val) < 1000
>>>>>>> 
>>>>>>> might be just simplified to:
>>>>>>> 
>>>>>>> val text check not null and json and length() < 1000
>>>>>>> 
>>>>>>> It almost reads like plain English. Isn't this just easier for an eye?
>>>>>>> 
>>>>>>> The reason we kept the column names in constraint definitions is that, 
>>>>>>> frankly speaking, we just did not know any better at the time it was 
>>>>>>> about to be implemented. It is a little bit more tricky to be able to 
>>>>>>> use it without column names because in Parser.g / Antlr we just bound 
>>>>>>> the grammar around constraints to a column name directly there. When 
>>>>>>> column names are not going to be there anymore, we need to bind it 
>>>>>>> later in the code behind the parser in server code. It is doable, it 
>>>>>>> was just about being a little bit more involved there.
>>>>>>> 
>>>>>>> Also, one reason to keep the name of a column was that we might specify 
>>>>>>> different columns in a constraint from a column that is defined on to 
>>>>>>> have cross-column constraints but we abandoned this idea altogether for 
>>>>>>> other reasons which rendered the occurrence of a column name in a 
>>>>>>> constraint definition redundant.
>>>>>>> 
>>>>>>> To have some overview of what would be possible to do with this 
>>>>>>> proposal:
>>>>>>> 
>>>>>>> val3 text CHECK SOMECONSTRAINT('a');
>>>>>>> val3 text CHECK JSON;
>>>>>>> val3 text CHECK SOMECONSTRAINT('a') > 1;
>>>>>>> val3 text CHECK SOMECONSTRAINT('a', 'b', 'c') > 1;
>>>>>>> val3 text CHECK JSON AND LENGTH() < 600;
>>>>>>> afternoon time CHECK afternoon >= '12:00:00' AND afternoon =< 
>>>>>>> '23:59:59';
>>>>>>> val3 text CHECK NOT NULL AND JSON AND LENGTH() < 1024
>>>>>>> 
>>>>>>> In addition to the specification of constraints without columns, what 
>>>>>>> would be possible to do is to also specify arguments to constraints. It 
>>>>>>> is currently not possible and there is no constraint which would accept 
>>>>>>> arguments to its function but I think that to be as flexible as 
>>>>>>> possible and prepare for the future, we might implement it as well. 
>>>>>>> 
>>>>>>> Constraints in their current form are already usable however I just 
>>>>>>> think that if we do not simplify, align and extend the syntax right 
>>>>>>> now, before it is baked in in a release, then we will never do it as it 
>>>>>>> will be quite tricky to extend this without breaking it and maintaining 
>>>>>>> two grammars at the same time would be very complex if not flat out 
>>>>>>> impossible.
>>>>>>> 
>>>>>>> Are you open to the simplification of constraint definitions as 
>>>>>>> suggested and what is your feedback about that? I already have a 
>>>>>>> working POC which just needs to be polished and tests fixed to 
>>>>>>> accommodate the new approach.
>>>>>>> 
>>>>>>> Regards
>>>>>>> 
>>>>>>> (1) https://the-asf.slack.com/archives/CK23JSY2K/p1742409054164389
>>>>>

Re: Constraint's "not null" alignment with transactions and their simplification

Reply via email to