Thanks everyone for all the feedback that came in after the call for votes.
To Yifan's point, yes you are right, and I updated the CEP with the expressions.
There’s been a really good discussion around adding or supporting constraints
at read time. I think the point Doug made illustrates that suc
Applying constraints at read time is going to be expensive and possibly
complicated to implement, with low ROI. Therefore, my suggestion is to defer
it. If there are situations where it appears to be helpful, we can always
reconsider it.
On Tue, Jun 25, 2024 at 3:34 PM Yifan Cai wrote:
> - Alter and Drop constraints are as follows
> ALTER CONSTRAINT [name] CHECK new_condition
> DROP CONSTRAINT [name]
>
I think you mean the following syntax to modify existing constraints, since
constraints are part of the table definition.
ALTER TABLE [keyspace_name.]table_name ALTER CONSTRAINT [
I wonder how often it is that users will apply the constraints on tables
with data while they know their data is probably not compliant with the
constraint configuration. I humbly think that people are aware of this in
advance and what usually happens is that there is some kind of a job which
conso
Abe, that's a good point. We need to call out distinct use-cases here. When
a fresh cluster is set up with constraints, we don't have any issues because
the data written and read back is going to be compliant with the
constraint(s). For existing data in a cluster where new constraints are
applied or e
On the Analytics side, as long as the CQLSSTableWriter understands and enforces
the constraints (which it should be able to, given we provide the table
schema) we should be good to go. We should try hard to avoid scanning the data
on import, as the Analytics library does a bunch of things to pu
If we're going to introduce a feature that looks like SQL constraints, we
should make sure it's "reasonably" compliant. In particular, we should avoid
situations where a user creates a constraint, writes some data, then reads data
that violates that constraint, unless they've expressed that viol
On Tue, Jun 25, 2024 at 10:59 AM Josh McKenzie wrote:
>
> My intuition is the vote got called a *smidge* early but that things are
> very much moving in the right direction and are very close.
>
Agreed and the vote thread got us more feedback which is valuable :)
> I was referring to the name guardrail, using the same infra as guardrails
Curious if there's a subtle distinction implicit in this (or just in my
brain...). A guardrail is something one person puts in place for someone else -
in our case operators to users. Constraints are something inherent to
Hi Ariel,
Your suggestions make sense, and I’ll be updating the CEP with the details.
Basically:
- We have an optional name for the constraints. If the name is not provided, a
random name is generated for a constraint:
CREATE TABLE keyspace.table (
p1 int,
p2 int,
...,
CONSTRAINT [name]
+1 on Doug's suggestion. The operator sets a limit that application
developers should not be allowed to violate. This is precisely the type of
safety that we should strive for.
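As a rough illustration of an operator-set limit that a developer-declared constraint may not exceed, here is a minimal Python sketch. The names GUARDRAIL_MAX_VALUE_BYTES and validate_size_constraint are hypothetical and are not part of Cassandra or the CEP; the sketch assumes the check runs when the constraint is declared.

```python
# Hypothetical sketch: an operator-set guardrail caps what a table constraint may allow.
# None of these names come from Cassandra or CEP-42.

GUARDRAIL_MAX_VALUE_BYTES = 1024  # assumed operator-configured ceiling


def validate_size_constraint(requested_max_bytes: int,
                             guardrail_max_bytes: int = GUARDRAIL_MAX_VALUE_BYTES) -> int:
    """Accept a developer-declared size limit only if it stays within the guardrail."""
    if requested_max_bytes <= 0:
        raise ValueError("size limit must be positive")
    if requested_max_bytes > guardrail_max_bytes:
        raise ValueError(
            f"constraint allows {requested_max_bytes} bytes, "
            f"guardrail permits at most {guardrail_max_bytes}"
        )
    return requested_max_bytes
```

A declared limit of 512 bytes would be accepted here, while 2048 would be rejected because it exceeds the assumed guardrail.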
To Jordan's point, I also agree that the read before write type of
constraints should be avoided but if there is a very g
Hi,
I am also +1 on Doug's distinction between things that can be managed by
operators and things that can be managed by applications.
One thing to note about the syntax is that there are parens around the
condition in SQL. In your example there are multiple anonymous constraints on
the same
Got it. Thanks for the clarification, Jon. Then, in terms of syntax, I think we
can discard option 2.
As for the GUARDRAIL vs. CONSTRAINT question you bring up, I guess we have
pros and cons on both sides. It is true that there is an existing concept of
GUARDRAIL in Cassandra, and that
I think my suggestion was unclear. I was referring to the name guardrail,
using the same infra as guardrails, rather than a separate concept. Not
applying it like we do table options.
On Tue, Jun 25, 2024 at 12:44 AM Bernardo Botella <
conta...@bernardobotella.com> wrote:
> Hi Ariel and Jon,
>
Hi Ariel and Jon,
Let me address your question first. Yes, AND is supported in the proposal.
Below you can find some examples of different constraints applied to the same
column.
As for using the name LENGTH instead of sizeOf as in the proposal, I am also not
opposed to it if it is more consistent w
Hi,
I see a vote for this has been called. I should have provided feedback
sooner.
I am a strong +1 on column-level constraints being a good thing to add.
I'm not too concerned about row/partition/table level constraints, but I would
like to change the syntax before I would
I love where this is going. I have one question, however. I think it would
be more consistent if these were table level guardrails. Is there anything
that prevents us from utilizing the same underlying system and terminology
for both the node level guardrails and the table ones?
If we can avoid
To your point about Guardrails vs. Constraints, I do think the distinct roles
of “cluster operator” and “application developer” help show how these two
frameworks are both valuable. I don’t think I’d expect a cluster operator to be
involved in every table design decision, but being able to set w
Thanks for the comments Jordan.
Completely agreed that we will need to be careful not to accept constraints
that require a read before a write. It is called out in the CEP itself, and
will have to be enforced in the future.
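A constraint that avoids a read before a write is one that depends only on the values carried by the write itself. A minimal Python sketch of that idea follows, using conditions of the kind discussed in this thread (r >= 0, r < 255). The violated helper and the triple-based representation are assumptions made for illustration, not the CEP's design.

```python
import operator
from typing import Any, Callable, Dict, List, Tuple

# (column, comparison, bound) triples; a hypothetical representation of simple conditions
Constraint = Tuple[str, Callable[[Any, Any], bool], Any]


def violated(row: Dict[str, Any], constraints: List[Constraint]) -> List[str]:
    """Return the columns in the incoming row that break a constraint.

    Only values present in the write are examined, so no read is needed.
    """
    bad = []
    for column, compare, bound in constraints:
        if column in row and not compare(row[column], bound):
            bad.append(column)
    return bad


# Conditions equivalent to r >= 0 and r < 255
r_constraints = [("r", operator.ge, 0), ("r", operator.lt, 255)]
```

Columns absent from the write are skipped in this sketch; whether a missing column should count as a violation is a separate design decision.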
After all the feedback and discussion, I think we are ready to move
I am generally for this CEP, particularly the sizeOf guardrail. For
example, we recently had an incident caused by a client who wrote outside
of the contract we had verbally established. The constraint would have let
us encode that contract into the database. In this case, clients are
writing large
Thanks a lot for your comments Abe!
I do agree that the Constraint clause should be as simple as possible. I will
add a note on the CEP along with some specifics about the proposed constraints
(removing the ones that are contentious, and adding them to a possible future
additions section). And
I've thought about this some more. It would be useful for Cassandra to support
user-defined "guardrails" (or constraints, whatever you want to call them),
that could be applied per keyspace or table. Whether a user or an operator is
considered the owner of a table depends on the organization dep
I think having JSON validation on existing text fields is a pretty
reasonable idea, regardless of whether we have a JSON type. I could see
folks wanting to add a JSON constraint to an existing text field, for
example.
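To make the idea concrete, a JSON check on a text value needs only the value being written. The following is a minimal Python sketch using the standard library; is_valid_json is a hypothetical helper name, not something defined by Cassandra or the CEP.

```python
import json


def is_valid_json(text: str) -> bool:
    """Return True if text parses as JSON; only the written value is inspected."""
    try:
        json.loads(text)
    except (TypeError, ValueError):
        # json.JSONDecodeError is a subclass of ValueError
        return False
    return True
```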
I like the idea of a postgres-style JSONB type, but I don't want to derail
this
Hey Bernardo,
Thanks for the proposal and putting together your summary of the discussion. A
few thoughts:
I'm not completely convinced of the value of CONSTRAINTS for a database like
Cassandra, which doesn't support any referential integrity checks, doesn't do
read-before-write for all querie
Hi again,
I completely agree that anything beyond simple poses a problem. My point is
that the definition of simple may vary, and each of those constraints I
mentioned deserves a conversation on its own. As I previously mentioned on the
dev thread:
https://lists.apache.org/thread/qln8cbkhlw9j95
My gut feeling is that anything beyond simple comparisons is just too
problematic / complex. I think that this should be part of the application
logic rather than putting that to the database. Is there any major database
out there which has constraints modelled like that? (belongsToEnum,
isNotBlock
>
> 2)
> Is part of an enum is somehow supplying the lack of enum types. Constraint
> could be something like CONSTRAINT belongsToEnum([list of valid values],
> field):
> CREATE TABLE keyspace.table (
> field text CONSTRAINT belongsToEnum(['foo', 'foo2'], field),
> ...
> );
> 3)
> Similarly, we
Hi Štefan,
I'll address the different points:
1)
An example (possibly a stretch) of a use case for the != constraint would be:
Let's say you have a table in which you want to record a movement, from
position p1 to position p2. You may want to check that those two are different
to make sure there is ac
Hi Bernardo,
1) Could you elaborate on these two constraints?
== and != ?
What is the use case? Why would I want to have data in a database stored in
some column which would need to be _same as my constraint_ and which _could
not_ be same as my constraint? Can you give me at least one example of
Hi everyone,
After the feedback, I'd like to make a recap of what we have discussed in this
thread and try to move forward with the conversation.
I made some clarifications:
- Constraints are only applied at write time.
- Guardrail configurations should take precedence over what's being defi
The way I see it, 5.1 will ship TCM for the very first time, and I
do not think that config in TCM will make it into 5.1, based on what Sam
describes (need for some stability, etc.), which makes total sense to me.
TCM is quite a big feature to deliver on its own and putting even way more
st
We've been working on a draft CEP for migrating config from yaml to cluster
metadata but have been a bit short of time recently, I'll try to get something
out for discussion as soon as possible.
A little delay isn't such a bad thing IMO, as we're still ironing out the kinks
in the TCM implement
Yes, all configuration should be transactional (configuration which makes
sense to require to be the same cluster-wide). Guardrails in TCM are just a
subset of this problem. When I started to do CEP-24 I started with
guardrails in TCM but then I realized it leads to more general "all config
in TCM"
There’s a difference between the two though. Constraints are part of the table
schema, and (independent of the interaction with Guardrails), have no
dependency on yaml files being perfectly in sync across the cluster. Therefore,
the feature (Constraints) on its own doesn’t depend on configuratio
My concern about mentioning other potential constraints to be implemented in
the future in the CEP is that it may derail the conversation from the set of initial
ones I want to propose, which are size and value constraints. There are
definitely a lot of other potential constraints that we could discus
On Thu, Jun 6, 2024 at 1:50 PM Bernardo Botella <
conta...@bernardobotella.com> wrote:
> I will update the CEP to be specific about the two Constraint
> types I will be adding, which are size and value (the ones shown in the
> example).
>
Could you identify constraints for the most common
On Thu, Jun 6, 2024 at 1:03 PM Štefan Miklošovič <
stefan.mikloso...@gmail.com> wrote:
> It is interesting to see this feedback. When I look at CEP-24 where I am
> obsessing about a user being able to misconfigure the password validation
> strength so if a user hits a "weak" node then she would be
il on one node but
>>>>> you don’t set it (or set it differently) on the other? If it is
>>>>> configured differently and you want to check the guardrails if
>>>>> constraints do not violate them, then your query might fail or not based
>>>>>>> CONSTRAINT r >= 0,
>>>>>>> CONSTRAINT r < 255,
>>>>>>> CONSTRAINT g >= 0,
>>>>>>> CONSTRAINT g < 255,
>>>>>>> CONSTRAINT b >= 0,
>>>>>>> CONS
>>>>>> Other types of constraints and functions can be added in the future
>>>>>> to provide even more flexibility, but are out of the scope of this CEP.
>>>>>>
>>>>>> Bernardo
>>>>>>
>>>>>> On Jun 4, 2024, at 1:01 PM, Jon Haddad wrote
>>>>>>>>> Cc: Miklosovic, Stefan <stefan.mikloso...@netapp.com>
>>>>>>>>> Subject: Re: [DISCUSS] CEP-42: Constraints Framework
the value they provide on their own.
>>>>>
>>>>> I think it would help a lot if we knew what types of constraints,
>>>>> besides the size check, you were thinking of adding.
>>>>>
>>>>> Jon
>>>>>
, Jun 3, 2024 at 5:27 PM Bernardo Botella <
>>>> conta...@bernardobotella.com> wrote:
>>>>
>>>>> Yes, that is correct. This particular behavior will need CEP-24 in
>>>>> order to work reliably. But, if my understanding is correct, that
ardrails, and not only for this particular
>>>> feature.
>>>>
>>>> On Jun 3, 2024, at 3:54 PM, Miklosovic, Stefan <
>>>> stefan.mikloso...@netapp.com> wrote:
>>>>
>>>> That would work reliably in case there is no way to miscon
tapp.com> wrote:
>>>
>>> That would work reliably in case there is no way to misconfigure
>>> guardrails in the cluster. What if you set a guardrail on one node but you
>>> don’t set it (or set it differently) on the other? If it is configured
rdrails if constraints do not
>> violate them, then your query might fail or not based on what node is hit.
>>
>> I guess that guardrails would need to start to be transactional to be
>> sure this is avoided and guardrails are indeed same everywhere (CEP-24
>> thr
t;>
>>>
>>> From: Bernardo Botella >> <mailto:conta...@bernardobotella.com>>
>>> Date: Tuesday, 4 June 2024 at 00:31
>>> To: dev@cassandra.apache.org <mailto:dev@cassandra.apache.org>
>>> mailto:dev@cassandr
is your motivation to do it like you suggested?
>
> From: Bernardo Botella <conta...@bernardobotella.com>
> Date: Friday, 31 May 2024 at 23:24
> To: dev@cassandra.apache.org
> Subject: [DISC
@cassandra.apache.org
Subject: [DISCUSS] CEP-42: Constraints Framework
Hello everyone,
I a
Hi Jeff,
Thanks a lot for your comments.
To your first question, "Would this be implemented solely in the write path?",
the answer is yes. I think enforcing it at reads/compaction/repairs may pose
problems for cases in which an ALTER TABLE is performed that adds new or
stricter constraints to
Separately, when we discuss benefits of a proposal in a CEP, we should talk about what’s concrete and ignore the stuff that’s idealistic. Of these four points:
This brings to the table several benefits and flexibility. Some examples:
Cassandra operators have more control to reason about your data and
Would this be implemented solely in the write path? Or would you also try to enforce it in the read and sstable/compaction/repair paths as well?
On May 31, 2024, at 23:24, Bernardo Botella wrote:
Hello everyone,
I am proposing this CEP:
CEP-42: Constraints Framework - CASSANDRA - Apache Software Fo
Hello everyone,
I am proposing this CEP:
https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-42%3A+Constraints+Framework
And I’m looking for feedback from the community.
Thanks a lot!
Bernardo