Re: [DISCUSS] CEP-42: Constraints Framework

2024-07-01 Thread Bernardo Botella
Thanks everyone for all the feedback that came in after the call for votes. To Yifan's point, yes you are right, and I updated the CEP with the expressions. There’s been a really good discussion around adding or supporting constraints at read time. I think the point Doug made illustrates that suc

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-29 Thread Dinesh Joshi
Applying constraints at read time is going to be expensive and possibly complicated to implement, with low ROI. Therefore my suggestion is to defer it. If there are situations where it appears to be helpful, we can always reconsider it. On Tue, Jun 25, 2024 at 3:34 PM Yifan Cai wrote: > - Alte

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-25 Thread Yifan Cai
> > - Alter and Drop constraints are as follows > ALTER CONSTRAINT [name] CHECK new_condition DROP CONSTRAINT [name] > I think you mean the following syntax to modify existing constraints, since constraints are part of the table definition. ALTER TABLE [keyspace_name.]table_name ALTER CONSTRAINT [

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-25 Thread Štefan Miklošovič
I wonder how often it is that users will apply the constraints on tables with data while they know their data is probably not compliant with the constraint configuration. I humbly think that people are aware of this in advance and what usually happens is that there is some kind of a job which conso

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-25 Thread Dinesh Joshi
Abe, that's a good point. We need to call out distinct use-cases here. When a fresh cluster is set up with constraints we don't have any issues because the data written and read back is going to be compliant with the constraint(s). For existing data in a cluster where new constraints are applied or e

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-25 Thread Doug Rohrer
On the Analytics side, as long as the CQLSSTableWriter understands and enforces the constraints (which it should be able to, given we provide the table schema) we should be good to go. We should try hard to avoid scanning the data on import, as the Analytics library does a bunch of things to pu

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-25 Thread Abe Ratnofsky
If we're going to introduce a feature that looks like SQL constraints, we should make sure it's "reasonably" compliant. In particular, we should avoid situations where a user creates a constraint, writes some data, then reads data that violates that constraint, unless they've expressed that viol

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-25 Thread Dinesh Joshi
On Tue, Jun 25, 2024 at 10:59 AM Josh McKenzie wrote: > > My intuition is the vote got called a *smidge* early but that things are > very much moving in the right direction and are very close. > Agreed and the vote thread got us more feedback which is valuable :)

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-25 Thread Josh McKenzie
> I was referring to the name guardrail, using the same infra as guardrails Curious if there's a subtle distinction implicit in this (or just in my brain...). A guardrail is something one person puts in place for someone else - in our case operators to users. Constraints are something inherent to

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-25 Thread Bernardo Botella
Hi Ariel, Your suggestions make sense, and I’ll be updating the CEP with the details. Basically: - We have an optional name for the constraints. If the name is not provided, a random name is generated for a constraint: CREATE TABLE keyspace.table ( p1 int, p2 int, ..., CONSTRAINT [name]
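
The naming rule in the message above (an optional constraint name, with a random one generated when it is omitted) could be modeled roughly as follows. This is only a sketch in Python, not the CEP's implementation: the `Constraint` class, the `register_constraint` helper, the `constraint_` prefix and the example conditions are all hypothetical.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Constraint:
    # Hypothetical model of a constraint: a name plus the condition text.
    name: str
    condition: str


def register_constraint(table: Dict[str, Constraint],
                        condition: str,
                        name: Optional[str] = None) -> Constraint:
    """Add a constraint to a table, generating a name if none is given."""
    if name is None:
        # Assumption: a generated name only needs to be unique per table.
        name = "constraint_" + uuid.uuid4().hex[:8]
    if name in table:
        raise ValueError(f"constraint {name!r} already exists")
    constraint = Constraint(name, condition)
    table[name] = constraint
    return constraint


constraints: Dict[str, Constraint] = {}
named = register_constraint(constraints, "p1 > 0", name="p1_positive")
anonymous = register_constraint(constraints, "p2 < 100")
```

Keeping constraints in a name-keyed map is what would make later ALTER or DROP by name straightforward, which is presumably why anonymous constraints still get a generated name.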

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-25 Thread Dinesh Joshi
+1 on Doug's suggestion. The operator sets a limit that application developers should not be allowed to violate. This is precisely the type of safety that we should strive for. To Jordan's point, I also agree that the read before write type of constraints should be avoided but if there is a very g

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-25 Thread Ariel Weisberg
Hi, I am also +1 on Doug's distinction between things that can be managed by operators and things that can be managed by applications. One thing to note about the syntax is that there are parens around the condition in SQL. In your example there are multiple anonymous constraints on the same

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-25 Thread Bernardo Botella
Got it. Thanks for the clarification Jon. Then, in terms of syntax, I think we can discard option 2. As for the GUARDRAIL vs CONSTRAINT concept you bring up, I guess there are pros and cons on both sides. It is true that there is an existing concept of GUARDRAIL in Cassandra, and that

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-24 Thread Jon Haddad
I think my suggestion was unclear. I was referring to the name guardrail, using the same infra as guardrails, rather than a separate concept. Not applying it like we do table options. On Tue, Jun 25, 2024 at 12:44 AM Bernardo Botella < conta...@bernardobotella.com> wrote: > Hi Ariel and Jon, >

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-24 Thread Bernardo Botella
Hi Ariel and Jon, Let me address your question first. Yes, AND is supported in the proposal. Below you can find some examples of different constraints applied to the same column. As for using the name LENGTH instead of sizeOf as in the proposal, I am also not opposed to it if it is more consistent w

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-24 Thread Ariel Weisberg
Hi, I see a vote for this has been called. I should have provided feedback sooner. I am a strong +1 on adding column level constraints. I'm not too concerned about row/partition/table level constraints, but I would like to change the syntax before I would

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-24 Thread Jon Haddad
I love where this is going. I have one question, however. I think it would be more consistent if these were table level guardrails. Is there anything that prevents us from utilizing the same underlying system and terminology for both the node level guardrails and the table ones? If we can avoid

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-24 Thread Doug Rohrer
To your point about Guardrails vs. Constraints, I do think the distinct roles of “cluster operator” and “application developer” help show how these two frameworks are both valuable. I don’t think I’d expect a cluster operator to be involved in every table design decision, but being able to set w

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-24 Thread Bernardo Botella
Thanks for the comments Jordan. Completely agreed that we will need to be careful not to accept constraints that require a read before a write. It is called out in the CEP itself, and will have to be enforced in the future. After all the feedback and discussion, I think we are ready to move

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-23 Thread Jordan West
I am generally for this CEP, particularly the sizeOf guardrail. For example, we recently had an incident caused by a client who wrote outside of the contract we had verbally established. The constraint would have let us encode that contract into the database. In this case, clients are writing large

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-13 Thread Bernardo Botella
Thanks a lot for your comments Abe! I do agree that the Constraint clause should be as simple as possible. I will add a note on the CEP along with some specifics about the proposed constraints (removing the ones that are contentious, and adding them to a possible future additions section). And

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-12 Thread Abe Ratnofsky
I've thought about this some more. It would be useful for Cassandra to support user-defined "guardrails" (or constraints, whatever you want to call them), that could be applied per keyspace or table. Whether a user or an operator is considered the owner of a table depends on the organization dep

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-12 Thread Jon Haddad
I think having JSON validation on existing text fields is a pretty reasonable idea, regardless of whether we have a JSON type or not. I could see folks wanting to add a JSON constraint to an existing text field, for example. I like the idea of a postgres-style JSONB type, but I don't want to derail this

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-12 Thread Abe Ratnofsky
Hey Bernardo, Thanks for the proposal and putting together your summary of the discussion. A few thoughts: I'm not completely convinced of the value of CONSTRAINTS for a database like Cassandra, which doesn't support any referential integrity checks, doesn't do read-before-write for all querie

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-12 Thread Bernardo Botella
Hi again, I completely agree that anything beyond simple poses a problem. My point is that the definition of simple may vary, and each of those constraints I mentioned deserves a conversation on its own. As I previously mentioned on the dev thread: https://lists.apache.org/thread/qln8cbkhlw9j95

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-12 Thread Štefan Miklošovič
My gut feeling is that anything beyond simple comparisons is just too problematic / complex. I think that this should be part of the application logic rather than something we put into the database. Is there any major database out there which has constraints modelled like that? (belongsToEnum, isNotBlock

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-12 Thread Claude Warren, Jr via dev
> > 2) > Is part of an enum is somehow supplying the lack of enum types. Constraint > could be something like CONSTRAINT belongsToEnum([list of valid values], > field): > CREATE TABLE keyspace.table ( > field text CONSTRAINT belongsToEnum(['foo', 'foo2'], field), > ... > ); > 3) > Similarly, we
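
To make the quoted `belongsToEnum` idea concrete, here is a minimal Python sketch of what such a check amounts to: a membership test against a fixed list of values. The function name `belongs_to_enum` and its closure-based shape are assumptions for illustration only; the thread does not define how the check would actually be implemented.

```python
from typing import Callable, Iterable


def belongs_to_enum(valid_values: Iterable[str]) -> Callable[[str], bool]:
    """Build a check that accepts only values from a fixed list."""
    # Freeze the allowed values once so each write is a cheap set lookup.
    allowed = frozenset(valid_values)

    def check(value: str) -> bool:
        return value in allowed

    return check


# Mirrors the quoted example: field must be one of 'foo' or 'foo2'.
field_check = belongs_to_enum(["foo", "foo2"])
foo_ok = field_check("foo")
bar_ok = field_check("bar")
```

A check like this needs only the value being written, so it does not require a read before the write.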

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-11 Thread Bernardo Botella
Hi Štefan, I'll address the different points: 1) An example (possibly a stretch) of a use case for the != constraint would be: Let's say you have a table in which you want to record a movement, from position p1 to position p2. You may want to check that those two are different to make sure there is ac

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-11 Thread Štefan Miklošovič
Hi Bernardo, 1) Could you elaborate on these two constraints? == and != ? What is the use case? Why would I want to have data in a database stored in some column which would need to be _same as my constraint_ and which _could not_ be same as my constraint? Can you give me at least one example of

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-10 Thread Bernardo Botella
Hi everyone, After the feedback, I'd like to recap what we have discussed in this thread and try to move forward with the conversation. I made some clarifications: - Constraints are only applied at write time. - Guardrail configurations should take precedence over what's being defi

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-07 Thread Štefan Miklošovič
How I see it is that in 5.1 there will be TCM for the very first time and I do not think that config in TCM would make it into 5.1 based on what Sam talks about (need for some stability etc), that makes total sense to me. TCM is quite a big feature to deliver on its own and putting even way more st

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-07 Thread Sam Tunnicliffe
We've been working on a draft CEP for migrating config from yaml to cluster metadata but have been a bit short of time recently, I'll try to get something out for discussion as soon as possible. A little delay isn't such a bad thing IMO, as we're still ironing out the kinks in the TCM implement

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-07 Thread Štefan Miklošovič
Yes, all configuration should be transactional (configuration which makes sense to require to be the same cluster-wide). Guardrails in TCM are just a subset of this problem. When I started to do CEP-24 I started with guardrails in TCM but then I realized it leads to more general "all config in TCM"

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-07 Thread Doug Rohrer
There’s a difference between the two though. Constraints are part of the table schema, and (independent of the interaction with Guardrails), have no dependency on yaml files being perfectly in sync across the cluster. Therefore, the feature (Constraints) on its own doesn’t depend on configuratio

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-07 Thread Bernardo Botella
My concern about mentioning other potential constraints to be implemented in the future on the CEP is it may derail the conversation from the set of initial ones I want to propose, which are size and value constraints. There are definitely a lot of other potential constraints that we could discus

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-06 Thread Dinesh Joshi
On Thu, Jun 6, 2024 at 1:50 PM Bernardo Botella < conta...@bernardobotella.com> wrote: > I will update the CEP being specific with the two specific Constraint > types I will be adding, which are size and value (the ones shown in the > example). > Could you identify constraints for the most common

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-06 Thread Dinesh Joshi
On Thu, Jun 6, 2024 at 1:03 PM Štefan Miklošovič < stefan.mikloso...@gmail.com> wrote: > It is interesting to see this feedback. When I look at CEP-24 where I am > obsessing about a user being able to misconfigure the password validation > strength so if a user hits a "weak" node then she would be

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-06 Thread Bernardo Botella
il on one node but you don’t set it (or set it differently) on the other? If it is configured differently and you want to check the guardrails if constraints do not violate them, then your query might fail or not based

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-06 Thread Jon Haddad
> CONSTRAINT r >= 0, CONSTRAINT r < 255, CONSTRAINT g >= 0, CONSTRAINT g < 255, CONSTRAINT b >= 0, CONS
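
The quoted list of per-column comparisons can be read as a set of simple predicates evaluated against each incoming row. The Python sketch below assumes that reading; the `CONSTRAINTS` table and the `violations` helper are hypothetical names, and only the comparisons visible in the truncated quote are included.

```python
import operator
from typing import Dict, List

# Comparison operators used by the quoted constraints.
OPS = {">=": operator.ge, "<": operator.lt}

# Only the constraints visible in the quoted message; the quote is cut off
# after "b >= 0", so nothing further is assumed here.
CONSTRAINTS = [
    ("r", ">=", 0),
    ("r", "<", 255),
    ("g", ">=", 0),
    ("g", "<", 255),
    ("b", ">=", 0),
]


def violations(row: Dict[str, int]) -> List[str]:
    """Return the constraints that a row fails, in declaration order."""
    failed = []
    for column, op, bound in CONSTRAINTS:
        if not OPS[op](row[column], bound):
            failed.append(f"{column} {op} {bound}")
    return failed


ok_row = violations({"r": 10, "g": 20, "b": 30})
bad_row = violations({"r": 255, "g": -1, "b": 0})
```

Each predicate depends only on the value being written, which is what keeps this kind of constraint cheap to check on the write path.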

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-06 Thread Štefan Miklošovič
> Other types of constraints and functions can be added in the future to provide even more flexibility, but are out of the scope of this CEP. Bernardo > On Jun 4, 2024, at 1:01 PM, Jon Haddad wrote

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-06 Thread Doug Rohrer
> Cc: Miklosovic, Stefan <stefan.mikloso...@netapp.com> > Subject: Re: [DISCUSS] CEP-42: Constraints Framework

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-06 Thread Štefan Miklošovič
the value they provide on their own. I think it would help a lot if we knew what types of constraints, besides the size check, you were thinking of adding. Jon

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-06 Thread Yifan Cai
, Jun 3, 2024 at 5:27 PM Bernardo Botella <conta...@bernardobotella.com> wrote: > Yes, that is correct. This particular behavior will need CEP-24 in order to work reliably. But, if my understanding is correct, that

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-06 Thread Štefan Miklošovič
ardrails, and not only for this particular feature. > On Jun 3, 2024, at 3:54 PM, Miklosovic, Stefan <stefan.mikloso...@netapp.com> wrote: > That would work reliably in case there is no way to miscon

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-06 Thread Štefan Miklošovič
tapp.com> wrote: > That would work reliably in case there is no way to misconfigure guardrails in the cluster. What if you set a guardrail on one node but you don’t set it (or set it differently) on the other? If it is configured
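
The concern quoted above, that a constraint checked against guardrails may pass or fail depending on which node receives the request, can be illustrated with a small Python sketch. Everything here is hypothetical: the node names, the size limit, and the `node_accepts` helper are assumptions chosen only to show the divergence.

```python
from typing import Dict, Optional

# Hypothetical per-node guardrail: maximum allowed size in bytes.
# None means the guardrail is not set on that node.
NODE_MAX_SIZE: Dict[str, Optional[int]] = {
    "node1": 1024,
    "node2": None,
}


def node_accepts(node: str, constraint_size: int) -> bool:
    """Check a constraint's size limit against one node's local guardrail."""
    limit = NODE_MAX_SIZE[node]
    return limit is None or constraint_size <= limit


# The same schema change is evaluated by each node using its own config.
results = {node: node_accepts(node, 4096) for node in NODE_MAX_SIZE}
```

Here `node1` rejects the request while `node2` accepts it, which is the inconsistency that transactional, cluster-wide guardrail configuration would be meant to rule out.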

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-05 Thread Jon Haddad
rdrails if constraints do not violate them, then your query might fail or not based on what node is hit. I guess that guardrails would need to start to be transactional to be sure this is avoided and guardrails are indeed the same everywhere (CEP-24 thr

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-04 Thread Bernardo Botella
> From: Bernardo Botella <conta...@bernardobotella.com> > Date: Tuesday, 4 June 2024 at 00:31 > To: dev@cassandra.apache.org

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-04 Thread Jon Haddad
, 4 June 2024 at 00:31 > To: dev@cassandra.apache.org > Cc: Miklosovic, Stefan > Subject: Re: [DISCUSS] CEP-42: Constraints Framework

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-03 Thread Bernardo Botella
L). > From: Bernardo Botella <conta...@bernardobotella.com> > Date: Tuesday, 4 June 2024 at 00:31 > To: dev@cassandra.apache.org > Cc: Miklosovic, Stefan <stefan.miklos

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-03 Thread Miklosovic, Stefan via dev
00:31 To: dev@cassandra.apache.org Cc: Miklosovic, Stefan Subject: Re: [DISCUSS] CEP-42: Constraints Framework

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-03 Thread Bernardo Botella
is your motivation to do it like you suggested? > From: Bernardo Botella <conta...@bernardobotella.com> > Date: Friday, 31 May 2024 at 23:24 > To: dev@cassandra.apache.org > Subject: [DISC

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-03 Thread Miklosovic, Stefan via dev
@cassandra.apache.org Subject: [DISCUSS] CEP-42: Constraints Framework Hello everyone, I a

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-02 Thread Bernardo Botella
Hi Jeff, Thanks a lot for your comments. To your first question, "Would this be implemented solely in the write path?", the answer is yes. I think enforcing it at reads/compaction/repairs may pose problems for cases in which an alter table is performed adding new or stricter constraints to
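
A consequence of write-path-only enforcement, as described in the message above, is that rows written before a constraint is tightened remain readable even though they no longer comply. The Python sketch below shows that behavior under stated assumptions: the `Table` class and its single "maximum value" constraint are hypothetical stand-ins, not the CEP's design.

```python
from typing import List


class Table:
    """Toy table with one constraint, enforced on the write path only."""

    def __init__(self, max_value: int) -> None:
        self.max_value = max_value
        self.rows: List[int] = []

    def write(self, value: int) -> None:
        # The constraint is checked only when data is written.
        if value > self.max_value:
            raise ValueError(f"{value} violates value <= {self.max_value}")
        self.rows.append(value)

    def alter_constraint(self, max_value: int) -> None:
        # Assumption: altering the constraint does not scan existing rows.
        self.max_value = max_value

    def read(self) -> List[int]:
        # Reads return stored data as-is, with no constraint check.
        return list(self.rows)


table = Table(max_value=100)
table.write(90)                # accepted under the original constraint
table.alter_constraint(50)     # a stricter constraint is applied later

try:
    table.write(60)            # now rejected at write time
    rejected = False
except ValueError:
    rejected = True

stored = table.read()          # still contains 90, which exceeds the new limit
```

This is the situation the thread keeps returning to: either users accept that old data may violate a newer constraint, or some separate process has to reconcile it.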

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-02 Thread Jeff Jirsa
Separately, when we discuss benefits of a proposal in a CEP, we should talk about what’s concrete and ignore the stuff that’s idealistic. Of these four points: This brings to the table several benefits and flexibility. Some examples: Cassandra operators have more control to reason about your data and

Re: [DISCUSS] CEP-42: Constraints Framework

2024-06-02 Thread Jeff Jirsa
Would this be implemented solely in the write path? Or would you also try to enforce it in the read and sstable/compaction/repair paths as well? On May 31, 2024, at 23:24, Bernardo Botella wrote: Hello everyone, I am proposing this CEP: CEP-42: Constraints Framework - CASSANDRA - Apache Software Fo

[DISCUSS] CEP-42: Constraints Framework

2024-05-31 Thread Bernardo Botella
Hello everyone, I am proposing this CEP: https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-42%3A+Constraints+Framework And I’m looking for feedback from the community. Thanks a lot! Bernardo