Hi Štefan,

I'll address the different points:

1) An example (possibly a stretch) of a use case for the != constraint would be:
let's say you have a table in which you want to record a movement from
position p1 to position p2. You may want to check that those two are different
to make sure there is actual movement.

    CREATE TABLE keyspace.table (
      p1 int,
      p2 int,
      ...,
      CONSTRAINT p1 != p2
    );

For the case of ==, I agree that it is harder to come up with a valid use
case; I added it for completeness.

2) "Is part of an enum" is a way of making up for the lack of enum types. The
constraint could be something like
CONSTRAINT belongsToEnum([list of valid values], field):

    CREATE TABLE keyspace.table (
      field text CONSTRAINT belongsToEnum(['foo', 'foo2'], field),
      ...
    );

3) Similarly, we can check whether a term is part of a list of blocked terms
and reject it if so:

    CREATE TABLE keyspace.table (
      field text CONSTRAINT isNotBlocked(['blocked_foo', 'blocked_foo2'], field),
      ...
    );

Please let me know if this helps,
Bernardo

> On Jun 11, 2024, at 6:29 AM, Štefan Miklošovič <stefan.mikloso...@gmail.com>
> wrote:
>
> Hi Bernardo,
>
> 1) Could you elaborate on these two constraints?
>
> == and != ?
>
> What is the use case? Why would I want to have data in a database stored in
> some column which would need to be _same as my constraint_ and which _could
> not_ be same as my constraint? Can you give me at least one example of each?
> It looks like I am going to put a constant into a database in case of ==,
> wouldn't a static column be better?
>
> 2) For examples of text based types you mentioned: "is part of an enum" - how
> would you enforce this in Cassandra? What enum do we have in CQL?
> 3) What does "is it block listed" mean?
>
> In the meanwhile, I made changes to CEP-24 to move transactionality into
> optional features.
>
> On Tue, Jun 11, 2024 at 12:18 AM Bernardo Botella
> <conta...@bernardobotella.com> wrote:
>> Hi everyone,
>>
>> After the feedback, I'd like to make a recap of what we have discussed in
>> this thread and try to move forward with the conversation.
>>
>> I made some clarifications:
>> - Constraints are only applied at write time.
>> - Guardrail configurations should maintain preference over what's being
>> defined as a constraint.
>>
>> Specify constraints:
>> There is general feedback around adding more concrete examples than the
>> ones that can be found in the CEP document.
>> Basically, the initial constraints I am proposing are:
>> - SizeOf Constraint for String types, as in
>>     name text CONSTRAINT sizeOf(name) < 256
>>
>> - Value Constraint for numeric types
>>     number_of_items int CONSTRAINT number_of_items < 1000
>>
>> Those two alone and combined provide a lot of flexibility, and allow complex
>> validations that enable "new types" such as:
>>
>> CREATE TYPE keyspace.cidr_address_ipv4 (
>>   ip_adress inet,
>>   subnet_mask int,
>>   CONSTRAINT subnet_mask > 0,
>>   CONSTRAINT subnet_mask < 32
>> )
>>
>> CREATE TYPE keyspace.color (
>>   r int,
>>   g int,
>>   b int,
>>   CONSTRAINT r >= 0,
>>   CONSTRAINT r < 255,
>>   CONSTRAINT g >= 0,
>>   CONSTRAINT g < 255,
>>   CONSTRAINT b >= 0,
>>   CONSTRAINT b < 255
>> )
>>
>>
>> Those two initial Constraints are the fundamental constraints that would give
>> value to the feature. The framework can (and will) be extended with other
>> Constraints, leaving us with the following:
>>
>> For numeric types:
>> - Max (<)
>> - Min (>)
>> - Equality (==)
>> - Difference (!=)
>>
>> For date types:
>> - Before (<)
>> - After (>)
>>
>> For text based types:
>> - Size (sizeOf)
>> - isJson (is the text a json?)
>> - complies with a given pattern
>> - Is it block listed?
>> - Is it part of an enum?
>>
>> General table constraints (including more than one column):
>> - Compare between numeric types (a < b, a > b, a != b, …)
>> - Compare between date types (date1 < date2, date1 > date2, date1 != date2, …)
>>
>> I have updated the CEP with this information.
>>
>> Potential dependency on CEP-24:
>> Given that the Constraints Framework provides a set of checks to be
>> performed alongside those that can be made using the Guardrails framework,
>> there may be some relation with CEP-24, which mentions transactional
>> Guardrails to prevent situations in which the limit configurations are
>> different across the cluster.
>>
>> This CEP-42 is not proposing modifying the Guardrails framework, and
>> therefore should not be affected by CEP-24. It is true that the improvements
>> provided by CEP-24 would benefit this Constraints framework, but it is not
>> dependent on them.
>>
>>
>> I hope I included all the points and addressed them on the CEP, otherwise,
>> please call it out and I’ll be more than happy to include it.
>>
>> Thanks everyone for all the inputs!
>> Bernardo
>>
>>> On Jun 7, 2024, at 11:54 AM, Štefan Miklošovič
>>> <stefan.mikloso...@gmail.com> wrote:
>>>
>>> How I see it is that in 5.1 there will be TCM for the very first time and I
>>> do not think that config in TCM would make it into 5.1 based on what Sam
>>> talks about (need for some stability etc), that makes total sense to me.
>>> TCM is quite a big feature to deliver on its own and putting even way more
>>> stuff into that might be detrimental to the quality if we rush it.
>>>
>>> Then sometime after 5.1 we might take a serious look at config in TCM
>>> itself.
>>>
>>> My plan, ideally, is to still ship CEP-24 without config in TCM, then after
>>> 5.1 when config in TCM lands, CEP-24 might integrate with that on a deeper
>>> level.
>>>
>>> If CEP-42 (this one) makes it into 5.1 as well, I think something similar
>>> might be done for it as well (integration with guardrails).
>>>
>>> On Fri, Jun 7, 2024 at 8:49 PM Sam Tunnicliffe <s...@beobal.com> wrote:
>>>> We've been working on a draft CEP for migrating config from yaml to
>>>> cluster metadata but have been a bit short of time recently, I'll try to
>>>> get something out for discussion as soon as possible.
>>>> A little delay isn't such a bad thing IMO, as we're still ironing out the
>>>> kinks in the TCM implementation itself. It'd be good to get a bit more
>>>> road testing done with that before we start adding more to it, which I'm
>>>> sure will start to ramp up once 5.0 is out.
>>>>
>>>> Thanks,
>>>> Sam
>>>>
>>>>> On 7 Jun 2024, at 19:19, Štefan Miklošovič
>>>>> <stefan.mikloso...@gmail.com> wrote:
>>>>>
>>>>> Yes, all configuration should be transactional (configuration which makes
>>>>> sense to require to be the same cluster-wide). Guardrails in TCM are just
>>>>> a subset of this problem. When I started to do CEP-24 I started with
>>>>> guardrails in TCM but then I realized it leads to more general "all
>>>>> config in TCM" and I found myself rabbit-hole-ing endlessly.
>>>>>
>>>>> BTW I do not think that once CEP-24 is in place without guardrails in TCM
>>>>> then implementing it would blow up things a lot. It is really just about
>>>>> a couple of mutable virtual tables and a couple of transformations for
>>>>> various guardrail types we have but I expect that its integration into
>>>>> more general config in TCM should be rather straightforward.
>>>>>
>>>>> Config in TCM definitely deserves its own CEP, it is too much to handle
>>>>> under CEP-24 and CEP-24 can go without it already. It just takes a little
>>>>> bit more configuration acumen to nail it down correctly.
>>>>>
>>>>> Regards
>>>>>
>>>>> On Fri, Jun 7, 2024 at 8:12 PM Doug Rohrer <droh...@apple.com> wrote:
>>>>>> There’s a difference between the two though.
>>>>>> Constraints are part of the
>>>>>> table schema, and (independent of the interaction with Guardrails), have
>>>>>> no dependency on yaml files being perfectly in sync across the cluster.
>>>>>> Therefore, the feature (Constraints) on its own doesn’t depend on
>>>>>> configuration files to be correct in its own right. The only place where
>>>>>> this isn’t true is its interaction with Guardrails, which happen to be
>>>>>> yaml-file based and cause issues.
>>>>>>
>>>>>> CEP-24’s password length requirements, however, are intended to be
>>>>>> implemented by adding a new guardrail, which is totally dependent on
>>>>>> YAML files today (and thus the concerns around a single misconfigured
>>>>>> server allowing someone to use an insecure password). If CEP-24 fixes
>>>>>> guardrails’ dependence on yaml files, it would also fix the problematic
>>>>>> interaction between guardrails and constraints.
>>>>>>
>>>>>> I agree that it would be incredibly valuable to find a solution to the
>>>>>> “yaml files need to be correct everywhere or something breaks” problem,
>>>>>> and I think CEP-24, being security-focused, is more likely to be
>>>>>> problematic without a solution to this issue. That said, I think Dinesh
>>>>>> is right in that, at the end of the day, CEP-24 could be implemented
>>>>>> without fixing the yaml config issue.
>>>>>>
>>>>>> I do wonder if the “Guardrails should be transactional” should really be
>>>>>> “configuration should be transactional”, or at least as much config as
>>>>>> possible should be, but that would blow up CEP-24 fairly dramatically
>>>>>> (maybe?).
>>>>>> Maybe “cluster-wide configuration should be read from a
>>>>>> distributed source on startup/joining the cluster” or something would
>>>>>> make sense, so the yaml file works as the source of truth on startup,
>>>>>> but as soon as possible it’s read from a TCM-backed data source, and
>>>>>> anything the node can get from other nodes it would… but now I’m
>>>>>> designing a different CEP in a discuss thread, which is probably a bad
>>>>>> idea...
>>>>>>
>>>>>> Regardless, I hope that I’m explaining why I see a difference between
>>>>>> constraints and guardrails, and why I think it makes sense that
>>>>>> constraints can move forward without a solution to the misconfiguration
>>>>>> problem where I also think you were right in calling it out in CEP-24
>>>>>> (even if we eventually move forward on CEP-24 without the solution in
>>>>>> place).
>>>>>>
>>>>>> Doug
>>>>>>
>>>>>>
>>>>>>
>>>>>>> On Jun 7, 2024, at 1:51 AM, Dinesh Joshi <djo...@apache.org> wrote:
>>>>>>>
>>>>>>> On Thu, Jun 6, 2024 at 1:03 PM Štefan Miklošovič
>>>>>>> <stefan.mikloso...@gmail.com> wrote:
>>>>>>>> It is interesting to see this feedback. When I look at CEP-24 where I
>>>>>>>> am obsessing about a user being able to misconfigure the password
>>>>>>>> validation strength so if a user hits a "weak" node then she would be
>>>>>>>> able to bypass it, and I see what is our approach here, then I am not
>>>>>>>> sure what I was waiting so long for and I should probably be just more
>>>>>>>> aggressive with the CEP and all the "caveats" could be just overlooked
>>>>>>>> and deferred to "sometime later".
>>>>>>>
>>>>>>> Stefan, unfortunately I didn't participate in the CEP-24 DISCUSS
>>>>>>> thread. Had I paid attention I would have suggested that waiting on TCM
>>>>>>> doesn't make the feature any different. The feature is less likely to
>>>>>>> be misconfigured in a cluster.
>>>>>>> CEP-24 is valuable and password
>>>>>>> compliance with policies is a super useful feature which IMO shouldn't
>>>>>>> have been held back due to lack of TCM.
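
For readers who want to experiment with the semantics discussed at the top of this thread, the proposed checks (sizeOf, numeric upper bound, p1 != p2, enum membership, blocked terms) can be sketched as plain write-time validation outside Cassandra. This is only an illustration in Python under stated assumptions: it is not the CEP-42 implementation, every name below (Constraint, validate, size_of_less_than, less_than, columns_differ, belongs_to_enum, is_not_blocked) is hypothetical, and sizeOf is assumed here to count characters, which the thread does not specify.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, Mapping


class ConstraintViolation(ValueError):
    """Raised when a row fails one of its write-time checks."""


@dataclass(frozen=True)
class Constraint:
    # Human-readable form of the check, e.g. "p1 != p2".
    description: str
    # Returns True when the row being written is acceptable.
    check: Callable[[Mapping[str, Any]], bool]


def size_of_less_than(column: str, limit: int) -> Constraint:
    # Mirrors "name text CONSTRAINT sizeOf(name) < 256".
    # Assumption: size is measured in characters, not bytes.
    return Constraint(f"sizeOf({column}) < {limit}",
                      lambda row: len(row[column]) < limit)


def less_than(column: str, limit: int) -> Constraint:
    # Mirrors "number_of_items int CONSTRAINT number_of_items < 1000".
    return Constraint(f"{column} < {limit}", lambda row: row[column] < limit)


def columns_differ(first: str, second: str) -> Constraint:
    # Mirrors the table-level "CONSTRAINT p1 != p2".
    return Constraint(f"{first} != {second}",
                      lambda row: row[first] != row[second])


def belongs_to_enum(allowed: Iterable[str], column: str) -> Constraint:
    # Mirrors "CONSTRAINT belongsToEnum(['foo', 'foo2'], field)".
    allowed_set = frozenset(allowed)
    return Constraint(f"belongsToEnum({sorted(allowed_set)}, {column})",
                      lambda row: row[column] in allowed_set)


def is_not_blocked(blocked: Iterable[str], column: str) -> Constraint:
    # Mirrors "CONSTRAINT isNotBlocked(['blocked_foo', ...], field)".
    blocked_set = frozenset(blocked)
    return Constraint(f"isNotBlocked({sorted(blocked_set)}, {column})",
                      lambda row: row[column] not in blocked_set)


def validate(row: Mapping[str, Any], constraints: Iterable[Constraint]) -> None:
    # Checks run only when a row is written; the first failure rejects the write.
    for constraint in constraints:
        if not constraint.check(row):
            raise ConstraintViolation(
                f"constraint violated: {constraint.description}")
```

With these helpers, validate({"p1": 1, "p2": 2}, [columns_differ("p1", "p2")]) accepts the row, while a row with p1 equal to p2 raises ConstraintViolation. Keeping each check as a small value with a description makes it easy to combine several of them on one table, which is the flexibility the recap above argues for.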