Just updated the Jira <https://issues.apache.org/jira/browse/CASSANDRA-17719> to reflect the latest conversation here.
On Mon, Aug 15, 2022 at 1:06 PM Patrick McFadin <pmcfa...@gmail.com> wrote: > I am +1 on > > IS NOT NULL/IS NULL instead of EXISTS/NOT EXISTS > > Not requiring (but allowing) SELECT on LET > > Patrick > > On Mon, Aug 15, 2022 at 11:01 AM Caleb Rackliffe <calebrackli...@gmail.com> > wrote: > >> Monday Morning Caleb has digested, and here's where I am... >> >> 1.) I have no problem w/ having SELECT on the RHS of a LET assignment, >> and to be honest, this may make some implementation things easier for me >> (i.e. the encapsulation of SELECT within LET) >> 2.) I'm in favor of LET without a select, although I have no strong >> feeling that it needs to be in v1. >> 3.) I like Benedict's tuple deconstruction idea, as it restores some of >> the notational convenience of the previous proposal. Again, though, I don't >> have a strong feeling this needs to be in v1. >> 3.b.) When we do implement tuple deconstruction, I'd be in favor of >> supporting a single level of deconstruction to begin with. >> >> Having said all that, on Friday I finished a prototype (based on some of >> Blake's previous work) of the syntax/grammar we've more or less agreed upon >> here, including an implementation of what I described as option #5 above: >> https://github.com/maedhroz/cassandra/commits/CASSANDRA-17719-prototype >> >> To look at specific examples, see these tests: >> https://github.com/maedhroz/cassandra/blob/CASSANDRA-17719-prototype/test/distributed/org/apache/cassandra/distributed/test/accord/AccordIntegrationTest.java >> >> There are only two things that aren't yet congruent w/ our discussion >> above, but they should both be trivial to fix: >> >> 1.) I'm still using EXISTS/NOT EXISTS instead of IS NOT NULL/IS NULL. >> 2.) I don't require SELECT on the RHS of LET yet. >> >> If I were to just fix those two items, would we be in agreement on this >> being both the core of the syntax we want and compatible w/ the wish list >> for future items? >> >> >> On Sun, Aug 14, 2022 at 12:25 PM Benedict Elliott Smith < >> bened...@apache.org> wrote: >> >>> >>> >>> >>> Verbose version: >>> LET (a) = SELECT val FROM table >>> IF a > 1 THEN... >>> >>> Less verbose version: >>> LET a = SELECT val FROM table >>> IF a.val > 1 THEN... >>> >>> >>> >>> My intention is that these are actually two different ways of expressing >>> the same thing, both supported and neither intended to be more or less >>> verbose than the other. The advantage of permitting both is that you can >>> also write >>> >>> LET a = SELECT val FROM table >>> IF a IS NOT NULL AND a.val IS NULL THEN … >>> >>> Alternatively, for non-queries: >>> LET x = SELECT someFunc() AS v1, someOtherFunc() AS v2 >>> or less verbose: >>> LET x = (someFunc() AS v1, someOtherFunc() as v2) >>> LET (v1, v2) = (someFunc(), someOtherFunc()) >>> >>> >>> I personally prefer clarity over any arbitrary verbosity/succinct >>> distinction, but we’re in general “taste” territory here. Since this syntax >>> includes the SELECT on the RHS, it makes sense to only require this for >>> situations where a query is being performed. Though I think if SELECT >>> without a FROM is supported then we will likely end up supporting *all >>> of the above*. >>> >>> Weighing in on the "SELECT without a FROM," I think that is fine and, as >>> Avi stated >>> >>> >>> Yep, definitely fine. Question is just whether we bother to offer it. >>> Also, evidently, whether we support LET *without* a SELECT on the RHS. >>> I am strongly in favour of this, as *requiring* a SELECT even when >>> there’s no table involved is counter-intuitive to me, as LET is now a >>> distinct concept that looks like variable declaration in other languages. >>> >>> Nested: >>> LET (x, y) = SELECT x, y FROM… >>> >>> >>> Deconstruction here refers to the above, i.e. extracting variables x and >>> y from the tuple on the RHS >>> >>> Nesting is just a question of whether we support either nested tuple >>> declarations, or nested deconstruction, which might include any of the >>> following: >>> >>> LET (x, (y, z)) = SELECT (x, (y, z)) FROM… >>> LET (x, (y, z)) = SELECT x, someTuple FROM… >>> LET (x, (y, z)) = (SELECT x FROM.., SELECT y, x FROM…)) >>> LET (x, (y, z)) = (someFunc(), SELECT y, z FROM…) >>> LET (x, yAndZ) = (someFunc(), SELECT y, z FROM…) >>> >>> IMO, once you start supporting features they need to be sort of >>> intuitively discoverable by users, so that a concept can be used in all >>> places you might expect. >>> >>> But I would be fine with an arbitrary restriction of at most one SELECT >>> on the RHS, or even ONLY a SELECT *or* some other tuple, and at most >>> one level of deconstruction of the RHS. >>> >>> >>> >>> >>> >>> On 14 Aug 2022, at 18:04, Patrick McFadin <pmcfa...@gmail.com> wrote: >>> >>> Let me just state my bias right up front. For any kind of QL I lean >>> heavily toward verbose and explicit based on their lifecycle. A CQL query >>> will probably need to be understood by the next person looking at it, and a >>> few seconds saved typing isn't worth the potential misunderstanding later. >>> My opinion is formed by having to be the second person many times. :D >>> >>> I just want to make sure I have the syntax you are proposing. >>> >>> Verbose version: >>> LET (a) = SELECT val FROM table >>> IF a > 1 THEN... >>> >>> Less verbose version: >>> LET a = SELECT val FROM table >>> IF a.val > 1 THEN... >>> >>> Alternatively, for non-queries: >>> LET x = SELECT someFunc() AS v1, someOtherFunc() AS v2 >>> or less verbose: >>> LET x = (someFunc() AS v1, someOtherFunc() as v2) >>> LET (v1, v2) = (someFunc(), someOtherFunc()) >>> >>> Weighing in on the "SELECT without a FROM," I think that is fine and, as >>> Avi stated, already present in the SQL world. I would prefer that over >>> 'SELECT func() FROM dual;' (Looking at you, Oracle) >>> >>> Finally, on the topic of deconstructing SELECT statements instead of >>> nesting. If I understand the argument here, I would favor deconstructing >>> over nesting if there is a choice. I think this is what that choice would >>> look like. >>> >>> Deconstructed: >>> LET x = SELECT x FROM ... >>> LET y = SELECT y FROM ... >>> >>> Nested: >>> LET (x, y) = ((SELECT x FROM…), (SELECT y FROM)) >>> >>> I'm trying to summate but let me know if I missed something. I apologize >>> in advance to Monday morning Caleb, who will have to digest this thread. >>> >>> Patrick >>> >>> On Sun, Aug 14, 2022 at 9:00 AM Benedict Elliott Smith < >>> bened...@apache.org> wrote: >>> >>>> >>>> >>>> I think SQL dialects require subqueries to be parenthesized (not sure). >>>> If that's the case I think we should keep the tradition. >>>> >>>> >>>> This isn’t a sub-query though, since LET is not a query. If we permit >>>> at most one SELECT, and do not permit mixing SELECT with constant >>>> assignments, I don’t see why we would require parentheses. >>>> >>>> I see no harm in making FROM optional, as it's recognized by other SQL >>>> dialects. >>>> Absolutely, this just flows naturally from having tuples. There's no >>>> difference between "SELECT (a, b)" and "SELECT a_but_a_is_a_tuple”. >>>> >>>> >>>> Neither of these things are supported today, and they’re no longer >>>> necessary with this syntax proposal. The downside of splitting SELECT and >>>> LET is that there’s no impetus to improve the former. So the question was >>>> really whether we bother to improve it anyway, not whether or not they >>>> would be good improvements (I think they obviously are). >>>> >>>> I think this can be safely deferred. Most people would again separate >>>> it into separate LETs. >>>> >>>> That implies we’ll permit deconstructing a tuple variable in a LET. >>>> This makes sense to me, but is roughly equivalent to nested deconstruction. >>>> It might be that v1 we only support deconstructing SELECT statements, but I >>>> guess all of this is probably up to the implementor. >>>> >>>> I'd add (to the specification) that LETs cannot override a previously >>>> defined variable, just to reduce ambiguity. >>>> >>>> Yep, this was already agreed way back with the earlier proposal. >>>> >>>> >>>> On 14 Aug 2022, at 16:30, Avi Kivity <a...@scylladb.com> wrote: >>>> >>>> >>>> On 14/08/2022 17.50, Benedict Elliott Smith wrote: >>>> >>>> >>>> > SELECT and LET incompatible once comparisons become valid selectors >>>> >>>> I don’t think this would be ambiguous, as = is required in the LET >>>> syntax as we have to bind the result to a variable name. >>>> >>>> But, I like the deconstructed tuple syntax improvement over “Option 6”. >>>> This would also seem to easily support assigning from non-query statements, >>>> such as LET (a, b) = (someFunc(), someOtherFunc(?)) >>>> >>>> I don’t think it is ideal to depend on relative position in the tuple >>>> for assigning results to a variable name, as it leaves more scope for >>>> errors. It would be nice to have a simple way to deconstruct safely. But, I >>>> think this proposal is good, and I’d be fine with it as an alternative if >>>> others concur. I agree that seeing the SELECT independently may be more >>>> easily recognisable to users. >>>> >>>> With this approach there remains the question of how we handle single >>>> column results. I’d be inclined to treat in the following way: >>>> >>>> LET (a) = SELECT val FROM table >>>> IF a > 1 THEN... >>>> >>>> LET a = SELECT val FROM table >>>> IF a.val > 1 THEN... >>>> >>>> >>>> I think SQL dialects require subqueries to be parenthesized (not sure). >>>> If that's the case I think we should keep the tradition. >>>> >>>> >>>> ---- >>>> There is also the question of whether we support SELECT without a FROM >>>> clause, e.g. >>>> LET x = SELECT someFunc() AS v1, someOtherFunc() AS v2 >>>> >>>> Or just LET (since they are no longer equivalent) >>>> e.g. >>>> LET x = (someFunc() AS v1, someOtherFunc() as v2) >>>> LET (v1, v2) = (someFunc(), someOtherFunc()) >>>> >>>> >>>> I see no harm in making FROM optional, as it's recognized by other SQL >>>> dialects. >>>> >>>> >>>> ---- >>>> Also since LET is only binding variables, is there any reason we >>>> shouldn’t support multiple SELECT assignments in a single LET?, e.g. >>>> LET (x, y) = ((SELECT x FROM…), (SELECT y FROM)) >>>> >>>> >>>> What if an inner select returns a tuple? Would y be a tuple? >>>> >>>> >>>> I think this is redundant and atypical enough to not be worth >>>> supporting. Most people would use separate LETs. >>>> >>>> >>>> ---- >>>> Also whether we support tuples in SELECT statements anyway, e.g. >>>> LET (tuple1, tuple2) = SELECT (a, b), (c, d) FROM.. >>>> IF tuple1.a > 1 AND tuple2.d > 1… >>>> >>>> >>>> Absolutely, this just flows naturally from having tuples. There's no >>>> difference between "SELECT (a, b)" and "SELECT a_but_a_is_a_tuple". >>>> >>>> >>>> >>>> ---- >>>> and whether we support nested deconstruction, e.g. >>>> LET (a, b, (c, d)) = SELECT a, b, someTuple FROM.. >>>> IF a > 1 AND d > 1… >>>> >>>> >>>> I think this can be safely deferred. Most people would again separate >>>> it into separate LETs. >>>> >>>> >>>> I'd add (to the specification) that LETs cannot override a previously >>>> defined variable, just to reduce ambiguity. >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> On 14 Aug 2022, at 13:55, Avi Kivity via dev <dev@cassandra.apache.org> >>>> <dev@cassandra.apache.org> wrote: >>>> >>>> >>>> On 14/08/2022 01.29, Benedict Elliott Smith wrote: >>>> >>>> >>>> I’ll do my best to express with my thinking, as well as how I would >>>> explain the feature to a user. >>>> >>>> My mental model for LET statements is that they are simply SELECT >>>> statements where the columns that are selected become variables accessible >>>> anywhere in the scope of the transaction. That is to say, you should be >>>> able to run something like s/LET/SELECT and s/([^=]+)=([^,]+)(,|$)/\2 AS >>>> \1\3/g on the columns of a LET statement and produce a valid SELECT >>>> statement, and vice versa. Both should perform identically. >>>> >>>> e.g. >>>> SELECT pk AS key, v AS value FROM table >>>> >>>> => >>>> LET key = pk, value = v FROM table >>>> >>>> >>>> "=" is a CQL/SQL operator. Cassandra doesn't support it yet, but SQL >>>> supports selecting comparisons: >>>> >>>> >>>> $ psql >>>> psql (14.3) >>>> Type "help" for help. >>>> >>>> avi=# SELECT 1 = 2, 3 = 3, NULL = NULL; >>>> ?column? | ?column? | ?column? >>>> ----------+----------+---------- >>>> f | t | >>>> (1 row) >>>> >>>> >>>> Using "=" as a syntactic element in LET would make SELECT and LET >>>> incompatible once comparisons become valid selectors. Unless they become >>>> mandatory (and then you'd write "LET q = a = b" if you wanted to select a >>>> comparison). >>>> >>>> >>>> I personally prefer the nested query syntax: >>>> >>>> >>>> LET (a, b, c) = (SELECT foo, bar, x+y FROM ...); >>>> >>>> >>>> So there aren't two similar-but-not-quite-the-same syntaxes. SELECT is >>>> immediately recognizable by everyone as a query, LET is not. >>>> >>>> >>>> >>>> Identical form, identical behaviour. Every statement should be directly >>>> translatable with some simple text manipulation. >>>> >>>> We can then make this more powerful for users by simply expanding >>>> SELECT statements, e.g. by permitting them to declare constants and tuples >>>> in the column results. In this scheme LET x = * is simply syntactic sugar >>>> for LET x = (pk, ck, field1, …) This scheme then supports options 2, 4 and >>>> 5 all at once, consistently alongside each other. >>>> >>>> Option 6 is in fact very similar, but is strictly less flexible for the >>>> user as they have no way to declare multiple scalar variables without >>>> scoping them inside a tuple. >>>> >>>> e.g. >>>> LET key = pk, value = v FROM table >>>> IF key > 1 AND value > 1 THEN... >>>> >>>> => >>>> LET row = SELECT pk AS key, v AS value FROM table >>>> IF row.key > 1 AND row.value > 1 THEN… >>>> >>>> However, both are expressible in the existing proposal, as if you >>>> prefer this naming scheme you can simply write >>>> >>>> LET row = (pk AS key, v AS value) FROM table >>>> IF row.key > 1 AND row.value > 1 THEN… >>>> >>>> With respect to auto converting single column results to a scalar, we >>>> do need a way for the user to say they care whether the row was null or the >>>> column. I think an implicit conversion here could be surprising. However we >>>> could implement tuple expressions anyway and let the user explicitly >>>> declare v as a tuple as Caleb has suggested for the existing proposal as >>>> well. >>>> >>>> Assigning constants or other values not selected from a table would >>>> also be a little clunky: >>>> >>>> LET v1 = someFunc(), v2 = someOtherFunc(?) >>>> IF v1 > 1 AND v2 > 1 THEN… >>>> >>>> => >>>> LET row = SELECT someFunc() AS v1, someOtherFunc(?) AS v2 >>>> IF row.v1 > 1 AND row.v2 > 1 THEN... >>>> >>>> That said, the proposals are *close* to identical, it is just slightly >>>> more verbose and slightly less flexible. >>>> >>>> Which one would be most intuitive to users is hard to predict. It might >>>> be that Option 6 would be slightly easier, but I’m unsure if there would be >>>> a huge difference. >>>> >>>> >>>> On 13 Aug 2022, at 16:59, Patrick McFadin <pmcfa...@gmail.com> wrote: >>>> >>>> I'm really happy to see CEP-15 getting closer to a final >>>> implementation. I'm going to walk through my reasoning for your proposals >>>> wrt trying to explain this to somebody new. >>>> >>>> Looking at all the options, the first thing that comes up for me is the >>>> Cassandra project's complicated relationship with NULL. We have prior art >>>> with EXISTS/NOT EXISTS when creating new tables. IS NULL/IS NOT NULL is >>>> used in materialized views similarly to proposals 2,4 and 5. >>>> >>>> CREATE MATERIALIZED VIEW [ IF NOT EXISTS ] [keyspace_name.]view_name >>>> AS SELECT [ (column_list) ] >>>> FROM [keyspace_name.]table_name >>>> [ WHERE column_name IS NOT NULL >>>> [ AND column_name IS NOT NULL ... ] ] >>>> [ AND relation [ AND ... ] ] >>>> PRIMARY KEY ( column_list ) >>>> [ WITH [ table_properties ] >>>> [ [ AND ] CLUSTERING ORDER BY (cluster_column_name order_option) ] ] ; >>>> >>>> Based on that, I believe 1 and 3 would just confuse users, so -1 on >>>> those. >>>> >>>> Trying to explain the difference between row and column operations with >>>> LET, I can't see the difference between a row and column in #2. >>>> >>>> #4 introduces a boolean instead of column names and just adds more >>>> syntax. >>>> >>>> #5 is verbose and, in my opinion, easier to reason when writing a >>>> query. Thinking top down, I need to know if these exact rows and/or column >>>> values exist before changing them, so I'll define them first. Then I'll >>>> iterate over the state I created in my actual changes so I know I'm >>>> changing precisely what I want. >>>> >>>> #5 could use a bit more to be clearer to somebody who doesn't write CQL >>>> queries daily and wouldn't require memorizing subtle differences. It should >>>> be similar to all the other syntax, so learning a little about CQL will let >>>> you move into more without completely re-learning the new syntax. >>>> >>>> So I propose #6) >>>> BEGIN TRANSACTION >>>> LET row1 = SELECT * FROM ks.tbl WHERE k=0 AND c=0; <-- * selects all >>>> columns >>>> LET row2 = SELECT v FROM ks.tbl WHERE k=1 AND c=0; >>>> SELECT row1, row2 >>>> IF row1 IS NULL AND row2.v = 3 THEN >>>> INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1); >>>> END IF >>>> COMMIT TRANSACTION >>>> >>>> I added the SELECT in the LET just so it's straightforward, you are >>>> reading, and it's just like doing a regular select, but you are assigning >>>> it to a variable. >>>> >>>> I removed the confusing 'row1.v' and replaced it with 'row1' I can't >>>> see why you would need the '.v' vs having the complete variable I >>>> created in the statement above. >>>> >>>> EOL >>>> >>>> Patrick >>>> >>>> On Thu, Aug 11, 2022 at 1:37 PM Caleb Rackliffe < >>>> calebrackli...@gmail.com> wrote: >>>> >>>>> ...and one more option... >>>>> >>>>> 5.) Introduce tuple assignments, removing all ambiguity around row vs. >>>>> column operations. >>>>> >>>>> BEGIN TRANSACTION >>>>> LET row1 = * FROM ks.tbl WHERE k=0 AND c=0; <-- * selects all columns >>>>> LET row2 = (v) FROM ks.tbl WHERE k=1 AND c=0; >>>>> SELECT row1.v, row2.v >>>>> IF row1 IS NULL AND row2.v = 3 THEN >>>>> INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1); >>>>> END IF >>>>> COMMIT TRANSACTION >>>>> >>>>> >>>>> >>>>> On Thu, Aug 11, 2022 at 12:55 PM Caleb Rackliffe < >>>>> calebrackli...@gmail.com> wrote: >>>>> >>>>>> via Benedict, here is a 4th option: >>>>>> >>>>>> 4.) Similar to #2, but don't rely on the key element being NULL. >>>>>> >>>>>> If the read returns no result, x effectively becomes NULL. Otherwise, >>>>>> it remains true/NOT NULL. >>>>>> >>>>>> BEGIN TRANSACTION >>>>>> LET x = true FROM ks.tbl WHERE k=0 AND c=0; >>>>>> LET row2_v = v FROM ks.tbl WHERE k=1 AND c=0; >>>>>> SELECT x, row2_v >>>>>> IF x IS NULL AND row2_v = 3 THEN >>>>>> INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1); >>>>>> END IF >>>>>> COMMIT TRANSACTION >>>>>> >>>>>> On Thu, Aug 11, 2022 at 12:12 PM Caleb Rackliffe < >>>>>> calebrackli...@gmail.com> wrote: >>>>>> >>>>>>> Hello again everyone! >>>>>>> >>>>>>> I've been working on a prototype >>>>>>> <https://issues.apache.org/jira/browse/CASSANDRA-17719> in >>>>>>> CASSANDRA-17719 for a grammar that roughly corresponds to what we've >>>>>>> agreed >>>>>>> on in this thread. One thing that isn't immediately obvious to me is how >>>>>>> the LET syntax handles cases where we want to check for the plain >>>>>>> existence >>>>>>> of a row in IF. For example, in this hybrid of the originally proposed >>>>>>> syntax and something more like what we've agreed on (and the RETURNING >>>>>>> just >>>>>>> to distinguish between that and SELECT), this could be pretty >>>>>>> straightforward: >>>>>>> >>>>>>> BEGIN TRANSACTION >>>>>>> SELECT v FROM ks.tbl WHERE k=0 AND c=0 AS row1; >>>>>>> SELECT v FROM ks.tbl WHERE k=1 AND c=0 AS row2; >>>>>>> RETURNING row1.v, row2.v >>>>>>> IF row1 NOT EXISTS AND row2.v = 3 THEN >>>>>>> INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1); >>>>>>> END IF >>>>>>> COMMIT TRANSACTION >>>>>>> >>>>>>> The NOT EXISTS operator has row1 to work with. One the other hand, >>>>>>> w/ the LET syntax and no naming of reads, it's not clear what the best >>>>>>> solution would be. Here are a few possibilities: >>>>>>> >>>>>>> 1.) Provide a few built-in functions that operate on a whole result >>>>>>> row. If we assume a SQL style IS NULL and IS NOT NULL (see my last post >>>>>>> here) for operations on particular columns, this probably eliminates the >>>>>>> need for EXISTS/NOT EXISTS as well. >>>>>>> >>>>>>> BEGIN TRANSACTION >>>>>>> LET row1_missing = notExists() FROM ks.tbl WHERE k=0 AND c=0; >>>>>>> LET row2_v = v FROM ks.tbl WHERE k=1 AND c=0; >>>>>>> SELECT row1_missing, row2_v >>>>>>> IF row1_missing AND row2_v = 3 THEN >>>>>>> INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1); >>>>>>> END IF >>>>>>> COMMIT TRANSACTION >>>>>>> >>>>>>> 2.) Assign and check the first primary key element to determine >>>>>>> whether the row exists. >>>>>>> >>>>>>> BEGIN TRANSACTION >>>>>>> LET row1_k = k FROM ks.tbl WHERE k=0 AND c=0; >>>>>>> LET row2_v = v FROM ks.tbl WHERE k=1 AND c=0; >>>>>>> SELECT row1_k, row2_v >>>>>>> IF row1_k IS NULL AND row2_v = 3 THEN >>>>>>> INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1); >>>>>>> END IF >>>>>>> COMMIT TRANSACTION >>>>>>> >>>>>>> 3.) Reconsider the LET concept toward something that allows us to >>>>>>> explicitly name our reads again. >>>>>>> >>>>>>> BEGIN TRANSACTION >>>>>>> WITH (SELECT v FROM ks.tbl WHERE k=0 AND c=0) AS row1; >>>>>>> WITH (SELECT v FROM ks.tbl WHERE k=1 AND c=0) AS row2; >>>>>>> SELECT row1.v, row2.v >>>>>>> IF row1 NOT EXISTS AND row2.v = 3 THEN >>>>>>> INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1); >>>>>>> END IF >>>>>>> COMMIT TRANSACTION >>>>>>> >>>>>>> I don't have a strong affinity for any of these, although #1 seems >>>>>>> the most awkward. >>>>>>> >>>>>>> Does anyone have any other alternatives? Preference for one of the >>>>>>> above options? >>>>>>> >>>>>>> Thanks! >>>>>>> >>>>>>> On Fri, Jul 22, 2022 at 11:21 AM Caleb Rackliffe < >>>>>>> calebrackli...@gmail.com> wrote: >>>>>>> >>>>>>>> Avi brought up an interesting point around NULLness checking in >>>>>>>> CASSANDRA-17762 >>>>>>>> <https://issues.apache.org/jira/browse/CASSANDRA-17762>... >>>>>>>> >>>>>>>> In SQL, any comparison with NULL is NULL, which is interpreted as >>>>>>>>> FALSE in a condition. To test for NULLness, you use IS NULL or IS NOT >>>>>>>>> NULL. >>>>>>>>> But LWT uses IF col = NULL as a NULLness test. This is likely to >>>>>>>>> confuse >>>>>>>>> people coming from SQL and hamper attempts to extend the dialect. >>>>>>>> >>>>>>>> >>>>>>>> We can leave that Jira open to address what to do in the legacy LWT >>>>>>>> case, but I'd support a SQL-congruent syntax here (IS NULL or IS >>>>>>>> NOT NULL), where we have something closer to a blank slate. >>>>>>>> >>>>>>>> Thoughts? >>>>>>>> >>>>>>>> On Thu, Jun 30, 2022 at 6:25 PM Abe Ratnofsky <a...@aber.io> wrote: >>>>>>>> >>>>>>>>> The new syntax looks great, and I’m really excited to see this >>>>>>>>> coming together. >>>>>>>>> >>>>>>>>> One piece of feedback on the proposed syntax is around the use of >>>>>>>>> “=“ as a declaration in addition to its current use as an equality >>>>>>>>> operator >>>>>>>>> in a WHERE clause and an assignment operator in an UPDATE: >>>>>>>>> >>>>>>>>> BEGIN TRANSACTION >>>>>>>>> LET car_miles = miles_driven, car_is_running = is_running FROM >>>>>>>>> cars WHERE model=’pinto’ >>>>>>>>> LET user_miles = miles_driven FROM users WHERE name=’blake’ >>>>>>>>> SELECT something else from some other table >>>>>>>>> IF NOT car_is_running THEN ABORT >>>>>>>>> UPDATE users SET miles_driven = user_miles + 30 WHERE >>>>>>>>> name='blake'; >>>>>>>>> UPDATE cars SET miles_driven = car_miles + 30 WHERE >>>>>>>>> model='pinto'; >>>>>>>>> COMMIT TRANSACTION >>>>>>>>> >>>>>>>>> This is supported in languages like PL/pgSQL, but in a normal SQL >>>>>>>>> query kind of local declaration is often expressed as an alias >>>>>>>>> (SELECT col >>>>>>>>> AS new_col), subquery alias (SELECT col) t, or common table expression >>>>>>>>> (WITH t AS (SELECT col)). >>>>>>>>> >>>>>>>>> Here’s an example of an alternative to the proposed syntax that >>>>>>>>> I’d find more readable: >>>>>>>>> >>>>>>>>> BEGIN TRANSACTION >>>>>>>>> WITH car_miles, car_is_running AS (SELECT miles_driven, >>>>>>>>> is_running FROM cars WHERE model=’pinto’), >>>>>>>>> user_miles AS (SELECT miles_driven FROM users WHERE >>>>>>>>> name=’blake’) >>>>>>>>> IF NOT car_is_running THEN ABORT >>>>>>>>> UPDATE users SET miles_driven = user_miles + 30 WHERE >>>>>>>>> name='blake'; >>>>>>>>> UPDATE cars SET miles_driven = car_miles + 30 WHERE >>>>>>>>> model='pinto'; >>>>>>>>> COMMIT TRANSACTION >>>>>>>>> >>>>>>>>> There’s also the option of naming the transaction like a subquery, >>>>>>>>> and supporting LET via AS (this one I’m less sure about but wanted to >>>>>>>>> propose anyway): >>>>>>>>> >>>>>>>>> BEGIN TRANSACTION t1 >>>>>>>>> SELECT miles_driven AS t1.car_miles, is_running AS >>>>>>>>> t1.car_is_running FROM cars WHERE model=’pinto’; >>>>>>>>> SELECT miles_driven AS t1.user_miles FROM users WHERE >>>>>>>>> name=’blake’; >>>>>>>>> IF NOT car_is_running THEN ABORT >>>>>>>>> UPDATE users SET miles_driven = user_miles + 30 WHERE >>>>>>>>> name='blake'; >>>>>>>>> UPDATE cars SET miles_driven = car_miles + 30 WHERE >>>>>>>>> model='pinto'; >>>>>>>>> COMMIT TRANSACTION >>>>>>>>> >>>>>>>>> This also has the benefit of resolving ambiguity in case of naming >>>>>>>>> conflicts with existing (or future) column names. >>>>>>>>> >>>>>>>>> -- >>>>>>>>> Abe >>>>>>>>> >>>>>>>> >>>> >>>> >>>> >>>