> I like Benedict's tuple deconstruction idea For posterity, this was Avi’s idea!
> On 15 Aug 2022, at 18:59, Caleb Rackliffe <calebrackli...@gmail.com> wrote: > > Monday Morning Caleb has digested, and here's where I am... > > 1.) I have no problem w/ having SELECT on the RHS of a LET assignment, and to > be honest, this may make some implementation things easier for me (i.e. the > encapsulation of SELECT within LET) > 2.) I'm in favor of LET without a select, although I have no strong feeling > that it needs to be in v1. > 3.) I like Benedict's tuple deconstruction idea, as it restores some of the > notational convenience of the previous proposal. Again, though, I don't have > a strong feeling this needs to be in v1. > 3.b.) When we do implement tuple deconstruction, I'd be in favor of > supporting a single level of deconstruction to begin with. > > Having said all that, on Friday I finished a prototype (based on some of > Blake's previous work) of the syntax/grammar we've more or less agreed upon > here, including an implementation of what I described as option #5 above: > https://github.com/maedhroz/cassandra/commits/CASSANDRA-17719-prototype > > To look at specific examples, see these tests: > https://github.com/maedhroz/cassandra/blob/CASSANDRA-17719-prototype/test/distributed/org/apache/cassandra/distributed/test/accord/AccordIntegrationTest.java > > There are only two things that aren't yet congruent w/ our discussion above, > but they should both be trivial to fix: > > 1.) I'm still using EXISTS/NOT EXISTS instead of IS NOT NULL/IS NULL. > 2.) I don't require SELECT on the RHS of LET yet. > > If I were to just fix those two items, would we be in agreement on this being > both the core of the syntax we want and compatible w/ the wish list for > future items? > > > On Sun, Aug 14, 2022 at 12:25 PM Benedict Elliott Smith <bened...@apache.org> > wrote: >> >> >>> >>> Verbose version: >>> LET (a) = SELECT val FROM table >>> IF a > 1 THEN... >>> >>> Less verbose version: >>> LET a = SELECT val FROM table >>> IF a.val > 1 THEN... >> >> >> My intention is that these are actually two different ways of expressing the >> same thing, both supported and neither intended to be more or less verbose >> than the other. The advantage of permitting both is that you can also write >> >> LET a = SELECT val FROM table >> IF a IS NOT NULL AND a.val IS NULL THEN … >> >>> Alternatively, for non-queries: >>> LET x = SELECT someFunc() AS v1, someOtherFunc() AS v2 >>> or less verbose: >>> LET x = (someFunc() AS v1, someOtherFunc() as v2) >>> LET (v1, v2) = (someFunc(), someOtherFunc()) >> >> I personally prefer clarity over any arbitrary verbosity/succinct >> distinction, but we’re in general “taste” territory here. Since this syntax >> includes the SELECT on the RHS, it makes sense to only require this for >> situations where a query is being performed. Though I think if SELECT >> without a FROM is supported then we will likely end up supporting all of the >> above. >> >>> Weighing in on the "SELECT without a FROM," I think that is fine and, as >>> Avi stated >> >> Yep, definitely fine. Question is just whether we bother to offer it. Also, >> evidently, whether we support LET without a SELECT on the RHS. I am strongly >> in favour of this, as requiring a SELECT even when there’s no table involved >> is counter-intuitive to me, as LET is now a distinct concept that looks like >> variable declaration in other languages. >> >>> Nested: >>> LET (x, y) = SELECT x, y FROM… >> >> Deconstruction here refers to the above, i.e. extracting variables x and y >> from the tuple on the RHS >> >> Nesting is just a question of whether we support either nested tuple >> declarations, or nested deconstruction, which might include any of the >> following: >> >> LET (x, (y, z)) = SELECT (x, (y, z)) FROM… >> LET (x, (y, z)) = SELECT x, someTuple FROM… >> LET (x, (y, z)) = (SELECT x FROM.., SELECT y, x FROM…)) >> LET (x, (y, z)) = (someFunc(), SELECT y, z FROM…) >> LET (x, yAndZ) = (someFunc(), SELECT y, z FROM…) >> >> IMO, once you start supporting features they need to be sort of intuitively >> discoverable by users, so that a concept can be used in all places you might >> expect. >> >> But I would be fine with an arbitrary restriction of at most one SELECT on >> the RHS, or even ONLY a SELECT or some other tuple, and at most one level of >> deconstruction of the RHS. >> >> >> >> >> >>> On 14 Aug 2022, at 18:04, Patrick McFadin <pmcfa...@gmail.com> wrote: >>> >>> Let me just state my bias right up front. For any kind of QL I lean heavily >>> toward verbose and explicit based on their lifecycle. A CQL query will >>> probably need to be understood by the next person looking at it, and a few >>> seconds saved typing isn't worth the potential misunderstanding later. My >>> opinion is formed by having to be the second person many times. :D >>> >>> I just want to make sure I have the syntax you are proposing. >>> >>> Verbose version: >>> LET (a) = SELECT val FROM table >>> IF a > 1 THEN... >>> >>> Less verbose version: >>> LET a = SELECT val FROM table >>> IF a.val > 1 THEN... >>> >>> Alternatively, for non-queries: >>> LET x = SELECT someFunc() AS v1, someOtherFunc() AS v2 >>> or less verbose: >>> LET x = (someFunc() AS v1, someOtherFunc() as v2) >>> LET (v1, v2) = (someFunc(), someOtherFunc()) >>> >>> Weighing in on the "SELECT without a FROM," I think that is fine and, as >>> Avi stated, already present in the SQL world. I would prefer that over >>> 'SELECT func() FROM dual;' (Looking at you, Oracle) >>> >>> Finally, on the topic of deconstructing SELECT statements instead of >>> nesting. If I understand the argument here, I would favor deconstructing >>> over nesting if there is a choice. I think this is what that choice would >>> look like. >>> >>> Deconstructed: >>> LET x = SELECT x FROM ... >>> LET y = SELECT y FROM ... >>> >>> Nested: >>> LET (x, y) = ((SELECT x FROM…), (SELECT y FROM)) >>> >>> I'm trying to summate but let me know if I missed something. I apologize in >>> advance to Monday morning Caleb, who will have to digest this thread. >>> >>> Patrick >>> >>> On Sun, Aug 14, 2022 at 9:00 AM Benedict Elliott Smith >>> <bened...@apache.org> wrote: >>>> >>>>> >>>>> I think SQL dialects require subqueries to be parenthesized (not sure). >>>>> If that's the case I think we should keep the tradition. >>>>> >>>> >>>> This isn’t a sub-query though, since LET is not a query. If we permit at >>>> most one SELECT, and do not permit mixing SELECT with constant >>>> assignments, I don’t see why we would require parentheses. >>>> >>>>> I see no harm in making FROM optional, as it's recognized by other SQL >>>>> dialects. >>>>> >>>>> Absolutely, this just flows naturally from having tuples. There's no >>>>> difference between "SELECT (a, b)" and "SELECT a_but_a_is_a_tuple”. >>>> >>>> Neither of these things are supported today, and they’re no longer >>>> necessary with this syntax proposal. The downside of splitting SELECT and >>>> LET is that there’s no impetus to improve the former. So the question was >>>> really whether we bother to improve it anyway, not whether or not they >>>> would be good improvements (I think they obviously are). >>>> >>>>> I think this can be safely deferred. Most people would again separate it >>>>> into separate LETs. >>>>> >>>> That implies we’ll permit deconstructing a tuple variable in a LET. This >>>> makes sense to me, but is roughly equivalent to nested deconstruction. It >>>> might be that v1 we only support deconstructing SELECT statements, but I >>>> guess all of this is probably up to the implementor. >>>>> I'd add (to the specification) that LETs cannot override a previously >>>>> defined variable, just to reduce ambiguity. >>>>> >>>> >>>> Yep, this was already agreed way back with the earlier proposal. >>>> >>>> >>>>> On 14 Aug 2022, at 16:30, Avi Kivity <a...@scylladb.com> wrote: >>>>> >>>>> >>>>> >>>>> On 14/08/2022 17.50, Benedict Elliott Smith wrote: >>>>>> >>>>>> > SELECT and LET incompatible once comparisons become valid selectors >>>>>> >>>>>> I don’t think this would be ambiguous, as = is required in the LET >>>>>> syntax as we have to bind the result to a variable name. >>>>>> >>>>>> But, I like the deconstructed tuple syntax improvement over “Option 6”. >>>>>> This would also seem to easily support assigning from non-query >>>>>> statements, such as LET (a, b) = (someFunc(), someOtherFunc(?)) >>>>>> >>>>>> I don’t think it is ideal to depend on relative position in the tuple >>>>>> for assigning results to a variable name, as it leaves more scope for >>>>>> errors. It would be nice to have a simple way to deconstruct safely. >>>>>> But, I think this proposal is good, and I’d be fine with it as an >>>>>> alternative if others concur. I agree that seeing the SELECT >>>>>> independently may be more easily recognisable to users. >>>>>> >>>>>> With this approach there remains the question of how we handle single >>>>>> column results. I’d be inclined to treat in the following way: >>>>>> >>>>>> LET (a) = SELECT val FROM table >>>>>> IF a > 1 THEN... >>>>>> >>>>>> LET a = SELECT val FROM table >>>>>> IF a.val > 1 THEN... >>>>>> >>>>> >>>>> I think SQL dialects require subqueries to be parenthesized (not sure). >>>>> If that's the case I think we should keep the tradition. >>>>> >>>>> >>>>> >>>>>> ---- >>>>>> There is also the question of whether we support SELECT without a FROM >>>>>> clause, e.g. >>>>>> LET x = SELECT someFunc() AS v1, someOtherFunc() AS v2 >>>>>> >>>>>> Or just LET (since they are no longer equivalent) >>>>>> e.g. >>>>>> LET x = (someFunc() AS v1, someOtherFunc() as v2) >>>>>> LET (v1, v2) = (someFunc(), someOtherFunc()) >>>>>> >>>>> >>>>> I see no harm in making FROM optional, as it's recognized by other SQL >>>>> dialects. >>>>> >>>>> >>>>> >>>>>> ---- >>>>>> Also since LET is only binding variables, is there any reason we >>>>>> shouldn’t support multiple SELECT assignments in a single LET?, e.g. >>>>>> LET (x, y) = ((SELECT x FROM…), (SELECT y FROM)) >>>>>> >>>>> >>>>> What if an inner select returns a tuple? Would y be a tuple? >>>>> >>>>> >>>>> >>>>> I think this is redundant and atypical enough to not be worth supporting. >>>>> Most people would use separate LETs. >>>>> >>>>> >>>>> >>>>>> ---- >>>>>> Also whether we support tuples in SELECT statements anyway, e.g. >>>>>> LET (tuple1, tuple2) = SELECT (a, b), (c, d) FROM.. >>>>>> IF tuple1.a > 1 AND tuple2.d > 1… >>>>> >>>>> Absolutely, this just flows naturally from having tuples. There's no >>>>> difference between "SELECT (a, b)" and "SELECT a_but_a_is_a_tuple". >>>>> >>>>> >>>>> >>>>>> >>>>>> ---- >>>>>> and whether we support nested deconstruction, e.g. >>>>>> LET (a, b, (c, d)) = SELECT a, b, someTuple FROM.. >>>>>> IF a > 1 AND d > 1… >>>>>> >>>>> >>>>> I think this can be safely deferred. Most people would again separate it >>>>> into separate LETs. >>>>> >>>>> >>>>> >>>>> I'd add (to the specification) that LETs cannot override a >>>>> previously defined variable, just to reduce ambiguity. >>>>> >>>>> >>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>>> On 14 Aug 2022, at 13:55, Avi Kivity via dev <dev@cassandra.apache.org> >>>>>>> wrote: >>>>>>> >>>>>>> >>>>>>> >>>>>>> On 14/08/2022 01.29, Benedict Elliott Smith wrote: >>>>>>>> >>>>>>>> I’ll do my best to express with my thinking, as well as how I would >>>>>>>> explain the feature to a user. >>>>>>>> >>>>>>>> My mental model for LET statements is that they are simply SELECT >>>>>>>> statements where the columns that are selected become variables >>>>>>>> accessible anywhere in the scope of the transaction. That is to say, >>>>>>>> you should be able to run something like s/LET/SELECT and >>>>>>>> s/([^=]+)=([^,]+)(,|$)/\2 AS \1\3/g on the columns of a LET statement >>>>>>>> and produce a valid SELECT statement, and vice versa. Both should >>>>>>>> perform identically. >>>>>>>> >>>>>>>> e.g. >>>>>>>> SELECT pk AS key, v AS value FROM table >>>>>>>> >>>>>>>> => >>>>>>>> LET key = pk, value = v FROM table >>>>>>> >>>>>>> "=" is a CQL/SQL operator. Cassandra doesn't support it yet, but SQL >>>>>>> supports selecting comparisons: >>>>>>> >>>>>>> >>>>>>> >>>>>>> $ psql >>>>>>> psql (14.3) >>>>>>> Type "help" for help. >>>>>>> >>>>>>> avi=# SELECT 1 = 2, 3 = 3, NULL = NULL; >>>>>>> ?column? | ?column? | ?column? >>>>>>> ----------+----------+---------- >>>>>>> f | t | >>>>>>> (1 row) >>>>>>> >>>>>>> >>>>>>> >>>>>>> Using "=" as a syntactic element in LET would make SELECT and LET >>>>>>> incompatible once comparisons become valid selectors. Unless they >>>>>>> become mandatory (and then you'd write "LET q = a = b" if you wanted to >>>>>>> select a comparison). >>>>>>> >>>>>>> >>>>>>> >>>>>>> I personally prefer the nested query syntax: >>>>>>> >>>>>>> >>>>>>> >>>>>>> LET (a, b, c) = (SELECT foo, bar, x+y FROM ...); >>>>>>> >>>>>>> >>>>>>> >>>>>>> So there aren't two similar-but-not-quite-the-same syntaxes. SELECT is >>>>>>> immediately recognizable by everyone as a query, LET is not. >>>>>>> >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> Identical form, identical behaviour. Every statement should be >>>>>>>> directly translatable with some simple text manipulation. >>>>>>>> >>>>>>>> We can then make this more powerful for users by simply expanding >>>>>>>> SELECT statements, e.g. by permitting them to declare constants and >>>>>>>> tuples in the column results. In this scheme LET x = * is simply >>>>>>>> syntactic sugar for LET x = (pk, ck, field1, …) This scheme then >>>>>>>> supports options 2, 4 and 5 all at once, consistently alongside each >>>>>>>> other. >>>>>>>> >>>>>>>> Option 6 is in fact very similar, but is strictly less flexible for >>>>>>>> the user as they have no way to declare multiple scalar variables >>>>>>>> without scoping them inside a tuple. >>>>>>>> >>>>>>>> e.g. >>>>>>>> LET key = pk, value = v FROM table >>>>>>>> IF key > 1 AND value > 1 THEN... >>>>>>>> >>>>>>>> => >>>>>>>> LET row = SELECT pk AS key, v AS value FROM table >>>>>>>> IF row.key > 1 AND row.value > 1 THEN… >>>>>>>> >>>>>>>> However, both are expressible in the existing proposal, as if you >>>>>>>> prefer this naming scheme you can simply write >>>>>>>> >>>>>>>> LET row = (pk AS key, v AS value) FROM table >>>>>>>> IF row.key > 1 AND row.value > 1 THEN… >>>>>>>> >>>>>>>> With respect to auto converting single column results to a scalar, we >>>>>>>> do need a way for the user to say they care whether the row was null >>>>>>>> or the column. I think an implicit conversion here could be >>>>>>>> surprising. However we could implement tuple expressions anyway and >>>>>>>> let the user explicitly declare v as a tuple as Caleb has suggested >>>>>>>> for the existing proposal as well. >>>>>>>> >>>>>>>> Assigning constants or other values not selected from a table would >>>>>>>> also be a little clunky: >>>>>>>> >>>>>>>> LET v1 = someFunc(), v2 = someOtherFunc(?) >>>>>>>> IF v1 > 1 AND v2 > 1 THEN… >>>>>>>> >>>>>>>> => >>>>>>>> LET row = SELECT someFunc() AS v1, someOtherFunc(?) AS v2 >>>>>>>> IF row.v1 > 1 AND row.v2 > 1 THEN... >>>>>>>> >>>>>>>> That said, the proposals are close to identical, it is just slightly >>>>>>>> more verbose and slightly less flexible. >>>>>>>> >>>>>>>> Which one would be most intuitive to users is hard to predict. It >>>>>>>> might be that Option 6 would be slightly easier, but I’m unsure if >>>>>>>> there would be a huge difference. >>>>>>>> >>>>>>>> >>>>>>>>> On 13 Aug 2022, at 16:59, Patrick McFadin <pmcfa...@gmail.com> wrote: >>>>>>>>> >>>>>>>>> I'm really happy to see CEP-15 getting closer to a final >>>>>>>>> implementation. I'm going to walk through my reasoning for your >>>>>>>>> proposals wrt trying to explain this to somebody new. >>>>>>>>> >>>>>>>>> Looking at all the options, the first thing that comes up for me is >>>>>>>>> the Cassandra project's complicated relationship with NULL. We have >>>>>>>>> prior art with EXISTS/NOT EXISTS when creating new tables. IS NULL/IS >>>>>>>>> NOT NULL is used in materialized views similarly to proposals 2,4 and >>>>>>>>> 5. >>>>>>>>> >>>>>>>>> CREATE MATERIALIZED VIEW [ IF NOT EXISTS ] [keyspace_name.]view_name >>>>>>>>> AS SELECT [ (column_list) ] >>>>>>>>> FROM [keyspace_name.]table_name >>>>>>>>> [ WHERE column_name IS NOT NULL >>>>>>>>> [ AND column_name IS NOT NULL ... ] ] >>>>>>>>> [ AND relation [ AND ... ] ] >>>>>>>>> PRIMARY KEY ( column_list ) >>>>>>>>> [ WITH [ table_properties ] >>>>>>>>> [ [ AND ] CLUSTERING ORDER BY (cluster_column_name order_option) ] >>>>>>>>> ] ; >>>>>>>>> >>>>>>>>> Based on that, I believe 1 and 3 would just confuse users, so -1 on >>>>>>>>> those. >>>>>>>>> >>>>>>>>> Trying to explain the difference between row and column operations >>>>>>>>> with LET, I can't see the difference between a row and column in #2. >>>>>>>>> >>>>>>>>> #4 introduces a boolean instead of column names and just adds more >>>>>>>>> syntax. >>>>>>>>> >>>>>>>>> #5 is verbose and, in my opinion, easier to reason when writing a >>>>>>>>> query. Thinking top down, I need to know if these exact rows and/or >>>>>>>>> column values exist before changing them, so I'll define them first. >>>>>>>>> Then I'll iterate over the state I created in my actual changes so I >>>>>>>>> know I'm changing precisely what I want. >>>>>>>>> >>>>>>>>> #5 could use a bit more to be clearer to somebody who doesn't write >>>>>>>>> CQL queries daily and wouldn't require memorizing subtle differences. >>>>>>>>> It should be similar to all the other syntax, so learning a little >>>>>>>>> about CQL will let you move into more without completely re-learning >>>>>>>>> the new syntax. >>>>>>>>> >>>>>>>>> So I propose #6) >>>>>>>>> BEGIN TRANSACTION >>>>>>>>> LET row1 = SELECT * FROM ks.tbl WHERE k=0 AND c=0; <-- * selects >>>>>>>>> all columns >>>>>>>>> LET row2 = SELECT v FROM ks.tbl WHERE k=1 AND c=0; >>>>>>>>> SELECT row1, row2 >>>>>>>>> IF row1 IS NULL AND row2.v = 3 THEN >>>>>>>>> INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1); >>>>>>>>> END IF >>>>>>>>> COMMIT TRANSACTION >>>>>>>>> >>>>>>>>> I added the SELECT in the LET just so it's straightforward, you are >>>>>>>>> reading, and it's just like doing a regular select, but you are >>>>>>>>> assigning it to a variable. >>>>>>>>> >>>>>>>>> I removed the confusing 'row1.v' and replaced it with 'row1' I can't >>>>>>>>> see why you would need the '.v' vs having the complete variable I >>>>>>>>> created in the statement above. >>>>>>>>> >>>>>>>>> EOL >>>>>>>>> >>>>>>>>> Patrick >>>>>>>>> >>>>>>>>> On Thu, Aug 11, 2022 at 1:37 PM Caleb Rackliffe >>>>>>>>> <calebrackli...@gmail.com> wrote: >>>>>>>>>> ...and one more option... >>>>>>>>>> >>>>>>>>>> 5.) Introduce tuple assignments, removing all ambiguity around row >>>>>>>>>> vs. column operations. >>>>>>>>>> >>>>>>>>>> BEGIN TRANSACTION >>>>>>>>>> LET row1 = * FROM ks.tbl WHERE k=0 AND c=0; <-- * selects all >>>>>>>>>> columns >>>>>>>>>> LET row2 = (v) FROM ks.tbl WHERE k=1 AND c=0; >>>>>>>>>> SELECT row1.v, row2.v >>>>>>>>>> IF row1 IS NULL AND row2.v = 3 THEN >>>>>>>>>> INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1); >>>>>>>>>> END IF >>>>>>>>>> COMMIT TRANSACTION >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Thu, Aug 11, 2022 at 12:55 PM Caleb Rackliffe >>>>>>>>>> <calebrackli...@gmail.com> wrote: >>>>>>>>>>> via Benedict, here is a 4th option: >>>>>>>>>>> >>>>>>>>>>> 4.) Similar to #2, but don't rely on the key element being NULL. >>>>>>>>>>> >>>>>>>>>>> If the read returns no result, x effectively becomes NULL. >>>>>>>>>>> Otherwise, it remains true/NOT NULL. >>>>>>>>>>> >>>>>>>>>>> BEGIN TRANSACTION >>>>>>>>>>> LET x = true FROM ks.tbl WHERE k=0 AND c=0; >>>>>>>>>>> LET row2_v = v FROM ks.tbl WHERE k=1 AND c=0; >>>>>>>>>>> SELECT x, row2_v >>>>>>>>>>> IF x IS NULL AND row2_v = 3 THEN >>>>>>>>>>> INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1); >>>>>>>>>>> END IF >>>>>>>>>>> COMMIT TRANSACTION >>>>>>>>>>> >>>>>>>>>>> On Thu, Aug 11, 2022 at 12:12 PM Caleb Rackliffe >>>>>>>>>>> <calebrackli...@gmail.com> wrote: >>>>>>>>>>>> Hello again everyone! >>>>>>>>>>>> >>>>>>>>>>>> I've been working on a prototype in CASSANDRA-17719 for a grammar >>>>>>>>>>>> that roughly corresponds to what we've agreed on in this thread. >>>>>>>>>>>> One thing that isn't immediately obvious to me is how the LET >>>>>>>>>>>> syntax handles cases where we want to check for the plain >>>>>>>>>>>> existence of a row in IF. For example, in this hybrid of the >>>>>>>>>>>> originally proposed syntax and something more like what we've >>>>>>>>>>>> agreed on (and the RETURNING just to distinguish >>>>>>>>>>>> between that and SELECT), this >>>>>>>>>>>> could be pretty straightforward: >>>>>>>>>>>> >>>>>>>>>>>> BEGIN TRANSACTION >>>>>>>>>>>> SELECT v FROM ks.tbl WHERE k=0 AND c=0 AS row1; >>>>>>>>>>>> SELECT v FROM ks.tbl WHERE k=1 AND c=0 AS row2; >>>>>>>>>>>> RETURNING row1.v, row2.v >>>>>>>>>>>> IF row1 NOT EXISTS AND row2.v = 3 THEN >>>>>>>>>>>> INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1); >>>>>>>>>>>> END IF >>>>>>>>>>>> COMMIT TRANSACTION >>>>>>>>>>>> >>>>>>>>>>>> The NOT EXISTS operator has row1 to work with. One the other hand, >>>>>>>>>>>> w/ the LET syntax and no naming of reads, it's not clear what the >>>>>>>>>>>> best solution would be. Here are a few possibilities: >>>>>>>>>>>> >>>>>>>>>>>> 1.) Provide a few built-in functions that operate on a whole >>>>>>>>>>>> result row. If we assume a SQL style IS NULL and IS NOT NULL (see >>>>>>>>>>>> my last post here) for operations on particular columns, this >>>>>>>>>>>> probably eliminates the need for EXISTS/NOT EXISTS as well. >>>>>>>>>>>> >>>>>>>>>>>> BEGIN TRANSACTION >>>>>>>>>>>> LET row1_missing = notExists() FROM ks.tbl WHERE k=0 AND c=0; >>>>>>>>>>>> LET row2_v = v FROM ks.tbl WHERE k=1 AND c=0; >>>>>>>>>>>> SELECT row1_missing, row2_v >>>>>>>>>>>> IF row1_missing AND row2_v = 3 THEN >>>>>>>>>>>> INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1); >>>>>>>>>>>> END IF >>>>>>>>>>>> COMMIT TRANSACTION >>>>>>>>>>>> >>>>>>>>>>>> 2.) Assign and check the first primary key element to determine >>>>>>>>>>>> whether the row exists. >>>>>>>>>>>> >>>>>>>>>>>> BEGIN TRANSACTION >>>>>>>>>>>> LET row1_k = k FROM ks.tbl WHERE k=0 AND c=0; >>>>>>>>>>>> LET row2_v = v FROM ks.tbl WHERE k=1 AND c=0; >>>>>>>>>>>> SELECT row1_k, row2_v >>>>>>>>>>>> IF row1_k IS NULL AND row2_v = 3 THEN >>>>>>>>>>>> INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1); >>>>>>>>>>>> END IF >>>>>>>>>>>> COMMIT TRANSACTION >>>>>>>>>>>> >>>>>>>>>>>> 3.) Reconsider the LET concept toward something >>>>>>>>>>>> that allows us to explicitly >>>>>>>>>>>> name our reads again. >>>>>>>>>>>> >>>>>>>>>>>> BEGIN TRANSACTION >>>>>>>>>>>> WITH (SELECT v FROM ks.tbl WHERE k=0 AND c=0) AS row1; >>>>>>>>>>>> WITH (SELECT v FROM ks.tbl WHERE k=1 AND c=0) AS row2; >>>>>>>>>>>> SELECT row1.v, row2.v >>>>>>>>>>>> IF row1 NOT EXISTS AND row2.v = 3 THEN >>>>>>>>>>>> INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1); >>>>>>>>>>>> END IF >>>>>>>>>>>> COMMIT TRANSACTION >>>>>>>>>>>> >>>>>>>>>>>> I don't have a strong affinity for any of these, although #1 seems >>>>>>>>>>>> the most awkward. >>>>>>>>>>>> >>>>>>>>>>>> Does anyone have any other alternatives? Preference for one of the >>>>>>>>>>>> above options? >>>>>>>>>>>> >>>>>>>>>>>> Thanks! >>>>>>>>>>>> >>>>>>>>>>>> On Fri, Jul 22, 2022 at 11:21 AM Caleb Rackliffe >>>>>>>>>>>> <calebrackli...@gmail.com> wrote: >>>>>>>>>>>>> Avi brought up an interesting point around NULLness checking in >>>>>>>>>>>>> CASSANDRA-17762... >>>>>>>>>>>>> >>>>>>>>>>>>>> In SQL, any comparison with NULL is NULL, which is interpreted >>>>>>>>>>>>>> as FALSE in a condition. To test for NULLness, you use IS NULL >>>>>>>>>>>>>> or IS NOT NULL. But LWT uses IF col = NULL as a NULLness test. >>>>>>>>>>>>>> This is likely to confuse people coming from SQL and hamper >>>>>>>>>>>>>> attempts to extend the dialect. >>>>>>>>>>>>> >>>>>>>>>>>>> We can leave that Jira open to address what to do in the legacy >>>>>>>>>>>>> LWT case, but I'd support a SQL-congruent syntax here (IS NULL or >>>>>>>>>>>>> IS NOT NULL), where we have something closer to a blank slate. >>>>>>>>>>>>> >>>>>>>>>>>>> Thoughts? >>>>>>>>>>>>> >>>>>>>>>>>>> On Thu, Jun 30, 2022 at 6:25 PM Abe Ratnofsky <a...@aber.io> >>>>>>>>>>>>> wrote: >>>>>>>>>>>>>> The new syntax looks great, and I’m really excited to see this >>>>>>>>>>>>>> coming together. >>>>>>>>>>>>>> >>>>>>>>>>>>>> One piece of feedback on the proposed syntax is around the use >>>>>>>>>>>>>> of “=“ as a declaration in addition to its current use as an >>>>>>>>>>>>>> equality operator in a WHERE clause and an assignment operator >>>>>>>>>>>>>> in an UPDATE: >>>>>>>>>>>>>> >>>>>>>>>>>>>> BEGIN TRANSACTION >>>>>>>>>>>>>> LET car_miles = miles_driven, car_is_running = is_running FROM >>>>>>>>>>>>>> cars WHERE model=’pinto’ >>>>>>>>>>>>>> LET user_miles = miles_driven FROM users WHERE name=’blake’ >>>>>>>>>>>>>> SELECT something else from some other table >>>>>>>>>>>>>> IF NOT car_is_running THEN ABORT >>>>>>>>>>>>>> UPDATE users SET miles_driven = user_miles + 30 WHERE >>>>>>>>>>>>>> name='blake'; >>>>>>>>>>>>>> UPDATE cars SET miles_driven = car_miles + 30 WHERE >>>>>>>>>>>>>> model='pinto'; >>>>>>>>>>>>>> COMMIT TRANSACTION >>>>>>>>>>>>>> >>>>>>>>>>>>>> This is supported in languages like PL/pgSQL, but in a normal >>>>>>>>>>>>>> SQL query kind of local declaration is often expressed as an >>>>>>>>>>>>>> alias (SELECT col AS new_col), subquery alias (SELECT col) t, or >>>>>>>>>>>>>> common table expression (WITH t AS (SELECT col)). >>>>>>>>>>>>>> >>>>>>>>>>>>>> Here’s an example of an alternative to the proposed syntax that >>>>>>>>>>>>>> I’d find more readable: >>>>>>>>>>>>>> >>>>>>>>>>>>>> BEGIN TRANSACTION >>>>>>>>>>>>>> WITH car_miles, car_is_running AS (SELECT miles_driven, >>>>>>>>>>>>>> is_running FROM cars WHERE model=’pinto’), >>>>>>>>>>>>>> user_miles AS (SELECT miles_driven FROM users WHERE >>>>>>>>>>>>>> name=’blake’) >>>>>>>>>>>>>> IF NOT car_is_running THEN ABORT >>>>>>>>>>>>>> UPDATE users SET miles_driven = user_miles + 30 WHERE >>>>>>>>>>>>>> name='blake'; >>>>>>>>>>>>>> UPDATE cars SET miles_driven = car_miles + 30 WHERE >>>>>>>>>>>>>> model='pinto'; >>>>>>>>>>>>>> COMMIT TRANSACTION >>>>>>>>>>>>>> >>>>>>>>>>>>>> There’s also the option of naming the transaction like a >>>>>>>>>>>>>> subquery, and supporting LET via AS (this one I’m less sure >>>>>>>>>>>>>> about but wanted to propose anyway): >>>>>>>>>>>>>> >>>>>>>>>>>>>> BEGIN TRANSACTION t1 >>>>>>>>>>>>>> SELECT miles_driven AS t1.car_miles, is_running AS >>>>>>>>>>>>>> t1.car_is_running FROM cars WHERE model=’pinto’; >>>>>>>>>>>>>> SELECT miles_driven AS t1.user_miles FROM users WHERE >>>>>>>>>>>>>> name=’blake’; >>>>>>>>>>>>>> IF NOT car_is_running THEN ABORT >>>>>>>>>>>>>> UPDATE users SET miles_driven = user_miles + 30 WHERE >>>>>>>>>>>>>> name='blake'; >>>>>>>>>>>>>> UPDATE cars SET miles_driven = car_miles + 30 WHERE >>>>>>>>>>>>>> model='pinto'; >>>>>>>>>>>>>> COMMIT TRANSACTION >>>>>>>>>>>>>> >>>>>>>>>>>>>> This also has the benefit of resolving ambiguity in case of >>>>>>>>>>>>>> naming conflicts with existing (or future) column names. >>>>>>>>>>>>>> >>>>>>>>>>>>>> -- >>>>>>>>>>>>>> Abe