Re: CEP-15 multi key transaction syntax

Benedict Elliott Smith Sun, 14 Aug 2022 10:25:38 -0700



> 
> Verbose version:
> LET (a) = SELECT val FROM table
> IF a > 1 THEN...
> 
> Less verbose version:
> LET a = SELECT val FROM table
> IF a.val > 1 THEN...



My intention is that these are actually two different ways of expressing the 
same thing, both supported and neither intended to be more or less verbose than 
the other. The advantage of permitting both is that you can also write

LET a = SELECT val FROM table
IF a IS NOT NULL AND a.val IS NULL THEN …

> Alternatively, for non-queries:
> LET x = SELECT someFunc() AS v1, someOtherFunc() AS v2
> or less verbose:
> LET x = (someFunc() AS v1, someOtherFunc() as v2)
> LET (v1, v2) = (someFunc(), someOtherFunc())

I personally prefer clarity over any arbitrary verbosity/succinct distinction, 
but we’re in general “taste” territory here. Since this syntax includes the 
SELECT on the RHS, it makes sense to only require this for situations where a 
query is being performed. Though I think if SELECT without a FROM is supported 
then we will likely end up supporting all of the above.

> Weighing in on the "SELECT without a FROM," I think that is fine and, as Avi 
> stated

Yep, definitely fine. Question is just whether we bother to offer it. Also, 
evidently, whether we support LET without a SELECT on the RHS. I am strongly in 
favour of this, as requiring a SELECT even when there’s no table involved is 
counter-intuitive to me, as LET is now a distinct concept that looks like 
variable declaration in other languages.

> Nested:
> LET (x, y) = SELECT x, y FROM…

Deconstruction here refers to the above, i.e. extracting variables x and y from 
the tuple on the RHS

Nesting is just a question of whether we support either nested tuple 
declarations, or nested deconstruction, which might include any of the 
following:

LET (x, (y, z)) = SELECT (x, (y, z)) FROM…
LET (x, (y, z)) = SELECT x, someTuple FROM…
LET (x, (y, z)) = (SELECT x FROM.., SELECT y, x FROM…))
LET (x, (y, z)) = (someFunc(), SELECT y, z FROM…)
LET (x, yAndZ) = (someFunc(), SELECT y, z FROM…)

IMO, once you start supporting features they need to be sort of intuitively 
discoverable by users, so that a concept can be used in all places you might 
expect.

But I would be fine with an arbitrary restriction of at most one SELECT on the 
RHS, or even ONLY a SELECT or some other tuple, and at most one level of 
deconstruction of the RHS.





> On 14 Aug 2022, at 18:04, Patrick McFadin <[email protected]> wrote:
> 
> Let me just state my bias right up front. For any kind of QL I lean heavily 
> toward verbose and explicit based on their lifecycle. A CQL query will 
> probably need to be understood by the next person looking at it, and a few 
> seconds saved typing isn't worth the potential misunderstanding later.  My 
> opinion is formed by having to be the second person many times.  :D 
> 
> I just want to make sure I have the syntax you are proposing. 
> 
> Verbose version:
> LET (a) = SELECT val FROM table
> IF a > 1 THEN...
> 
> Less verbose version:
> LET a = SELECT val FROM table
> IF a.val > 1 THEN...
> 
> Alternatively, for non-queries:
> LET x = SELECT someFunc() AS v1, someOtherFunc() AS v2
> or less verbose:
> LET x = (someFunc() AS v1, someOtherFunc() as v2)
> LET (v1, v2) = (someFunc(), someOtherFunc())
> 
> Weighing in on the "SELECT without a FROM," I think that is fine and, as Avi 
> stated, already present in the SQL world. I would prefer that over 'SELECT  
> func() FROM dual;' (Looking at you, Oracle)
> 
> Finally, on the topic of deconstructing SELECT statements instead of nesting. 
> If I understand the argument here, I would favor deconstructing over nesting 
> if there is a choice. I think this is what that choice would look like.
> 
> Deconstructed:
> LET x = SELECT x FROM ...
> LET y = SELECT y FROM ...
> 
> Nested:
> LET (x, y) = ((SELECT x FROM…), (SELECT y FROM))
> 
> I'm trying to summate but let me know if I missed something. I apologize in 
> advance to Monday morning Caleb, who will have to digest this thread. 
> 
> Patrick
> 
> On Sun, Aug 14, 2022 at 9:00 AM Benedict Elliott Smith <[email protected]> 
> wrote:
>> 
>>> 
>>> I think SQL dialects require subqueries to be parenthesized (not sure). If 
>>> that's the case I think we should keep the tradition.
>>> 
>> 
>> This isn’t a sub-query though, since LET is not a query. If we permit at 
>> most one SELECT, and do not permit mixing SELECT with constant assignments, 
>> I don’t see why we would require parentheses.
>> 
>>> I see no harm in making FROM optional, as it's recognized by other SQL 
>>> dialects.
>>> 
>>> Absolutely, this just flows naturally from having tuples. There's no 
>>> difference between "SELECT (a, b)" and "SELECT a_but_a_is_a_tuple”.
>> 
>> Neither of these things are supported today, and they’re no longer necessary 
>> with this syntax proposal. The downside of splitting SELECT and LET is that 
>> there’s no impetus to improve the former. So the question was really whether 
>> we bother to improve it anyway, not whether or not they would be good 
>> improvements (I think they obviously are).
>> 
>>> I think this can be safely deferred. Most people would again separate it 
>>> into separate LETs.
>>> 
>> That implies we’ll permit deconstructing a tuple variable in a LET. This 
>> makes sense to me, but is roughly equivalent to nested deconstruction. It 
>> might be that v1 we only support deconstructing SELECT statements, but I 
>> guess all of this is probably up to the implementor.
>>> I'd add (to the specification) that LETs cannot override a previously 
>>> defined variable, just to reduce ambiguity.
>>> 
>> 
>> Yep, this was already agreed way back with the earlier proposal.
>> 
>> 
>>> On 14 Aug 2022, at 16:30, Avi Kivity <[email protected]> wrote:
>>> 
>>> 
>>> 
>>> On 14/08/2022 17.50, Benedict Elliott Smith wrote:
>>>> 
>>>> > SELECT and LET incompatible once comparisons become valid selectors
>>>> 
>>>> I don’t think this would be ambiguous, as = is required in the LET syntax 
>>>> as we have to bind the result to a variable name.
>>>> 
>>>> But, I like the deconstructed tuple syntax improvement over “Option 6”. 
>>>> This would also seem to easily support assigning from non-query 
>>>> statements, such as LET (a, b) = (someFunc(), someOtherFunc(?))
>>>> 
>>>> I don’t think it is ideal to depend on relative position in the tuple for 
>>>> assigning results to a variable name, as it leaves more scope for errors. 
>>>> It would be nice to have a simple way to deconstruct safely. But, I think 
>>>> this proposal is good, and I’d be fine with it as an alternative if others 
>>>> concur. I agree that seeing the SELECT independently may be more easily 
>>>> recognisable to users.
>>>> 
>>>> With this approach there remains the question of how we handle single 
>>>> column results. I’d be inclined to treat in the           following way:
>>>> 
>>>> LET (a) = SELECT val FROM table
>>>> IF a > 1 THEN...
>>>> 
>>>> LET a = SELECT val FROM table
>>>> IF a.val > 1 THEN...
>>>> 
>>> 
>>> I think SQL dialects require subqueries to be parenthesized (not sure). If 
>>> that's the case I think we should keep the tradition.
>>> 
>>> 
>>> 
>>>> ----
>>>> There is also the question of whether we support SELECT           without 
>>>> a FROM clause, e.g.
>>>> LET x = SELECT someFunc() AS v1, someOtherFunc() AS v2
>>>> 
>>>> Or just LET (since they are no longer equivalent)
>>>> e.g.
>>>> LET x = (someFunc() AS v1, someOtherFunc() as v2)
>>>> LET (v1, v2) = (someFunc(), someOtherFunc())
>>>> 
>>> 
>>> I see no harm in making FROM optional, as it's recognized by other SQL 
>>> dialects.
>>> 
>>> 
>>> 
>>>> ----
>>>> Also since LET is only binding variables, is there any reason we shouldn’t 
>>>> support multiple SELECT assignments in a single LET?, e.g.
>>>> LET (x, y) = ((SELECT x FROM…), (SELECT y FROM))
>>>> 
>>> 
>>> What if an inner select returns a tuple? Would y be a tuple?
>>> 
>>> 
>>> 
>>> I think this is redundant and atypical enough to not be worth supporting. 
>>> Most people would use separate LETs.
>>> 
>>> 
>>> 
>>>> ----
>>>> Also whether we support tuples in SELECT statements anyway, e.g.
>>>> LET (tuple1, tuple2) = SELECT (a, b), (c, d) FROM..
>>>> IF tuple1.a > 1 AND tuple2.d > 1…
>>> 
>>> Absolutely, this just flows naturally from having tuples. There's no 
>>> difference between "SELECT (a, b)" and "SELECT a_but_a_is_a_tuple".
>>> 
>>> 
>>> 
>>>> 
>>>> ----
>>>> and whether we support nested deconstruction, e.g.
>>>> LET (a, b, (c, d)) = SELECT a, b, someTuple FROM..
>>>> IF a > 1 AND d > 1…
>>>> 
>>> 
>>> I think this can be safely deferred. Most people would again separate it 
>>> into separate LETs.
>>> 
>>> 
>>> 
>>> I'd add (to the specification) that LETs cannot override a previously 
>>> defined variable, just to reduce ambiguity.
>>> 
>>> 
>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>>> On 14 Aug 2022, at 13:55, Avi Kivity via dev <[email protected]> 
>>>>> wrote:
>>>>> 
>>>>> 
>>>>> 
>>>>> On 14/08/2022 01.29, Benedict Elliott Smith wrote:
>>>>>> 
>>>>>> I’ll do my best to express with my thinking, as well as how I would 
>>>>>> explain the feature to a user.
>>>>>> 
>>>>>> My mental model for LET statements is that they are simply SELECT 
>>>>>> statements where the columns that are selected become variables 
>>>>>> accessible anywhere in the scope of the transaction. That is to say, you 
>>>>>> should be able to run something like s/LET/SELECT and 
>>>>>> s/([^=]+)=([^,]+)(,|$)/\2 AS \1\3/g on the columns of a LET statement 
>>>>>> and produce a valid SELECT statement, and vice versa. Both should 
>>>>>> perform identically.
>>>>>> 
>>>>>> e.g. 
>>>>>> SELECT pk AS key, v AS value FROM table 
>>>>>> 
>>>>>> => 
>>>>>> LET key = pk, value = v FROM table
>>>>> 
>>>>> "=" is a CQL/SQL operator. Cassandra doesn't support it yet, but SQL 
>>>>> supports selecting comparisons:
>>>>> 
>>>>> 
>>>>> 
>>>>> $ psql
>>>>> psql (14.3)
>>>>> Type "help" for help.
>>>>> 
>>>>> avi=# SELECT 1 = 2, 3 = 3, NULL = NULL;
>>>>>  ?column? | ?column? | ?column? 
>>>>> ----------+----------+----------
>>>>>  f        | t        | 
>>>>> (1 row)
>>>>> 
>>>>> 
>>>>> 
>>>>> Using "=" as a syntactic element in LET would make SELECT and LET 
>>>>> incompatible once comparisons become valid selectors. Unless they become 
>>>>> mandatory (and then you'd write "LET q = a = b" if you wanted to select a 
>>>>> comparison).
>>>>> 
>>>>> 
>>>>> 
>>>>> I personally prefer the nested query syntax:
>>>>> 
>>>>> 
>>>>> 
>>>>>     LET (a, b, c) = (SELECT foo, bar, x+y FROM ...);
>>>>> 
>>>>> 
>>>>> 
>>>>> So there aren't two similar-but-not-quite-the-same syntaxes. SELECT is 
>>>>> immediately recognizable by everyone as a query, LET is not.
>>>>> 
>>>>> 
>>>>> 
>>>>>> 
>>>>>> Identical form, identical behaviour. Every statement should be directly 
>>>>>> translatable with some simple text manipulation.
>>>>>> 
>>>>>> We can then make this more powerful for users by simply expanding SELECT 
>>>>>> statements, e.g. by permitting them to declare constants and tuples in 
>>>>>> the column results. In this scheme LET x = * is simply syntactic sugar 
>>>>>> for LET x = (pk, ck, field1, …) This scheme then supports options 2, 4 
>>>>>> and 5 all at once, consistently alongside each other.
>>>>>> 
>>>>>> Option 6 is in fact very similar, but is strictly less flexible for the 
>>>>>> user as they have no way to declare multiple scalar variables without 
>>>>>> scoping them inside a tuple.
>>>>>> 
>>>>>> e.g.
>>>>>> LET key = pk, value = v FROM table
>>>>>> IF key > 1 AND value > 1 THEN...
>>>>>> 
>>>>>> =>
>>>>>> LET row = SELECT pk AS key, v AS value FROM table
>>>>>> IF row.key > 1 AND row.value > 1 THEN…
>>>>>> 
>>>>>> However, both are expressible in the existing proposal, as if you prefer 
>>>>>> this naming scheme you can simply write
>>>>>> 
>>>>>> LET row = (pk AS key, v AS value) FROM table
>>>>>> IF row.key > 1 AND row.value > 1 THEN…
>>>>>> 
>>>>>> With respect to auto converting single column results to a scalar, we do 
>>>>>> need a way for the user to say they care whether the row was null or the 
>>>>>> column. I think an implicit conversion here could be surprising. However 
>>>>>> we could implement tuple expressions anyway and let the user explicitly 
>>>>>> declare v as a tuple as Caleb has suggested for the existing proposal as 
>>>>>> well.
>>>>>> 
>>>>>> Assigning constants or other values not selected from a table would also 
>>>>>> be a little clunky:
>>>>>> 
>>>>>> LET v1 = someFunc(), v2 = someOtherFunc(?)
>>>>>> IF v1 > 1 AND v2 > 1 THEN…
>>>>>> 
>>>>>> =>
>>>>>> LET row = SELECT someFunc() AS v1, someOtherFunc(?) AS v2
>>>>>> IF row.v1 > 1 AND row.v2 > 1 THEN...
>>>>>> 
>>>>>> That said, the proposals are close to identical, it is just slightly 
>>>>>> more verbose and slightly less flexible.
>>>>>> 
>>>>>> Which one would be most intuitive to users is hard to predict. It might 
>>>>>> be that Option 6 would be slightly easier, but I’m unsure if there would 
>>>>>> be a huge difference.
>>>>>> 
>>>>>> 
>>>>>>> On 13 Aug 2022, at 16:59, Patrick McFadin <[email protected]> wrote:
>>>>>>> 
>>>>>>> I'm really happy to see CEP-15 getting closer to a final 
>>>>>>> implementation. I'm going to walk through my reasoning for your 
>>>>>>> proposals wrt trying to explain this to somebody new. 
>>>>>>> 
>>>>>>> Looking at all the options, the first thing that comes up for me is the 
>>>>>>> Cassandra project's complicated relationship with NULL.  We have prior 
>>>>>>> art with EXISTS/NOT EXISTS when creating new tables. IS NULL/IS NOT 
>>>>>>> NULL is used in materialized views similarly to proposals 2,4 and 5. 
>>>>>>> 
>>>>>>> CREATE MATERIALIZED VIEW [ IF NOT EXISTS ] [keyspace_name.]view_name
>>>>>>>   AS SELECT [ (column_list) ]
>>>>>>>   FROM [keyspace_name.]table_name
>>>>>>>   [ WHERE column_name IS NOT NULL
>>>>>>>   [ AND column_name IS NOT NULL ... ] ]
>>>>>>>   [ AND relation [ AND ... ] ] 
>>>>>>>   PRIMARY KEY ( column_list )
>>>>>>>   [ WITH [ table_properties ]
>>>>>>>   [ [ AND ] CLUSTERING ORDER BY (cluster_column_name order_option) ] ] ;
>>>>>>> 
>>>>>>>  Based on that, I believe 1 and 3 would just confuse users, so -1 on 
>>>>>>> those. 
>>>>>>> 
>>>>>>> Trying to explain the difference between row and column operations with 
>>>>>>> LET, I can't see the difference between a row and column in #2. 
>>>>>>> 
>>>>>>> #4 introduces a boolean instead of column names and just adds more 
>>>>>>> syntax.
>>>>>>> 
>>>>>>> #5 is verbose and, in my opinion, easier to reason when writing a 
>>>>>>> query. Thinking top down, I need to know if these exact rows and/or 
>>>>>>> column values exist before changing them, so I'll define them first. 
>>>>>>> Then I'll                                       iterate over the state 
>>>>>>> I created in my actual changes so I know I'm changing precisely what I 
>>>>>>> want. 
>>>>>>> 
>>>>>>> #5 could use a bit more to be clearer to somebody who doesn't write CQL 
>>>>>>> queries daily and wouldn't require memorizing subtle differences. It 
>>>>>>> should be similar to all the other syntax, so learning a little about 
>>>>>>> CQL will let you move into more without completely re-learning the new 
>>>>>>> syntax.  
>>>>>>> 
>>>>>>> So I propose #6)
>>>>>>> BEGIN TRANSACTION
>>>>>>>   LET row1 = SELECT * FROM ks.tbl WHERE k=0 AND c=0; <-- * selects all 
>>>>>>> columns
>>>>>>>   LET row2 = SELECT v FROM ks.tbl WHERE k=1 AND c=0;
>>>>>>>   SELECT row1, row2
>>>>>>>   IF row1 IS NULL AND row2.v = 3 THEN
>>>>>>>     INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1);
>>>>>>>   END IF
>>>>>>> COMMIT TRANSACTION
>>>>>>> 
>>>>>>> I added the SELECT in the LET just so it's straightforward, you are 
>>>>>>> reading, and it's just like doing a regular select, but you are 
>>>>>>> assigning it to a variable. 
>>>>>>> 
>>>>>>> I removed the confusing 'row1.v' and replaced it with 'row1' I can't 
>>>>>>> see why you would need the '.v' vs having the complete variable I 
>>>>>>> created in the statement above. 
>>>>>>> 
>>>>>>> EOL
>>>>>>> 
>>>>>>> Patrick
>>>>>>> 
>>>>>>> On Thu, Aug 11, 2022 at 1:37 PM Caleb Rackliffe 
>>>>>>> <[email protected]> wrote:
>>>>>>>> ...and one more option...
>>>>>>>> 
>>>>>>>> 5.) Introduce tuple assignments, removing all ambiguity around row vs. 
>>>>>>>> column operations.
>>>>>>>> 
>>>>>>>> BEGIN TRANSACTION
>>>>>>>>   LET row1 = * FROM ks.tbl WHERE k=0 AND c=0; <-- * selects all columns
>>>>>>>>   LET row2 = (v) FROM ks.tbl WHERE k=1 AND c=0;
>>>>>>>>   SELECT row1.v, row2.v
>>>>>>>>   IF row1 IS NULL AND row2.v = 3 THEN
>>>>>>>>     INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1);
>>>>>>>>   END IF
>>>>>>>> COMMIT TRANSACTION
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Thu, Aug 11, 2022 at 12:55 PM Caleb Rackliffe 
>>>>>>>> <[email protected]> wrote:
>>>>>>>>> via Benedict, here is a 4th option:
>>>>>>>>> 
>>>>>>>>> 4.) Similar to #2, but don't rely on the key element being NULL.
>>>>>>>>> 
>>>>>>>>> If the read returns no result, x effectively becomes NULL. Otherwise, 
>>>>>>>>> it remains true/NOT NULL.
>>>>>>>>> 
>>>>>>>>> BEGIN TRANSACTION
>>>>>>>>>   LET x = true FROM ks.tbl WHERE k=0 AND c=0;
>>>>>>>>>   LET row2_v = v FROM ks.tbl WHERE k=1 AND c=0;
>>>>>>>>>   SELECT x, row2_v
>>>>>>>>>   IF x IS NULL AND row2_v = 3 THEN
>>>>>>>>>     INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1);
>>>>>>>>>   END IF
>>>>>>>>> COMMIT TRANSACTION
>>>>>>>>> 
>>>>>>>>> On Thu, Aug 11, 2022 at 12:12 PM Caleb Rackliffe 
>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>> Hello again everyone!
>>>>>>>>>> 
>>>>>>>>>> I've been working on a prototype in CASSANDRA-17719 for a grammar 
>>>>>>>>>> that roughly corresponds to what we've agreed on in this thread. One 
>>>>>>>>>> thing that isn't immediately obvious to me is how the LET syntax 
>>>>>>>>>> handles cases where we want to check for the plain existence of a 
>>>>>>>>>> row in IF. For example, in this hybrid of the originally proposed 
>>>>>>>>>> syntax and something more like what we've agreed on (and the         
>>>>>>>>>>                                               RETURNING just to 
>>>>>>>>>> distinguish between that and SELECT), this could be pretty 
>>>>>>>>>> straightforward:
>>>>>>>>>> 
>>>>>>>>>> BEGIN TRANSACTION
>>>>>>>>>>   SELECT v FROM ks.tbl WHERE k=0 AND c=0 AS row1;
>>>>>>>>>>   SELECT v FROM ks.tbl WHERE k=1 AND c=0 AS row2;
>>>>>>>>>>   RETURNING row1.v, row2.v
>>>>>>>>>>   IF row1 NOT EXISTS AND row2.v = 3 THEN
>>>>>>>>>>     INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1);
>>>>>>>>>>   END IF
>>>>>>>>>> COMMIT TRANSACTION
>>>>>>>>>> 
>>>>>>>>>> The NOT EXISTS operator has row1 to work with. One the other hand, 
>>>>>>>>>> w/ the LET syntax and no naming of reads, it's not clear what the 
>>>>>>>>>> best solution would be. Here are a few possibilities:
>>>>>>>>>> 
>>>>>>>>>> 1.) Provide a few built-in functions that operate on a               
>>>>>>>>>>                                         whole result row. If we 
>>>>>>>>>> assume a SQL style IS NULL and IS NOT NULL (see my last post here) 
>>>>>>>>>> for operations on particular columns, this probably eliminates the 
>>>>>>>>>> need for EXISTS/NOT EXISTS as well.
>>>>>>>>>> 
>>>>>>>>>> BEGIN TRANSACTION
>>>>>>>>>>   LET row1_missing = notExists() FROM ks.tbl WHERE k=0 AND c=0;
>>>>>>>>>>   LET row2_v = v FROM ks.tbl WHERE k=1 AND c=0;
>>>>>>>>>>   SELECT row1_missing, row2_v
>>>>>>>>>>   IF row1_missing AND row2_v = 3 THEN
>>>>>>>>>>     INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1);
>>>>>>>>>>   END IF
>>>>>>>>>> COMMIT TRANSACTION
>>>>>>>>>> 
>>>>>>>>>> 2.) Assign and check the first primary key element to determine 
>>>>>>>>>> whether the row exists.
>>>>>>>>>> 
>>>>>>>>>> BEGIN TRANSACTION
>>>>>>>>>>   LET row1_k = k FROM ks.tbl WHERE k=0 AND c=0;
>>>>>>>>>>   LET row2_v = v FROM ks.tbl WHERE k=1 AND c=0;
>>>>>>>>>>   SELECT row1_k, row2_v
>>>>>>>>>>   IF row1_k IS NULL AND row2_v = 3 THEN
>>>>>>>>>>     INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1);
>>>>>>>>>>   END IF
>>>>>>>>>> COMMIT TRANSACTION
>>>>>>>>>> 
>>>>>>>>>> 3.) Reconsider the LET concept toward something that allows us to 
>>>>>>>>>> explicitly name our reads again.
>>>>>>>>>> 
>>>>>>>>>> BEGIN TRANSACTION
>>>>>>>>>>   WITH (SELECT v FROM ks.tbl WHERE k=0 AND c=0) AS row1;
>>>>>>>>>>   WITH (SELECT v FROM ks.tbl WHERE k=1 AND c=0) AS row2;
>>>>>>>>>>   SELECT row1.v, row2.v
>>>>>>>>>>   IF row1 NOT EXISTS AND row2.v = 3 THEN
>>>>>>>>>>     INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1);
>>>>>>>>>>   END IF
>>>>>>>>>> COMMIT TRANSACTION
>>>>>>>>>> 
>>>>>>>>>> I don't have a strong affinity for any of these, although #1 seems 
>>>>>>>>>> the most awkward.
>>>>>>>>>> 
>>>>>>>>>> Does anyone have any other alternatives? Preference for one of the 
>>>>>>>>>> above options?
>>>>>>>>>> 
>>>>>>>>>> Thanks!
>>>>>>>>>> 
>>>>>>>>>> On Fri, Jul 22, 2022 at 11:21 AM Caleb Rackliffe 
>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>> Avi brought up an interesting point around NULLness checking in 
>>>>>>>>>>> CASSANDRA-17762...
>>>>>>>>>>> 
>>>>>>>>>>>> In SQL, any comparison with NULL is NULL, which is interpreted as 
>>>>>>>>>>>> FALSE in a condition. To test for NULLness, you use IS NULL or IS 
>>>>>>>>>>>> NOT NULL. But LWT uses IF col = NULL as a NULLness test. This is 
>>>>>>>>>>>> likely to confuse people coming from SQL and hamper attempts to 
>>>>>>>>>>>> extend the dialect.
>>>>>>>>>>> 
>>>>>>>>>>> We can leave that Jira open to address what to do in the legacy LWT 
>>>>>>>>>>> case, but I'd support a SQL-congruent syntax here (IS NULL or IS 
>>>>>>>>>>> NOT NULL), where we have something closer to a blank slate.
>>>>>>>>>>> 
>>>>>>>>>>> Thoughts?
>>>>>>>>>>> 
>>>>>>>>>>> On Thu, Jun 30, 2022 at 6:25 PM Abe Ratnofsky <[email protected]> wrote:
>>>>>>>>>>>> The new syntax looks great, and I’m really excited to see this 
>>>>>>>>>>>> coming together.
>>>>>>>>>>>> 
>>>>>>>>>>>> One piece of feedback on the proposed syntax is around the use of 
>>>>>>>>>>>> “=“ as a declaration in addition to its current use as an equality 
>>>>>>>>>>>> operator in a WHERE clause and an assignment operator in an UPDATE:
>>>>>>>>>>>> 
>>>>>>>>>>>> BEGIN TRANSACTION
>>>>>>>>>>>>   LET car_miles = miles_driven, car_is_running = is_running FROM 
>>>>>>>>>>>> cars WHERE model=’pinto’
>>>>>>>>>>>>   LET user_miles = miles_driven FROM users WHERE name=’blake’
>>>>>>>>>>>>   SELECT something else from some other table
>>>>>>>>>>>>   IF NOT car_is_running THEN ABORT
>>>>>>>>>>>>   UPDATE users SET miles_driven = user_miles + 30 WHERE 
>>>>>>>>>>>> name='blake';
>>>>>>>>>>>>   UPDATE cars SET miles_driven = car_miles + 30 WHERE 
>>>>>>>>>>>> model='pinto';
>>>>>>>>>>>> COMMIT TRANSACTION
>>>>>>>>>>>> 
>>>>>>>>>>>> This is supported in languages like PL/pgSQL, but in a normal SQL 
>>>>>>>>>>>> query kind of local declaration is often expressed as an alias 
>>>>>>>>>>>> (SELECT col AS new_col), subquery alias (SELECT col) t, or common 
>>>>>>>>>>>> table expression (WITH t AS (SELECT col)).
>>>>>>>>>>>> 
>>>>>>>>>>>> Here’s an example of an alternative to the proposed syntax that 
>>>>>>>>>>>> I’d find more readable:
>>>>>>>>>>>> 
>>>>>>>>>>>> BEGIN TRANSACTION
>>>>>>>>>>>>   WITH car_miles, car_is_running AS (SELECT miles_driven, 
>>>>>>>>>>>> is_running FROM cars WHERE model=’pinto’),
>>>>>>>>>>>>    user_miles AS (SELECT miles_driven FROM users WHERE 
>>>>>>>>>>>> name=’blake’)
>>>>>>>>>>>>   IF NOT car_is_running THEN ABORT
>>>>>>>>>>>>   UPDATE users SET miles_driven = user_miles + 30 WHERE 
>>>>>>>>>>>> name='blake';
>>>>>>>>>>>>   UPDATE cars SET miles_driven = car_miles + 30 WHERE 
>>>>>>>>>>>> model='pinto';
>>>>>>>>>>>> COMMIT TRANSACTION
>>>>>>>>>>>> 
>>>>>>>>>>>> There’s also the option of naming the transaction like a subquery, 
>>>>>>>>>>>> and supporting LET via AS (this one I’m less sure about but wanted 
>>>>>>>>>>>> to propose anyway):
>>>>>>>>>>>> 
>>>>>>>>>>>> BEGIN TRANSACTION t1
>>>>>>>>>>>>   SELECT miles_driven AS t1.car_miles, is_running AS 
>>>>>>>>>>>> t1.car_is_running FROM cars WHERE model=’pinto’;
>>>>>>>>>>>>   SELECT miles_driven AS t1.user_miles FROM users WHERE 
>>>>>>>>>>>> name=’blake’;
>>>>>>>>>>>>   IF NOT car_is_running THEN ABORT
>>>>>>>>>>>>   UPDATE users SET miles_driven = user_miles + 30 WHERE 
>>>>>>>>>>>> name='blake';
>>>>>>>>>>>>   UPDATE cars SET miles_driven = car_miles + 30 WHERE 
>>>>>>>>>>>> model='pinto';
>>>>>>>>>>>> COMMIT TRANSACTION
>>>>>>>>>>>> 
>>>>>>>>>>>> This also has the benefit of resolving ambiguity in case of naming 
>>>>>>>>>>>> conflicts with existing (or future) column names.
>>>>>>>>>>>> 
>>>>>>>>>>>> --
>>>>>>>>>>>> Abe

Re: CEP-15 multi key transaction syntax

Reply via email to