The replies got trashed pretty badly in the responses. When you say: "Agree it's better to reuse existing syntax than invent new syntax."
Which syntax are you referring to? Patrick On Mon, Aug 22, 2022 at 1:36 AM Avi Kivity via dev <dev@cassandra.apache.org> wrote: > Agree it's better to reuse existing syntax than invent new syntax. > > On 8/21/22 16:52, Konstantin Osipov wrote: > > * Avi Kivity via dev <dev@cassandra.apache.org> [22/08/14 15:59]: > > > > MySQL supports SELECT <expr_list> INTO <var_list> FROM ... WHERE > > ... > > > > PostgreSQL supports pretty much the same syntax. > > > > Maybe instead of LET use the ANSI/MySQL/PostgreSQL DECLARE var TYPE and > > MySQL/PostgreSQL SELECT ... INTO? > > > >> On 14/08/2022 01.29, Benedict Elliott Smith wrote: > >>> > >>> I’ll do my best to express with my thinking, as well as how I would > >>> explain the feature to a user. > >>> > >>> My mental model for LET statements is that they are simply SELECT > >>> statements where the columns that are selected become variables > >>> accessible anywhere in the scope of the transaction. That is to say, > you > >>> should be able to run something like s/LET/SELECT and > >>> s/([^=]+)=([^,]+)(,|$)/\2 AS \1\3/g on the columns of a LET statement > >>> and produce a valid SELECT statement, and vice versa. Both should > >>> perform identically. > >>> > >>> e.g. > >>> SELECT pk AS key, v AS value FROM table > >>> > >>> => > >>> LET key = pk, value = v FROM table > >> > >> "=" is a CQL/SQL operator. Cassandra doesn't support it yet, but SQL > >> supports selecting comparisons: > >> > >> > >> $ psql > >> psql (14.3) > >> Type "help" for help. > >> > >> avi=# SELECT 1 = 2, 3 = 3, NULL = NULL; > >> ?column? | ?column? | ?column? > >> ----------+----------+---------- > >> f | t | > >> (1 row) > >> > >> > >> Using "=" as a syntactic element in LET would make SELECT and LET > >> incompatible once comparisons become valid selectors. Unless they become > >> mandatory (and then you'd write "LET q = a = b" if you wanted to select > a > >> comparison). > >> > >> > >> I personally prefer the nested query syntax: > >> > >> > >> LET (a, b, c) = (SELECT foo, bar, x+y FROM ...); > >> > >> > >> So there aren't two similar-but-not-quite-the-same syntaxes. SELECT is > >> immediately recognizable by everyone as a query, LET is not. > >> > >> > >>> Identical form, identical behaviour. Every statement should be directly > >>> translatable with some simple text manipulation. > >>> > >>> We can then make this more powerful for users by simply expanding > SELECT > >>> statements, e.g. by permitting them to declare constants and tuples in > >>> the column results. In this scheme LET x = * is simply syntactic sugar > >>> for LET x = (pk, ck, field1, …) This scheme then supports options 2, 4 > >>> and 5 all at once, consistently alongside each other. > >>> > >>> Option 6 is in fact very similar, but is strictly less flexible for the > >>> user as they have no way to declare multiple scalar variables without > >>> scoping them inside a tuple. > >>> > >>> e.g. > >>> LET key = pk, value = v FROM table > >>> IF key > 1 AND value > 1 THEN... > >>> > >>> => > >>> LET row = SELECT pk AS key, v AS value FROM table > >>> IF row.key > 1 AND row.value > 1 THEN… > >>> > >>> However, both are expressible in the existing proposal, as if you > prefer > >>> this naming scheme you can simply write > >>> > >>> LET row = (pk AS key, v AS value) FROM table > >>> IF row.key > 1 AND row.value > 1 THEN… > >>> > >>> With respect to auto converting single column results to a scalar, we > do > >>> need a way for the user to say they care whether the row was null or > the > >>> column. I think an implicit conversion here could be surprising. > However > >>> we could implement tuple expressions anyway and let the user explicitly > >>> declare v as a tuple as Caleb has suggested for the existing proposal > as > >>> well. > >>> > >>> Assigning constants or other values not selected from a table would > also > >>> be a little clunky: > >>> > >>> LET v1 = someFunc(), v2 = someOtherFunc(?) > >>> IF v1 > 1 AND v2 > 1 THEN… > >>> > >>> => > >>> LET row = SELECT someFunc() AS v1, someOtherFunc(?) AS v2 > >>> IF row.v1 > 1 AND row.v2 > 1 THEN... > >>> > >>> That said, the proposals are /close/ to identical, it is just slightly > >>> more verbose and slightly less flexible. > >>> > >>> Which one would be most intuitive to users is hard to predict. It might > >>> be that Option 6 would be slightly easier, but I’m unsure if there > would > >>> be a huge difference. > >>> > >>> > >>>> On 13 Aug 2022, at 16:59, Patrick McFadin <pmcfa...@gmail.com> wrote: > >>>> > >>>> I'm really happy to see CEP-15 getting closer to a final > >>>> implementation. I'm going to walk through my reasoning for your > >>>> proposals wrt trying to explain this to somebody new. > >>>> > >>>> Looking at all the options, the first thing that comes up for me is > >>>> the Cassandra project's complicated relationship with NULL. We have > >>>> prior art with EXISTS/NOT EXISTS when creating new tables. IS > >>>> NULL/IS NOT NULL is used in materialized views similarly to > >>>> proposals 2,4 and 5. > >>>> > >>>> CREATE MATERIALIZED VIEW [ IF NOT EXISTS ] [keyspace_name.]view_name > >>>> AS SELECT [ (column_list) ] > >>>> FROM [keyspace_name.]table_name > >>>> [ WHERE column_name IS NOT NULL > >>>> [ AND column_name IS NOT NULL ... ] ] > >>>> [ AND relation [ AND ... ] ] > >>>> PRIMARY KEY ( column_list ) > >>>> [ WITH [ table_properties ] > >>>> [ [ AND ] CLUSTERING ORDER BY (cluster_column_name order_option) ] > ] ; > >>>> > >>>> Based on that, I believe 1 and 3 would just confuse users, so -1 on > >>>> those. > >>>> > >>>> Trying to explain the difference between row and column operations > >>>> with LET, I can't see the difference between a row and column in #2. > >>>> > >>>> #4 introduces a boolean instead of column names and just adds more > >>>> syntax. > >>>> > >>>> #5 is verbose and, in my opinion, easier to reason when writing a > >>>> query. Thinking top down, I need to know if these exact rows and/or > >>>> column values exist before changing them, so I'll define them first. > >>>> Then I'll iterate over the state I created in my actual changes so I > >>>> know I'm changing precisely what I want. > >>>> > >>>> #5 could use a bit more to be clearer to somebody who doesn't write > >>>> CQL queries daily and wouldn't require memorizing subtle > >>>> differences. It should be similar to all the other syntax, so > >>>> learning a little about CQL will let you move into more without > >>>> completely re-learning the new syntax. > >>>> > >>>> So I propose #6) > >>>> BEGIN TRANSACTION > >>>> LET row1 = SELECT * FROM ks.tbl WHERE k=0 AND c=0; <-- * selects all > >>>> columns > >>>> LET row2 = SELECT v FROM ks.tbl WHERE k=1 AND c=0; > >>>> SELECT row1, row2 > >>>> IF row1 IS NULL AND row2.v = 3 THEN > >>>> INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1); > >>>> END IF > >>>> COMMIT TRANSACTION > >>>> > >>>> I added the SELECT in the LET just so it's straightforward, you are > >>>> reading, and it's just like doing a regular select, but you are > >>>> assigning it to a variable. > >>>> > >>>> I removed the confusing 'row1.v'and replaced it with 'row1'I can't > >>>> see why you would need the '.v'vs having the complete variable I > >>>> created in the statement above. > >>>> > >>>> EOL > >>>> > >>>> Patrick > >>>> > >>>> On Thu, Aug 11, 2022 at 1:37 PM Caleb Rackliffe > >>>> <calebrackli...@gmail.com> wrote: > >>>> > >>>> ...and one more option... > >>>> > >>>> 5.) Introduce tuple assignments, removing all ambiguity around > >>>> row vs. column operations. > >>>> > >>>> BEGIN TRANSACTION > >>>> LET row1 = * FROM ks.tbl WHERE k=0 AND c=0; <-- * selects all > >>>> columns > >>>> LET row2 = (v) FROM ks.tbl WHERE k=1 AND c=0; > >>>> SELECT row1.v, row2.v > >>>> IF row1 IS NULL AND row2.v = 3 THEN > >>>> INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1); > >>>> END IF > >>>> COMMIT TRANSACTION > >>>> > >>>> > >>>> > >>>> On Thu, Aug 11, 2022 at 12:55 PM Caleb Rackliffe > >>>> <calebrackli...@gmail.com> wrote: > >>>> > >>>> via Benedict, here is a 4th option: > >>>> > >>>> 4.) Similar to #2, but don't rely on the key element being > NULL. > >>>> > >>>> If the read returns no result, x effectively becomes NULL. > >>>> Otherwise, it remains true/NOT NULL. > >>>> > >>>> BEGIN TRANSACTION > >>>> LET x = true FROM ks.tbl WHERE k=0 AND c=0; > >>>> LET row2_v = v FROM ks.tbl WHERE k=1 AND c=0; > >>>> SELECT x, row2_v > >>>> IF x IS NULL AND row2_v = 3 THEN > >>>> INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1); > >>>> END IF > >>>> COMMIT TRANSACTION > >>>> > >>>> On Thu, Aug 11, 2022 at 12:12 PM Caleb Rackliffe > >>>> <calebrackli...@gmail.com> wrote: > >>>> > >>>> Hello again everyone! > >>>> > >>>> I've been working on a prototype > >>>> <https://issues.apache.org/jira/browse/CASSANDRA-17719 > > in > >>>> CASSANDRA-17719 for a grammar that roughly corresponds to > >>>> what we've agreed on in this thread. One thing that isn't > >>>> immediately obvious to me is how the LET syntax handles > >>>> cases where we want to check for the plain existence of a > >>>> row in IF. For example, in this hybrid of the originally > >>>> proposed syntax and something more like what we've agreed > >>>> on (and the RETURNING just to distinguish between that > >>>> and SELECT), this could be pretty straightforward: > >>>> > >>>> BEGIN TRANSACTION > >>>> SELECT v FROM ks.tbl WHERE k=0 AND c=0 AS row1; > >>>> SELECT v FROM ks.tbl WHERE k=1 AND c=0 AS row2; > >>>> RETURNING row1.v, row2.v > >>>> IF row1 NOT EXISTS AND row2.v = 3 THEN > >>>> INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1); > >>>> END IF > >>>> COMMIT TRANSACTION > >>>> > >>>> The NOT EXISTS operator has row1 to work with. One the > >>>> other hand, w/ the LET syntax and no naming of reads, > >>>> it's not clear what the best solution would be. Here are > >>>> a few possibilities: > >>>> > >>>> 1.) Provide a few built-in functions that operate on a > >>>> whole result row. If we assume a SQL style IS NULL and IS > >>>> NOT NULL (see my last post here) for operations on > >>>> particular columns, this probably eliminates the need for > >>>> EXISTS/NOT EXISTS as well. > >>>> > >>>> BEGIN TRANSACTION > >>>> LET row1_missing = notExists() FROM ks.tbl WHERE k=0 > >>>> AND c=0; > >>>> LET row2_v = v FROM ks.tbl WHERE k=1 AND c=0; > >>>> SELECT row1_missing, row2_v > >>>> IF row1_missing AND row2_v = 3 THEN > >>>> INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1); > >>>> END IF > >>>> COMMIT TRANSACTION > >>>> > >>>> 2.) Assign and check the first primary key element to > >>>> determine whether the row exists. > >>>> > >>>> BEGIN TRANSACTION > >>>> LET row1_k = k FROM ks.tbl WHERE k=0 AND c=0; > >>>> LET row2_v = v FROM ks.tbl WHERE k=1 AND c=0; > >>>> SELECT row1_k, row2_v > >>>> IF row1_k IS NULL AND row2_v = 3 THEN > >>>> INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1); > >>>> END IF > >>>> COMMIT TRANSACTION > >>>> > >>>> 3.) Reconsider the LET concept toward something that > >>>> allows us to explicitly name our reads again. > >>>> > >>>> BEGIN TRANSACTION > >>>> WITH (SELECT v FROM ks.tbl WHERE k=0 AND c=0) AS row1; > >>>> WITH (SELECT v FROM ks.tbl WHERE k=1 AND c=0) AS row2; > >>>> SELECT row1.v, row2.v > >>>> IF row1 NOT EXISTS AND row2.v = 3 THEN > >>>> INSERT INTO ks.tbl (k, c, v) VALUES (0, 0, 1); > >>>> END IF > >>>> COMMIT TRANSACTION > >>>> > >>>> I don't have a strong affinity for any of these, although > >>>> #1 seems the most awkward. > >>>> > >>>> Does anyone have any other alternatives? Preference for > >>>> one of the above options? > >>>> > >>>> Thanks! > >>>> > >>>> On Fri, Jul 22, 2022 at 11:21 AM Caleb Rackliffe > >>>> <calebrackli...@gmail.com> wrote: > >>>> > >>>> Avi brought up an interesting point around NULLness > >>>> checking inCASSANDRA-17762 > >>>> < > https://issues.apache.org/jira/browse/CASSANDRA-17762>... > >>>> > >>>> In SQL, any comparison with NULL is NULL, which > >>>> is interpreted as FALSE in a condition. To test > >>>> for NULLness, you use IS NULL or IS NOT NULL. But > >>>> LWT uses IF col = NULL as a NULLness test. This > >>>> is likely to confuse people coming from SQL and > >>>> hamper attempts to extend the dialect. > >>>> > >>>> > >>>> We can leave that Jira open to address what to do in > >>>> the legacy LWT case, but I'd support a SQL-congruent > >>>> syntax here (IS NULL or IS NOT NULL), where we have > >>>> something closer to a blank slate. > >>>> > >>>> Thoughts? > >>>> > >>>> On Thu, Jun 30, 2022 at 6:25 PM Abe Ratnofsky > >>>> <a...@aber.io> wrote: > >>>> > >>>> The new syntax looks great, and I’m really > >>>> excited to see this coming together. > >>>> > >>>> One piece of feedback on the proposed syntax is > >>>> around the use of “=“ as a declaration in > >>>> addition to its current use as an equality > >>>> operator in a WHERE clause and an assignment > >>>> operator in an UPDATE: > >>>> > >>>> BEGIN TRANSACTION > >>>> LET car_miles = miles_driven, > >>>> car_is_running = is_running FROM cars WHERE > >>>> model=’pinto’ > >>>> LET user_miles = miles_driven FROM users > >>>> WHERE name=’blake’ > >>>> SELECT something else from some other table > >>>> IF NOT car_is_running THEN ABORT > >>>> UPDATE users SET miles_driven = user_miles > >>>> + 30 WHERE name='blake'; > >>>> UPDATE cars SET miles_driven = car_miles + > >>>> 30 WHERE model='pinto'; > >>>> COMMIT TRANSACTION > >>>> > >>>> This is supported in languages like PL/pgSQL, but > >>>> in a normal SQL query kind of local declaration > >>>> is often expressed as an alias (SELECT col AS > >>>> new_col), subquery alias (SELECT col) t, or > >>>> common table expression (WITH t AS (SELECT col)). > >>>> > >>>> Here’s an example of an alternative to the > >>>> proposed syntax that I’d find more readable: > >>>> > >>>> BEGIN TRANSACTION > >>>> WITH car_miles, car_is_running AS (SELECT > >>>> miles_driven, is_running FROM cars WHERE > >>>> model=’pinto’), > >>>> user_miles AS (SELECT miles_driven FROM users > >>>> WHERE name=’blake’) > >>>> IF NOT car_is_running THEN ABORT > >>>> UPDATE users SET miles_driven = user_miles > >>>> + 30 WHERE name='blake'; > >>>> UPDATE cars SET miles_driven = car_miles + 30 > >>>> WHERE model='pinto'; > >>>> COMMIT TRANSACTION > >>>> > >>>> There’s also the option of naming the transaction > >>>> like a subquery, and supporting LET via AS (this > >>>> one I’m less sure about but wanted to propose > >>>> anyway): > >>>> > >>>> BEGIN TRANSACTION t1 > >>>> SELECT miles_driven AS t1.car_miles, > >>>> is_running AS t1.car_is_running FROM cars > >>>> WHERE model=’pinto’; > >>>> SELECT miles_driven AS t1.user_miles FROM > >>>> users WHERE name=’blake’; > >>>> IF NOT car_is_running THEN ABORT > >>>> UPDATE users SET miles_driven = user_miles > >>>> + 30 WHERE name='blake'; > >>>> UPDATE cars SET miles_driven = car_miles + > >>>> 30 WHERE model='pinto'; > >>>> COMMIT TRANSACTION > >>>> > >>>> This also has the benefit of resolving ambiguity > >>>> in case of naming conflicts with existing (or > >>>> future) column names. > >>>> > >>>> -- > >>>> Abe > >>>> > >