Re: Review Request : UPDATE statement support in SQL++ for AsterixDB

Abhishek Jindal Wed, 30 Oct 2024 10:31:18 -0700

Update:

Hi Mike,


I just tested the changes. Can confirm that we can accommodate a comma (,) in 
between the changes
In a sequence.

Regards,
Abhishek

On 10/25/24, 11:47 AM, "Mike Carey" <[email protected] 
<mailto:[email protected]>> wrote:


Hmmmm. I'm convinceable but not yet convinced. I don't think that the 
two SET statements with a comma at the end of the first one is really a 
clarity problem - not confusing I don't think - as the second separate 
SET keyword marks the start of a new change in a clear way. So if this 
is the ONLY problem, and if allowing comma as the change separator 
otherwise works, w/o ( )'s, I think that would be (much) cleaner to 
write and read. I do totally agree about not using ";" here, instead 
keeping that for statement termination. (We don't want to have to do 
MySQL-like terminator re-definition to do updates!)


To me the open question now is - if we DID adopt SET and comma (followed 
by UPDATE or INSERT or DELETE) as separators for changes - would the 
grammar work and not be ambiguous?


Can we explore the viability of those changes and see if it's a problem 
if we do use commas?


Cheers,


Mike


On 10/25/24 9:54 AM, Abhishek Jindal wrote:
> I'm dumping my thoughts here; trying to use process of elimination to reach a 
> consensus on the final grammar:
>
> * Initially thought about using semi-colon (;) for separating change 
> productions.
> However, semi-colon should only be used to indicate end-of-statement. This 
> leaves us with
> Using either comma (',') or closed parenthesis ')' to mark end of change 
> production.
>
> * We cannot enforce a rule stating - All change productions need to be 
> enclosed within open-close
> parenthesis pair. This will make the UPDATE statement incompatible with 
> traditional SQL.
> In traditional SQL a SetClause is not enclosed within parenthesis and we want 
> our query parser to accept
> such a statement.
>
> * However, assuming no parenthesis in any change production, the presence of 
> comma (,) can lead to ambiguous nature.
> Example:
>
> UPDATE sampleAnalytics.Commerce.customers as c
> SET c.name = "Abhishek", c.rating = c.rating + 500,
> SET c.is_blacklisted = (c.rating < 200)
>
> This statement is very confusing to read - The purpose of 2nd comma in the 
> statement is unclear to the user.
>
> * To eliminate this ambiguous nature of a comma, I believe this character 
> should be left out for its sole use of separating SetElements
> in a SetClause.
>
> * We also cannot simply start with a new change production without a 
> delimiter. This will cause issues with the parser misidentifying
> recursive change productions as a new change. For example:
>
> UPDATE sampleAnalytics.Commerce.orders as o
> UPDATE o.items as item
> SET item.total = item.qty * item.price
> WHERE o.orderno = 1006;
>
> If we are not using any delimiter to mark end-of-change, the parser will 
> wrongly identify 2 change productions here:
> 1. UPDATE o.items as item
> 2. SET item.total = item.qty * item.price
>
> This obviously is incorrect.
>
> * I believe this leaves us with only 1 solution (comments are welcome __ ):
> Always enclose a change-production within parenthesis '(' and ')'.
> This obviously is not valid if the first change in the production list is a 
> SET clause - This is for backward compatibility with SQL.
>
> ----
>
> +1 to the idea of 'AT INDEX' in INSERT INTO and DELETE FROM nested clauses.
>
> Regards,
>
> Abhishek
>
>
>
> On 10/22/24, 3:40 PM, "Mike Carey" <[email protected] 
> <mailto:[email protected]> <mailto:[email protected] 
> <mailto:[email protected]>>> wrote:
>
>
> Thanks! Agreed about the AT issue - we should add an optional INDEX
> qualifier there - probably even in the FROM clause's AT clause option
> (which already exists minus the keywork option).
>
>
> We should look into adding commas - that seems like a good change if the
> grammar is amenable to that (which I think it would be). Making the
> parentheses optional might not work - not sure - we can explore that -
> I'm not 100% sure we could do that w/o introducing ambiguity about where
> things start and stop. (But if we can I'd love to ditch the parentheses
> - that was a conservative approach to the potential problem that will
> surely work.)
>
>
> Cheers,
>
>
> Mike
>
>
> On 10/20/24 12:19 PM, Glenn Galvizo wrote:
>> +1 Very much a needed feature!
>>
>> - For inserting / modifying items at arrays, i think it might help to have 
>> another token after the ‘AT’ to denote that this is a position (it might 
>> just be me, but ‘AT 1’ seems a little too vague). Maybe ‘AT INDEX 1’? (given 
>> that INDEX is already a reserved word?)
>> - The Change production seems like it should be separated with a comma (to 
>> really hammer in the point that this is a sequence) or even a semicolon if 
>> we want to make this more PL/SQL-like. It could also an opportunity to make 
>> the parenthesis optional, if you want to go down that route.
>>
>> Other than those two minor things, I like it!
>>
>> Best,
>> Glenn
>>
>>> On Oct 20, 2024, at 10:33, Mike Carey<[email protected] 
>>> <mailto:[email protected]> <mailto:[email protected] 
>>> <mailto:[email protected]>>> wrote:
>>>
>>> +1 for this (obviously, since I am on it). FYI, we have also run our UPDATE 
>>> user model and syntax by Yannis P (father of SQL++) and Don C (father of 
>>> SQL) for their input prior to posting this APE. :-) We've needed this 
>>> feature for quite some time in order to conveniently express small(-ish) 
>>> changes to arbitrary (possibly large) schema-less documents.
>>>
>>> Discussion welcome!
>>>
>>> Cheers,
>>>
>>> Mike
>>>
>>>> On 10/18/24 3:18 PM, Abhishek Jindal wrote:
>>>> Hi All,
>>>>
>>>> I'm initiating a discussion thread proposing the SQL++ UPDATE statement in 
>>>> AsterixDB.
>>>> *Feature:* Adding support for SQL++ UPDATE statement.
>>>> *Details:* AsterixDB currently does not support UPDATE operations without 
>>>> having
>>>> to pass an entire new object to replace an existing record in a collection.
>>>> The following proposal discusses syntax and semantics of the UPDATE 
>>>> statement as part of
>>>> SQL++ for AsterixDB.
>>>>
>>>> We plan to implement this feature by rewriting the UPDATE statement into 
>>>> its equivalent
>>>> UPSERT form, allowing us to reuse the existing LSM-tree UPSERT machinery 
>>>> to handle the transformed incoming record.
>>>>
>>>> To apply transformations to an incoming record, we employ the following 
>>>> approach:
>>>>
>>>> 1. We recursively traverse the hierarchy of transformations as specified 
>>>> by the user in the query.
>>>> 2. At each hierarchical level, we rewrite the transformation to the 
>>>> equivalent record-merge() built-in function.
>>>> 3. These rewritten record-merge() transformations are then combined in a 
>>>> bottom-up manner, finally producing the final transformation function.
>>>>
>>>> APE 
>>>> :https://cwiki.apache.org/confluence/display/ASTERIXDB/APE+9%3A+UPDATE+Statement
>>>>  
>>>> <https://cwiki.apache.org/confluence/display/ASTERIXDB/APE+9%3A+UPDATE+Statement>
>>>>  
>>>> <https://cwiki.apache.org/confluence/display/ASTERIXDB/APE+9%3A+UPDATE+Statement>
>>>>  
>>>> <https://cwiki.apache.org/confluence/display/ASTERIXDB/APE+9%3A+UPDATE+Statement&gt;>

Re: Review Request : UPDATE statement support in SQL++ for AsterixDB

Reply via email to