On 2023-Dec-05, Amit Langote wrote:

> I've attempted to trim down the JSON_TABLE grammar (0004), but this is
> all I've managed so far.  Among other things, I couldn't refactor the
> grammar to do away with the following:
> 
> +%nonassoc  NESTED
> +%left      PATH

To recap, the reason we're arguing about this is that this creates two
new precedence classes, which are higher than everything else.  Judging
by the discussios in thread [1], this is not acceptable.  Without either
those new classes or the two hacks I describe below, the grammar has the
following shift/reduce conflict:

State 6220

  2331 json_table_column_definition: NESTED . path_opt Sconst COLUMNS '(' 
json_table_column_definition_list ')'
  2332                             | NESTED . path_opt Sconst AS name COLUMNS 
'(' json_table_column_definition_list ')'
  2636 unreserved_keyword: NESTED .

    PATH  shift, and go to state 6286

    SCONST    reduce using rule 2336 (path_opt)
    PATH      [reduce using rule 2636 (unreserved_keyword)]
    $default  reduce using rule 2636 (unreserved_keyword)

    path_opt  go to state 6287



First, while the grammar uses "NESTED path_opt" in the relevant productions, I
noticed that there's no test that uses NESTED without PATH, so if we break that
case, we won't notice.  I propose we remove the PATH keyword from one of
the tests in jsonb_sqljson.sql in order to make sure the grammar
continues to work after whatever hacking we do:

diff --git a/src/test/regress/expected/jsonb_sqljson.out 
b/src/test/regress/expected/jsonb_sqljson.out
index 7e8ae6a696..8fd2385cdc 100644
--- a/src/test/regress/expected/jsonb_sqljson.out
+++ b/src/test/regress/expected/jsonb_sqljson.out
@@ -1548,7 +1548,7 @@ HINT:  JSON_TABLE column names must be distinct from one 
another.
 SELECT * FROM JSON_TABLE(
        jsonb 'null', '$[*]' AS p0
        COLUMNS (
-               NESTED PATH '$' AS p1 COLUMNS (
+               NESTED '$' AS p1 COLUMNS (
                        NESTED PATH '$' AS p11 COLUMNS ( foo int ),
                        NESTED PATH '$' AS p12 COLUMNS ( bar int )
                ),
diff --git a/src/test/regress/sql/jsonb_sqljson.sql 
b/src/test/regress/sql/jsonb_sqljson.sql
index ea5db88b40..ea9b4ff8b6 100644
--- a/src/test/regress/sql/jsonb_sqljson.sql
+++ b/src/test/regress/sql/jsonb_sqljson.sql
@@ -617,7 +617,7 @@ SELECT * FROM JSON_TABLE(
 SELECT * FROM JSON_TABLE(
        jsonb 'null', '$[*]' AS p0
        COLUMNS (
-               NESTED PATH '$' AS p1 COLUMNS (
+               NESTED '$' AS p1 COLUMNS (
                        NESTED PATH '$' AS p11 COLUMNS ( foo int ),
                        NESTED PATH '$' AS p12 COLUMNS ( bar int )
                ),


Having done that, AFAICS there are two possible fixes for the grammar.
One is to keep the idea of assigning precedence explicitly to these
keywords, but do something less hackish -- we can put NESTED together
with UNBOUNDED, and classify PATH in the IDENT group.  This requires no
further changes.  This would make NESTED PATH follow the same rationale
as UNBOUNDED FOLLOWING / UNBOUNDED PRECEDING.  Here's is a preliminary
patch for that (the large comment above needs to be updated.)

diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c15fcf2eb2..1493ac7580 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -887,9 +887,9 @@ static Node *makeRecursiveViewSelect(char *relname, List 
*aliases, Node *query);
  * json_predicate_type_constraint and json_key_uniqueness_constraint_opt
  * productions (see comments there).
  */
-%nonassoc      UNBOUNDED               /* ideally would have same precedence 
as IDENT */
+%nonassoc      UNBOUNDED NESTED                /* ideally would have same 
precedence as IDENT */
 %nonassoc      IDENT PARTITION RANGE ROWS GROUPS PRECEDING FOLLOWING CUBE 
ROLLUP
-                       SET KEYS OBJECT_P SCALAR VALUE_P WITH WITHOUT
+                       SET KEYS OBJECT_P SCALAR VALUE_P WITH WITHOUT PATH
 %left          Op OPERATOR             /* multi-character ops and user-defined 
operators */
 %left          '+' '-'
 %left          '*' '/' '%'
@@ -911,8 +911,6 @@ static Node *makeRecursiveViewSelect(char *relname, List 
*aliases, Node *query);
  */
 %left          JOIN CROSS LEFT FULL RIGHT INNER_P NATURAL
 
-%nonassoc      NESTED
-%left          PATH
 %%
 
 /*


The other thing we can do is use the two-token lookahead trick, by
declaring
%token NESTED_LA
and using the parser.c code to replace NESTED with NESTED_LA when it is
followed by PATH.  This doesn't require assigning precedence to
anything.  We do need to expand the two rules that have "NESTED
opt_path Sconst" to each be two rules, one for "NESTED_LA PATH Sconst"
and another for "NESTED Sconst".  So the opt_path production goes away.
This preliminary patch does that. (I did not touch the ecpg grammar, but
it needs an update too.)

diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c15fcf2eb2..8e4c1d4ebe 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -817,7 +817,7 @@ static Node *makeRecursiveViewSelect(char *relname, List 
*aliases, Node *query);
  * FORMAT_LA, NULLS_LA, WITH_LA, and WITHOUT_LA are needed to make the grammar
  * LALR(1).
  */
-%token         FORMAT_LA NOT_LA NULLS_LA WITH_LA WITHOUT_LA
+%token         FORMAT_LA NESTED_LA NOT_LA NULLS_LA WITH_LA WITHOUT_LA
 
 /*
  * The grammar likewise thinks these tokens are keywords, but they are never
@@ -911,8 +911,6 @@ static Node *makeRecursiveViewSelect(char *relname, List 
*aliases, Node *query);
  */
 %left          JOIN CROSS LEFT FULL RIGHT INNER_P NATURAL
 
-%nonassoc      NESTED
-%left          PATH
 %%
 
 /*
@@ -16771,7 +16769,7 @@ json_table_column_definition:
                                        n->location = @1;
                                        $$ = (Node *) n;
                                }
-                       | NESTED path_opt Sconst
+                       | NESTED_LA PATH Sconst
                                COLUMNS '('     
json_table_column_definition_list ')'
                                {
                                        JsonTableColumn *n = 
makeNode(JsonTableColumn);
@@ -16783,7 +16781,19 @@ json_table_column_definition:
                                        n->location = @1;
                                        $$ = (Node *) n;
                                }
-                       | NESTED path_opt Sconst AS name
+                       | NESTED Sconst
+                               COLUMNS '('     
json_table_column_definition_list ')'
+                               {
+                                       JsonTableColumn *n = 
makeNode(JsonTableColumn);
+
+                                       n->coltype = JTC_NESTED;
+                                       n->pathspec = $2;
+                                       n->pathname = NULL;
+                                       n->columns = $5;
+                                       n->location = @1;
+                                       $$ = (Node *) n;
+                               }
+                       | NESTED_LA PATH Sconst AS name
                                COLUMNS '('     
json_table_column_definition_list ')'
                                {
                                        JsonTableColumn *n = 
makeNode(JsonTableColumn);
@@ -16795,6 +16805,19 @@ json_table_column_definition:
                                        n->location = @1;
                                        $$ = (Node *) n;
                                }
+                       | NESTED Sconst AS name
+                               COLUMNS '('     
json_table_column_definition_list ')'
+                               {
+                                       JsonTableColumn *n = 
makeNode(JsonTableColumn);
+
+                                       n->coltype = JTC_NESTED;
+                                       n->pathspec = $2;
+                                       n->pathname = $4;
+                                       n->columns = $7;
+                                       n->location = @1;
+                                       $$ = (Node *) n;
+                               }
+
                ;
 
 json_table_column_path_specification_clause_opt:
@@ -16802,11 +16825,6 @@ json_table_column_path_specification_clause_opt:
                        | /* EMPTY */                                           
        { $$ = NULL; }
                ;
 
-path_opt:
-                       PATH                                                    
        { }
-                       | /* EMPTY */                                           
{ }
-               ;
-
 json_table_plan_clause_opt:
                        PLAN '(' json_table_plan ')'                    { $$ = 
$3; }
                        | PLAN DEFAULT '(' json_table_default_plan_choices ')'
diff --git a/src/backend/parser/parser.c b/src/backend/parser/parser.c
index e17c310cc1..e3092f2c3e 100644
--- a/src/backend/parser/parser.c
+++ b/src/backend/parser/parser.c
@@ -138,6 +138,7 @@ base_yylex(YYSTYPE *lvalp, YYLTYPE *llocp, core_yyscan_t 
yyscanner)
        switch (cur_token)
        {
                case FORMAT:
+               case NESTED:
                        cur_token_length = 6;
                        break;
                case NOT:
@@ -204,6 +205,16 @@ base_yylex(YYSTYPE *lvalp, YYLTYPE *llocp, core_yyscan_t 
yyscanner)
                        }
                        break;
 
+               case NESTED:
+                       /* Replace NESTED by NESTED_LA if it's followed by PATH 
*/
+                       switch (next_token)
+                       {
+                               case PATH:
+                                       cur_token = NESTED_LA;
+                                       break;
+                       }
+                       break;
+
                case NOT:
                        /* Replace NOT by NOT_LA if it's followed by BETWEEN, 
IN, etc */
                        switch (next_token)


I don't know which of the two "fixes" is less bad.  Like Amit, I was not
able to find a solution to the problem by merely attaching precedences
to rules.  (I did not try to mess with the precedence of
unreserved_keyword, because I'm pretty sure that would not be a good
solution even if I could make it work.)

[1] 
https://postgr.es/m/CADT4RqBPdbsZW7HS1jJP319TMRHs1hzUiP=irjyr6uqghcr...@mail.gmail.com

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/


Reply via email to