On 2023-Dec-05, Amit Langote wrote: > I've attempted to trim down the JSON_TABLE grammar (0004), but this is > all I've managed so far. Among other things, I couldn't refactor the > grammar to do away with the following: > > +%nonassoc NESTED > +%left PATH
To recap, the reason we're arguing about this is that this creates two new precedence classes, which are higher than everything else. Judging by the discussios in thread [1], this is not acceptable. Without either those new classes or the two hacks I describe below, the grammar has the following shift/reduce conflict: State 6220 2331 json_table_column_definition: NESTED . path_opt Sconst COLUMNS '(' json_table_column_definition_list ')' 2332 | NESTED . path_opt Sconst AS name COLUMNS '(' json_table_column_definition_list ')' 2636 unreserved_keyword: NESTED . PATH shift, and go to state 6286 SCONST reduce using rule 2336 (path_opt) PATH [reduce using rule 2636 (unreserved_keyword)] $default reduce using rule 2636 (unreserved_keyword) path_opt go to state 6287 First, while the grammar uses "NESTED path_opt" in the relevant productions, I noticed that there's no test that uses NESTED without PATH, so if we break that case, we won't notice. I propose we remove the PATH keyword from one of the tests in jsonb_sqljson.sql in order to make sure the grammar continues to work after whatever hacking we do: diff --git a/src/test/regress/expected/jsonb_sqljson.out b/src/test/regress/expected/jsonb_sqljson.out index 7e8ae6a696..8fd2385cdc 100644 --- a/src/test/regress/expected/jsonb_sqljson.out +++ b/src/test/regress/expected/jsonb_sqljson.out @@ -1548,7 +1548,7 @@ HINT: JSON_TABLE column names must be distinct from one another. SELECT * FROM JSON_TABLE( jsonb 'null', '$[*]' AS p0 COLUMNS ( - NESTED PATH '$' AS p1 COLUMNS ( + NESTED '$' AS p1 COLUMNS ( NESTED PATH '$' AS p11 COLUMNS ( foo int ), NESTED PATH '$' AS p12 COLUMNS ( bar int ) ), diff --git a/src/test/regress/sql/jsonb_sqljson.sql b/src/test/regress/sql/jsonb_sqljson.sql index ea5db88b40..ea9b4ff8b6 100644 --- a/src/test/regress/sql/jsonb_sqljson.sql +++ b/src/test/regress/sql/jsonb_sqljson.sql @@ -617,7 +617,7 @@ SELECT * FROM JSON_TABLE( SELECT * FROM JSON_TABLE( jsonb 'null', '$[*]' AS p0 COLUMNS ( - NESTED PATH '$' AS p1 COLUMNS ( + NESTED '$' AS p1 COLUMNS ( NESTED PATH '$' AS p11 COLUMNS ( foo int ), NESTED PATH '$' AS p12 COLUMNS ( bar int ) ), Having done that, AFAICS there are two possible fixes for the grammar. One is to keep the idea of assigning precedence explicitly to these keywords, but do something less hackish -- we can put NESTED together with UNBOUNDED, and classify PATH in the IDENT group. This requires no further changes. This would make NESTED PATH follow the same rationale as UNBOUNDED FOLLOWING / UNBOUNDED PRECEDING. Here's is a preliminary patch for that (the large comment above needs to be updated.) diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y index c15fcf2eb2..1493ac7580 100644 --- a/src/backend/parser/gram.y +++ b/src/backend/parser/gram.y @@ -887,9 +887,9 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query); * json_predicate_type_constraint and json_key_uniqueness_constraint_opt * productions (see comments there). */ -%nonassoc UNBOUNDED /* ideally would have same precedence as IDENT */ +%nonassoc UNBOUNDED NESTED /* ideally would have same precedence as IDENT */ %nonassoc IDENT PARTITION RANGE ROWS GROUPS PRECEDING FOLLOWING CUBE ROLLUP - SET KEYS OBJECT_P SCALAR VALUE_P WITH WITHOUT + SET KEYS OBJECT_P SCALAR VALUE_P WITH WITHOUT PATH %left Op OPERATOR /* multi-character ops and user-defined operators */ %left '+' '-' %left '*' '/' '%' @@ -911,8 +911,6 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query); */ %left JOIN CROSS LEFT FULL RIGHT INNER_P NATURAL -%nonassoc NESTED -%left PATH %% /* The other thing we can do is use the two-token lookahead trick, by declaring %token NESTED_LA and using the parser.c code to replace NESTED with NESTED_LA when it is followed by PATH. This doesn't require assigning precedence to anything. We do need to expand the two rules that have "NESTED opt_path Sconst" to each be two rules, one for "NESTED_LA PATH Sconst" and another for "NESTED Sconst". So the opt_path production goes away. This preliminary patch does that. (I did not touch the ecpg grammar, but it needs an update too.) diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y index c15fcf2eb2..8e4c1d4ebe 100644 --- a/src/backend/parser/gram.y +++ b/src/backend/parser/gram.y @@ -817,7 +817,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query); * FORMAT_LA, NULLS_LA, WITH_LA, and WITHOUT_LA are needed to make the grammar * LALR(1). */ -%token FORMAT_LA NOT_LA NULLS_LA WITH_LA WITHOUT_LA +%token FORMAT_LA NESTED_LA NOT_LA NULLS_LA WITH_LA WITHOUT_LA /* * The grammar likewise thinks these tokens are keywords, but they are never @@ -911,8 +911,6 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query); */ %left JOIN CROSS LEFT FULL RIGHT INNER_P NATURAL -%nonassoc NESTED -%left PATH %% /* @@ -16771,7 +16769,7 @@ json_table_column_definition: n->location = @1; $$ = (Node *) n; } - | NESTED path_opt Sconst + | NESTED_LA PATH Sconst COLUMNS '(' json_table_column_definition_list ')' { JsonTableColumn *n = makeNode(JsonTableColumn); @@ -16783,7 +16781,19 @@ json_table_column_definition: n->location = @1; $$ = (Node *) n; } - | NESTED path_opt Sconst AS name + | NESTED Sconst + COLUMNS '(' json_table_column_definition_list ')' + { + JsonTableColumn *n = makeNode(JsonTableColumn); + + n->coltype = JTC_NESTED; + n->pathspec = $2; + n->pathname = NULL; + n->columns = $5; + n->location = @1; + $$ = (Node *) n; + } + | NESTED_LA PATH Sconst AS name COLUMNS '(' json_table_column_definition_list ')' { JsonTableColumn *n = makeNode(JsonTableColumn); @@ -16795,6 +16805,19 @@ json_table_column_definition: n->location = @1; $$ = (Node *) n; } + | NESTED Sconst AS name + COLUMNS '(' json_table_column_definition_list ')' + { + JsonTableColumn *n = makeNode(JsonTableColumn); + + n->coltype = JTC_NESTED; + n->pathspec = $2; + n->pathname = $4; + n->columns = $7; + n->location = @1; + $$ = (Node *) n; + } + ; json_table_column_path_specification_clause_opt: @@ -16802,11 +16825,6 @@ json_table_column_path_specification_clause_opt: | /* EMPTY */ { $$ = NULL; } ; -path_opt: - PATH { } - | /* EMPTY */ { } - ; - json_table_plan_clause_opt: PLAN '(' json_table_plan ')' { $$ = $3; } | PLAN DEFAULT '(' json_table_default_plan_choices ')' diff --git a/src/backend/parser/parser.c b/src/backend/parser/parser.c index e17c310cc1..e3092f2c3e 100644 --- a/src/backend/parser/parser.c +++ b/src/backend/parser/parser.c @@ -138,6 +138,7 @@ base_yylex(YYSTYPE *lvalp, YYLTYPE *llocp, core_yyscan_t yyscanner) switch (cur_token) { case FORMAT: + case NESTED: cur_token_length = 6; break; case NOT: @@ -204,6 +205,16 @@ base_yylex(YYSTYPE *lvalp, YYLTYPE *llocp, core_yyscan_t yyscanner) } break; + case NESTED: + /* Replace NESTED by NESTED_LA if it's followed by PATH */ + switch (next_token) + { + case PATH: + cur_token = NESTED_LA; + break; + } + break; + case NOT: /* Replace NOT by NOT_LA if it's followed by BETWEEN, IN, etc */ switch (next_token) I don't know which of the two "fixes" is less bad. Like Amit, I was not able to find a solution to the problem by merely attaching precedences to rules. (I did not try to mess with the precedence of unreserved_keyword, because I'm pretty sure that would not be a good solution even if I could make it work.) [1] https://postgr.es/m/CADT4RqBPdbsZW7HS1jJP319TMRHs1hzUiP=irjyr6uqghcr...@mail.gmail.com -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/