Hello Hackers,

I’ve attached a patch to start adding SQL:2023 JSON simplified
accessor support. This allows accessing JSON or JSONB fields using dot
notation (e.g., colname.field.field...), similar to composite types.

Currently, PostgreSQL uses nonstandard syntax like colname->x->y for
JSON and JSONB, and colname['blah'] for JSONB. These existing syntaxes
predate the standard. Oracle already supports the standard dot
notation syntax [1].

The full specification for the JSON simplified accessor format is as
follows:

<JSON simplified accessor> ::=
  <value expression primary> <JSON simplified accessor op chain>
<JSON simplified accessor op chain> ::=
    <JSON simplified accessor op>
  | <JSON simplified accessor op chain> <JSON simplified accessor op>
<JSON simplified accessor op> ::=
    <JSON member accessor>
  | <JSON wildcard member accessor>
  | <JSON array accessor>
  | <JSON wildcard array accessor>
  | <JSON item method>

I’ve implemented the member and array accessors and attached two
alternative patches:

1. v1-0001-Add-JSON-JSONB-simplified-accessor.patch: This patch
enables dot access to JSON object fields and subscript access to
indexed JSON array elements by converting "." and "[]" indirection
into a JSON_QUERY JsonFuncExpr node.

2. v2-0001-Transform-JSON-dot-access-to-arrow-operator.txt: This
alternative patch implements dot access to JSON object fields by
transforming the "." indirection into a "->" operator.

The upside of the v1 patch is that it strictly aligns with the SQL
standard, which specifies that the simplified access is equivalent to:

JSON_QUERY (VEP, 'lax $.JC' WITH CONDITIONAL ARRAY WRAPPER NULL ON
EMPTY NULL ON ERROR)

However, the performance of JSON_QUERY might be suboptimal due to
function call overhead. Therefore, I implemented the v2 alternative
using the "->" operator.

There is some uncertainty about the semantics of conditional array
wrappers. Currently, there is at least one subtle difference between
the "->" operator and JSON_QUERY, as shown:

postgres=# select '{"a": 42}'::json->'a';
 ?column?
----------
 42
(1 row)

postgres=# select json_query('{"a": 42}'::json, 'lax $.a' with
conditional array wrapper null on empty null on error);
 json_query
------------
 [42]
(1 row)

JSON_QUERY encloses the JSON value 42 in brackets, which may be a bug,
as Peter noted [2]. If there are no other semantic differences, we
could implement simple access without using JSON_QUERY to avoid
function call overhead.

I aim to first enable standard dot notation access to JSON object
fields. Both patches implement this, and I’m also open to alternative
approaches.

For subscripting access to jsonb array elements, jsonb already
supports this via the subscripting handler interface. In the v1 patch,
I added json support using JSON_QUERY, but I can easily adapt this for
the v2 patch using the -> operator. I did not leverage the
subscripting handler interface for json because implementing the
fetch/assign functions for json seems challenging for plain text. Let
me know if you have a different approach in mind.

Finally, I have not implemented wildcard or item method accessors yet
and would appreciate input on their necessity.

[1] 
https://docs.oracle.com/en/database/oracle/oracle-database/21/adjsn/simple-dot-notation-access-to-json-data.html#GUID-7249417B-A337-4854-8040-192D5CEFD576
[2] 
https://www.postgresql.org/message-id/8022e067-818b-45d3-8fab-6e0d94d03...@eisentraut.org
From 99ee1cbc45ebd76ce21881bda95933d8a5a104c6 Mon Sep 17 00:00:00 2001
From: Alexandra Wang <alexandra.wang....@gmail.com>
Date: Thu, 15 Aug 2024 02:11:33 -0700
Subject: [PATCH v2] Transform JSON dot access to arrow operator

Enabled dot-notation access to JSON/JSONB object by making a syntatic
sugar for the "->" operator in ParseFuncOrColumn() for arg of
JSON/JSONB type.

JSON array access via subscripting is not yet supported in this patch,
but can be implemented similarly by creating an OpExpr for the
json_array_element "->" operator.

Note that the output of the "->" operators are not wrapped by
brackets, which differs from the SQL standard specification for the
JSON simplified accessor equivalence shown below:

JSON_QUERY (VEP, 'lax $.JC' WITH CONDITIONAL ARRAY WRAPPER NULL ON EMPTY NULL 
ON ERROR)
---
 src/backend/parser/parse_func.c     | 57 +++++++++++++++++++++---
 src/include/catalog/pg_operator.dat |  4 +-
 src/include/parser/parse_type.h     |  1 +
 src/test/regress/expected/json.out  | 67 +++++++++++++++++++++++++++++
 src/test/regress/expected/jsonb.out | 55 +++++++++++++++++++++++
 src/test/regress/sql/json.sql       | 20 +++++++++
 src/test/regress/sql/jsonb.sql      | 17 ++++++++
 7 files changed, 214 insertions(+), 7 deletions(-)

diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 9b23344a3b..431c9883f2 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -33,6 +33,8 @@
 #include "utils/builtins.h"
 #include "utils/lsyscache.h"
 #include "utils/syscache.h"
+#include "parser/parse_oper.h"
+#include "catalog/pg_operator_d.h"
 
 
 /* Possible error codes from LookupFuncNameInternal */
@@ -48,6 +50,8 @@ static void unify_hypothetical_args(ParseState *pstate,
 static Oid     FuncNameAsType(List *funcname);
 static Node *ParseComplexProjection(ParseState *pstate, const char *funcname,
                                                                        Node 
*first_arg, int location);
+static Node *ParseJsonSimplifiedAccessorProjection(ParseState *pstate, const 
char *funcname,
+                                                                       Node 
*first_arg, int location);
 static Oid     LookupFuncNameInternal(ObjectType objtype, List *funcname,
                                                                   int nargs, 
const Oid *argtypes,
                                                                   bool 
include_out_arguments, bool missing_ok,
@@ -226,17 +230,24 @@ ParseFuncOrColumn(ParseState *pstate, List *funcname, 
List *fargs,
                                                   !func_variadic && argnames 
== NIL &&
                                                   list_length(funcname) == 1 &&
                                                   (actual_arg_types[0] == 
RECORDOID ||
-                                                       
ISCOMPLEX(actual_arg_types[0])));
+                                                       
ISCOMPLEX(actual_arg_types[0]) ||
+                                                       
ISJSON(actual_arg_types[0])));
 
        /*
         * If it's column syntax, check for column projection case first.
         */
        if (could_be_projection && is_column)
        {
-               retval = ParseComplexProjection(pstate,
-                                                                               
strVal(linitial(funcname)),
-                                                                               
first_arg,
-                                                                               
location);
+               if (ISJSON(actual_arg_types[0]))
+                       retval = ParseJsonSimplifiedAccessorProjection(pstate,
+                                                                               
        strVal(linitial(funcname)),
+                                                                               
        first_arg,
+                                                                               
        location);
+               else
+                       retval = ParseComplexProjection(pstate,
+                                                                               
        strVal(linitial(funcname)),
+                                                                               
        first_arg,
+                                                                               
        location);
                if (retval)
                        return retval;
 
@@ -1902,6 +1913,42 @@ FuncNameAsType(List *funcname)
        return result;
 }
 
+/*
+ * ParseJsonSimplifiedAccessorProjection -
+ *       handles function calls with a single argument that is of json type.
+ *       If the function call is actually a column projection, return a 
suitably
+ *       transformed expression tree.  If not, return NULL.
+ */
+static Node *
+ParseJsonSimplifiedAccessorProjection(ParseState *pstate, const char *funcname,
+                                                                         Node 
*first_arg, int location)
+{
+       OpExpr  *result;
+       Node    *rexpr;
+       rexpr = (Node *) makeConst(
+                       TEXTOID,
+                       -1,
+                       InvalidOid,
+                       -1,
+                       CStringGetTextDatum(funcname),
+                       false,
+                       false);
+
+       result = makeNode(OpExpr);
+       if (exprType(first_arg) == JSONOID) {
+               result->opno = OID_JSON_OBJECT_FIELD_OP;
+               result->opresulttype = JSONOID;
+       } else {
+               Assert(exprType(first_arg) == JSONBOID);
+               result->opno = OID_JSONB_OBJECT_FIELD_OP;
+               result->opresulttype = JSONBOID;
+       }
+       result->opfuncid = get_opcode(result->opno);
+       result->args = list_make2(first_arg, rexpr);
+       result->location = location;
+       return (Node *) result;
+}
+
 /*
  * ParseComplexProjection -
  *       handles function calls with a single argument that is of complex type.
diff --git a/src/include/catalog/pg_operator.dat 
b/src/include/catalog/pg_operator.dat
index 0e7511dde1..0ef9d80357 100644
--- a/src/include/catalog/pg_operator.dat
+++ b/src/include/catalog/pg_operator.dat
@@ -3154,7 +3154,7 @@
   oprname => '*', oprleft => 'anyrange', oprright => 'anyrange',
   oprresult => 'anyrange', oprcom => '*(anyrange,anyrange)',
   oprcode => 'range_intersect' },
-{ oid => '3962', descr => 'get json object field',
+{ oid => '3962', oid_symbol => 'OID_JSON_OBJECT_FIELD_OP', descr => 'get json 
object field',
   oprname => '->', oprleft => 'json', oprright => 'text', oprresult => 'json',
   oprcode => 'json_object_field' },
 { oid => '3963', descr => 'get json object field as text',
@@ -3172,7 +3172,7 @@
 { oid => '3967', descr => 'get value from json as text with path elements',
   oprname => '#>>', oprleft => 'json', oprright => '_text', oprresult => 
'text',
   oprcode => 'json_extract_path_text' },
-{ oid => '3211', descr => 'get jsonb object field',
+{ oid => '3211', oid_symbol => 'OID_JSONB_OBJECT_FIELD_OP', descr => 'get 
jsonb object field',
   oprname => '->', oprleft => 'jsonb', oprright => 'text', oprresult => 
'jsonb',
   oprcode => 'jsonb_object_field' },
 { oid => '3477', descr => 'get jsonb object field as text',
diff --git a/src/include/parser/parse_type.h b/src/include/parser/parse_type.h
index b62e7a6ce9..9c8b3bfb2f 100644
--- a/src/include/parser/parse_type.h
+++ b/src/include/parser/parse_type.h
@@ -57,5 +57,6 @@ extern bool parseTypeString(const char *str, Oid *typeid_p, 
int32 *typmod_p,
 
 /* true if typeid is composite, or domain over composite, but not RECORD */
 #define ISCOMPLEX(typeid) (typeOrDomainTypeRelid(typeid) != InvalidOid)
+#define ISJSON(typeid) (typeid == JSONOID || typeid == JSONBOID)
 
 #endif                                                 /* PARSE_TYPE_H */
diff --git a/src/test/regress/expected/json.out 
b/src/test/regress/expected/json.out
index 7df11c2f38..39bfa724c2 100644
--- a/src/test/regress/expected/json.out
+++ b/src/test/regress/expected/json.out
@@ -2713,3 +2713,70 @@ select ts_headline('[]'::json, tsquery('aaa & bbb'));
  []
 (1 row)
 
+-- simple dot notation
+drop table if exists test_json_dot;
+NOTICE:  table "test_json_dot" does not exist, skipping
+create table test_json_dot(id int, test_json json);
+insert into test_json_dot select 1, '{"a": 1, "b": 42}'::json;
+insert into test_json_dot select 1, '{"a": 2, "b": {"c": 42}}'::json;
+insert into test_json_dot select 1, '{"a": 3, "b": {"c": "42"}, "d":[11, 
12]}'::json;
+insert into test_json_dot select 1, '{"a": 3, "b": {"c": "42"}, "d":[{"x": 
[11, 12]}, {"y": [21, 22]}]}'::json;
+-- member object access
+select (test_json_dot.test_json).b, json_query(test_json, 'lax $.b' WITH 
CONDITIONAL WRAPPER NULL ON EMPTY NULL ON ERROR) as expected from test_json_dot;
+      b      |  expected   
+-------------+-------------
+ 42          | [42]
+ {"c": 42}   | {"c": 42}
+ {"c": "42"} | {"c": "42"}
+ {"c": "42"} | {"c": "42"}
+(4 rows)
+
+select (test_json_dot.test_json).b.c, json_query(test_json, 'lax $.b.c' WITH 
CONDITIONAL WRAPPER NULL ON EMPTY NULL ON ERROR) as expected from test_json_dot;
+  c   | expected 
+------+----------
+      | 
+ 42   | [42]
+ "42" | ["42"]
+ "42" | ["42"]
+(4 rows)
+
+select (test_json_dot.test_json).d, json_query(test_json, 'lax $.d' WITH 
CONDITIONAL WRAPPER NULL ON EMPTY NULL ON ERROR) as expected from test_json_dot;
+                 d                  |              expected              
+------------------------------------+------------------------------------
+                                    | 
+                                    | 
+ [11, 12]                           | [11, 12]
+ [{"x": [11, 12]}, {"y": [21, 22]}] | [{"x": [11, 12]}, {"y": [21, 22]}]
+(4 rows)
+
+select (test_json_dot.test_json)."d", json_query(test_json, 'lax $.d' WITH 
CONDITIONAL WRAPPER NULL ON EMPTY NULL ON ERROR) as expected from test_json_dot;
+                 d                  |              expected              
+------------------------------------+------------------------------------
+                                    | 
+                                    | 
+ [11, 12]                           | [11, 12]
+ [{"x": [11, 12]}, {"y": [21, 22]}] | [{"x": [11, 12]}, {"y": [21, 22]}]
+(4 rows)
+
+select (test_json_dot.test_json).'d' from test_json_dot;
+ERROR:  syntax error at or near "'d'"
+LINE 1: select (test_json_dot.test_json).'d' from test_json_dot;
+                                         ^
+-- array element access
+select (test_json_dot.test_json).d->0 from test_json_dot;
+    ?column?     
+-----------------
+ 
+ 
+ 11
+ {"x": [11, 12]}
+(4 rows)
+
+select (test_json_dot.test_json).d[0], json_query(test_json, 'lax $.d[0]' WITH 
CONDITIONAL WRAPPER NULL ON EMPTY NULL ON ERROR) as expected from test_json_dot;
+ERROR:  cannot subscript type json because it does not support subscripting
+LINE 1: select (test_json_dot.test_json).d[0], json_query(test_json,...
+                ^
+select (test_json_dot.test_json).d[1], json_query(test_json, 'lax $.d[1]' WITH 
CONDITIONAL WRAPPER NULL ON EMPTY NULL ON ERROR) as expected from test_json_dot;
+ERROR:  cannot subscript type json because it does not support subscripting
+LINE 1: select (test_json_dot.test_json).d[1], json_query(test_json,...
+                ^
diff --git a/src/test/regress/expected/jsonb.out 
b/src/test/regress/expected/jsonb.out
index 7d163a156e..c30f3e174c 100644
--- a/src/test/regress/expected/jsonb.out
+++ b/src/test/regress/expected/jsonb.out
@@ -5715,3 +5715,58 @@ select 
'12345.0000000000000000000000000000000000000000000005'::jsonb::int8;
  12345
 (1 row)
 
+-- simple dot notation
+drop table if exists test_jsonb_dot;
+NOTICE:  table "test_jsonb_dot" does not exist, skipping
+create table test_jsonb_dot(id int, test_jsonb jsonb);
+insert into test_jsonb_dot select 1, '{"a": 1, "b": 42}'::json;
+insert into test_jsonb_dot select 1, '{"a": 2, "b": {"c": 42}}'::json;
+insert into test_jsonb_dot select 1, '{"a": 3, "b": {"c": "42"}, "d":[11, 
12]}'::json;
+-- member object access
+select (test_jsonb_dot.test_jsonb).b from test_jsonb_dot;
+      b      
+-------------
+ 42
+ {"c": 42}
+ {"c": "42"}
+(3 rows)
+
+select (test_jsonb_dot.test_jsonb).b.c from test_jsonb_dot;
+  c   
+------
+ 
+ 42
+ "42"
+(3 rows)
+
+select (test_json_dot.test_json).d from test_json_dot;
+                 d                  
+------------------------------------
+ 
+ 
+ [11, 12]
+ [{"x": [11, 12]}, {"y": [21, 22]}]
+(4 rows)
+
+select (test_json_dot.test_json)."d" from test_json_dot;
+                 d                  
+------------------------------------
+ 
+ 
+ [11, 12]
+ [{"x": [11, 12]}, {"y": [21, 22]}]
+(4 rows)
+
+select (test_json_dot.test_json).'d' from test_json_dot;
+ERROR:  syntax error at or near "'d'"
+LINE 1: select (test_json_dot.test_json).'d' from test_json_dot;
+                                         ^
+-- array element access
+select (test_jsonb_dot.test_jsonb).d[0] from test_jsonb_dot;
+ d  
+----
+ 
+ 
+ 11
+(3 rows)
+
diff --git a/src/test/regress/sql/json.sql b/src/test/regress/sql/json.sql
index 5c886cd6b3..f9b2f7a35d 100644
--- a/src/test/regress/sql/json.sql
+++ b/src/test/regress/sql/json.sql
@@ -867,3 +867,23 @@ select ts_headline('english', '{"a": "aaa bbb", "b": {"c": 
"ccc ddd fff", "c1":
 select ts_headline('null'::json, tsquery('aaa & bbb'));
 select ts_headline('{}'::json, tsquery('aaa & bbb'));
 select ts_headline('[]'::json, tsquery('aaa & bbb'));
+
+-- simple dot notation
+drop table if exists test_json_dot;
+create table test_json_dot(id int, test_json json);
+insert into test_json_dot select 1, '{"a": 1, "b": 42}'::json;
+insert into test_json_dot select 1, '{"a": 2, "b": {"c": 42}}'::json;
+insert into test_json_dot select 1, '{"a": 3, "b": {"c": "42"}, "d":[11, 
12]}'::json;
+insert into test_json_dot select 1, '{"a": 3, "b": {"c": "42"}, "d":[{"x": 
[11, 12]}, {"y": [21, 22]}]}'::json;
+
+-- member object access
+select (test_json_dot.test_json).b, json_query(test_json, 'lax $.b' WITH 
CONDITIONAL WRAPPER NULL ON EMPTY NULL ON ERROR) as expected from test_json_dot;
+select (test_json_dot.test_json).b.c, json_query(test_json, 'lax $.b.c' WITH 
CONDITIONAL WRAPPER NULL ON EMPTY NULL ON ERROR) as expected from test_json_dot;
+select (test_json_dot.test_json).d, json_query(test_json, 'lax $.d' WITH 
CONDITIONAL WRAPPER NULL ON EMPTY NULL ON ERROR) as expected from test_json_dot;
+select (test_json_dot.test_json)."d", json_query(test_json, 'lax $.d' WITH 
CONDITIONAL WRAPPER NULL ON EMPTY NULL ON ERROR) as expected from test_json_dot;
+select (test_json_dot.test_json).'d' from test_json_dot;
+
+-- array element access
+select (test_json_dot.test_json).d->0 from test_json_dot;
+select (test_json_dot.test_json).d[0], json_query(test_json, 'lax $.d[0]' WITH 
CONDITIONAL WRAPPER NULL ON EMPTY NULL ON ERROR) as expected from test_json_dot;
+select (test_json_dot.test_json).d[1], json_query(test_json, 'lax $.d[1]' WITH 
CONDITIONAL WRAPPER NULL ON EMPTY NULL ON ERROR) as expected from test_json_dot;
diff --git a/src/test/regress/sql/jsonb.sql b/src/test/regress/sql/jsonb.sql
index 5f0190d5a2..f2b800910d 100644
--- a/src/test/regress/sql/jsonb.sql
+++ b/src/test/regress/sql/jsonb.sql
@@ -1559,3 +1559,20 @@ select 
'12345.0000000000000000000000000000000000000000000005'::jsonb::float8;
 select '12345.0000000000000000000000000000000000000000000005'::jsonb::int2;
 select '12345.0000000000000000000000000000000000000000000005'::jsonb::int4;
 select '12345.0000000000000000000000000000000000000000000005'::jsonb::int8;
+
+-- simple dot notation
+drop table if exists test_jsonb_dot;
+create table test_jsonb_dot(id int, test_jsonb jsonb);
+insert into test_jsonb_dot select 1, '{"a": 1, "b": 42}'::json;
+insert into test_jsonb_dot select 1, '{"a": 2, "b": {"c": 42}}'::json;
+insert into test_jsonb_dot select 1, '{"a": 3, "b": {"c": "42"}, "d":[11, 
12]}'::json;
+
+-- member object access
+select (test_jsonb_dot.test_jsonb).b from test_jsonb_dot;
+select (test_jsonb_dot.test_jsonb).b.c from test_jsonb_dot;
+select (test_json_dot.test_json).d from test_json_dot;
+select (test_json_dot.test_json)."d" from test_json_dot;
+select (test_json_dot.test_json).'d' from test_json_dot;
+
+-- array element access
+select (test_jsonb_dot.test_jsonb).d[0] from test_jsonb_dot;
-- 
2.39.3 (Apple Git-146)

Attachment: v1-0001-Add-JSON-JSONB-simplified-accessor.patch
Description: Binary data

Reply via email to