I recently had occasion to use the pg_parse_json() function for creating
input functions for pg_ndistinct and pg_dependencies.

While it is obvious now that I should have been parsing with
need_escapes=true, it wasn't obvious from the outset, and so I'm proposing
adding a few comments to help the next person come to that conclusion
sooner than I did.
From 84a8990622a4b83729600d9db43e8678502548f5 Mon Sep 17 00:00:00 2001
From: Corey Huinker <corey.huin...@gmail.com>
Date: Fri, 20 Dec 2024 05:08:50 -0500
Subject: [PATCH v1] Explain impact of need_escapes in JSON parsing.

Adding some additional comments to help the reader discover the
purpose and effect of the need_escapes parameter.

Nearly every new use of pg_parse_json and pg_parse_json_incremental will
need need_escapes=true.
---
 src/include/common/jsonapi.h | 11 +++++++++++
 src/common/jsonapi.c         |  7 ++++++-
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/src/include/common/jsonapi.h b/src/include/common/jsonapi.h
index 167615a557..fa19005822 100644
--- a/src/include/common/jsonapi.h
+++ b/src/include/common/jsonapi.h
@@ -118,6 +118,11 @@ typedef struct JsonLexContext
 	struct jsonapi_StrValType *errormsg;
 } JsonLexContext;
 
+/*
+ * Function types for custom json parsing actions.
+ *
+ * fname and token will always be NULL if the context has need_escapes=false.
+ */
 typedef JsonParseErrorType (*json_struct_action) (void *state);
 typedef JsonParseErrorType (*json_ofield_action) (void *state, char *fname, bool isnull);
 typedef JsonParseErrorType (*json_aelem_action) (void *state, bool isnull);
@@ -197,6 +202,12 @@ extern JsonParseErrorType json_count_array_elements(JsonLexContext *lex,
  * struct is allocated.
  *
  * If need_escapes is true, ->strval stores the unescaped lexemes.
+ *
+ * Setting need_escapes to true is necessary if the operation needs
+ * to reference field names or scalar values. This is true of any
+ * operation beyond purely checking the json-validaity of the source
+ * document.
+ *
  * Unescaping is expensive, so only request it when necessary.
  *
  * If need_escapes is true or lex was given as NULL, then the caller is
diff --git a/src/common/jsonapi.c b/src/common/jsonapi.c
index 0e2a82ad7a..307182e478 100644
--- a/src/common/jsonapi.c
+++ b/src/common/jsonapi.c
@@ -1297,7 +1297,10 @@ parse_scalar(JsonLexContext *lex, const JsonSemAction *sem)
 		return result;
 	}
 
-	/* invoke the callback, which may take ownership of val */
+	/*
+	 * invoke the callback, which may take ownership of val.
+	 * val always NULL when need_escapes=false
+	 */
 	result = (*sfunc) (sem->semstate, val, tok);
 
 	if (lex->flags & JSONLEX_CTX_OWNS_TOKENS)
@@ -1349,6 +1352,7 @@ parse_object_field(JsonLexContext *lex, const JsonSemAction *sem)
 
 	if (ostart != NULL)
 	{
+		/* fname always NULL when need_escapes=false */
 		result = (*ostart) (sem->semstate, fname, isnull);
 		if (result != JSON_SUCCESS)
 			goto ofield_cleanup;
@@ -1370,6 +1374,7 @@ parse_object_field(JsonLexContext *lex, const JsonSemAction *sem)
 
 	if (oend != NULL)
 	{
+		/* fname always NULL when need_escapes=false */
 		result = (*oend) (sem->semstate, fname, isnull);
 		if (result != JSON_SUCCESS)
 			goto ofield_cleanup;
-- 
2.47.1

Reply via email to