Hi, 

I am looking at the documentation and how jsonpath works and I fall on the 
following example:

(On PostgreSQL 12.4, package for Debian 12.4-3):

SELECT jsonb_path_query(j, '$.**.a.**.b') FROM (SELECT 
'{"a":{"a":{"b":{"c":3}}}'::jsonb as j) as T;

The result:

 jsonb_path_query 
------------------
 {"c": 3}
 {"c": 3}
(2 lignes)


>From what in understand, it matches twice the last `b` with every `a` before. 
>For instance, adding one more 'a' on the tree:

 SELECT jsonb_path_query(j, '$.**.a.**.b') FROM (SELECT 
'{"a":{"a":{"a":{"b":{"c":3}}}}}'::jsonb as j) as T;
 jsonb_path_query 
------------------
 {"c": 3}
 {"c": 3}
 {"c": 3}

This could be a fine semantic for jsonpath query (one could call that the Bag 
semantic), 
however it implies that in some situation the number of output duplicated can 
become huge.
Actually in the worst case, it could be exponential in the depth of the tree.

On more example (for the fun):

SELECT Count(*) FROM (SELECT jsonb_path_query(j, 
'$.**.a.**.a.**.a.**.a.**.a.**.a.**.a.**.a.**.a.**.a.**.a.**.a.**.a.**.b') FROM 
(SELECT 
'{"a":{"a":{"a":{"a":{"a":{"a":{"a":{"a":{"a":{"a":{"a":{"a":{"a":{"a":{"a":{"a":{"a":{"a":{"a":{"a":{"a":{"a":{"a":{"a":{"a":{"a":{"a":{"b":{"c":3}}}}}}}}}}}}}}}}}}}}}}}}}}}}}'::jsonb
 as j) as T) as T2;

Takes 24s a returns 20058300 rows.

This semantic for jsonpath is not so classical (other implementation choice to 
return only once each selected element, in the SET semantic spirit) 
and differs from the semantic of xpath, as shown here, which follows the SET 
semantic:

SELECT xpath('//a//a//b', '<a><a><a><b>1</b></a></a></a>');

Returns: 

   xpath    
------------
 {<b>1</b>}


We can of course get the appropriate semantic by adding DISTINCT, but the 
initial list of rows is computed nevertheless (a O(n) algorithm exists, so it 
is avoidable).

I also wonder if it is problematic to have simple small query like that that 
can be design to make the database work endlessly?

Best, 

Charles Paperman
Maitre de Conférence à l'université de Lille



Reply via email to