Following up on my Dec 24 email - three features completed: 1. LABELS() function - Returns text[] of element labels - Fixed privilege checking from previous version - Enables optimizer pushdown for branch pruning
2. PROPERTY_NAMES() function - Returns text[] of property names - Similar approach to LABELS() 3. Multiple path patterns - Syntax: MATCH (a)->(b), (b)->(c) - Implements SQL/PGQ standard feature (was TODO) Patches attached. Best regards, Henson Choi --- Background: Currently at Inzent (PostgreSQL-based TDE). Previously at AgensGraph (PostgreSQL-based graph DB, 2017-2020). 2025년 12월 25일 (목) AM 1:23, Henson Choi <[email protected]>님이 작성: > Subject: Re: SQL Property Graph Queries (SQL/PGQ) > > Hi hackers, > > Apologies for jumping into the SQL/PGQ discussion mid-stream. I've been > working on implementing the LABELS() graph element function and ran into > an issue I'd like to get some guidance on. > > == What LABELS() does == > > LABELS(element_variable) returns a text[] array containing all labels > associated with a graph element. For example: > > SELECT * FROM GRAPH_TABLE (myshop > MATCH (n IS customers) > COLUMNS (n.name, LABELS(n) AS lbls) > ); > > name | lbls > ------------+------------- > customer1 | {customers} > customer2 | {customers} > > == Implementation approach == > > A naive approach would return LABELS() as a constant (e.g., > ARRAY['customers'] > directly). However, this prevents the optimizer from pushing down > predicates > involving LABELS() through UNION ALL branches. > > To enable optimizer pushdown, I wrap each element table in a subquery that > adds a virtual __labels__ column containing a constant array. This way, > LABELS(n) returns a Var referencing that column, allowing the optimizer to > evaluate it per-branch and prune accordingly: > > -- Conceptually, the rewrite produces: > SELECT name, __labels__ AS lbls FROM ( > SELECT *, ARRAY['customers']::text[] AS __labels__ FROM customers > ) ... > > This allows the optimizer to perform constant folding and prune UNION ALL > branches when filtering by label: > > -- Filter in outer WHERE clause > EXPLAIN SELECT * FROM GRAPH_TABLE (myshop > MATCH (n IS lists) -- lists label shared by orders and wishlists > COLUMNS (LABELS(n) AS lbls, n.node_id) > ) WHERE 'orders' = ANY(lbls); > > QUERY PLAN > ------------------------------------------------------ > Seq Scan on orders > Output: '{orders,lists}'::text[], orders.order_id > > -- Filter in MATCH WHERE clause without IS (both tables scanned) > EXPLAIN SELECT * FROM GRAPH_TABLE (myshop > MATCH (n) WHERE 'lists' = ANY(LABELS(n)) > COLUMNS (LABELS(n) AS lbls, n.node_id) > ); > > QUERY PLAN > ------------------------------------------------------ > Append > -> Seq Scan on orders > Output: '{orders,lists}'::text[], orders.order_id > -> Seq Scan on wishlists > Output: '{lists,wishlists}'::text[], wishlists.wishlist_id > > The first example prunes the wishlists branch since 'orders' is not in its > label set. The second example uses LABELS() without an IS clause to filter > across all element tables that have the 'lists' label. > > == The problem == > > The subquery wrapping breaks column-level privilege checking. > > Test case from privileges.sql: > > -- regress_priv_user4 has SELECT on atest5(one, four) only > -- lttc label maps atest5.three -> lttck (no privilege) > > SET ROLE regress_priv_user4; > SELECT * FROM graph_table (ptg1 MATCH (v IS lttc) COLUMNS (v.lttck)); > -- Expected: ERROR: permission denied for table atest5 > -- Actual: query succeeds (returns rows without error) > > I haven't yet identified the exact root cause of this issue. Has anyone > encountered a similar issue? Any pointers would be appreciated. > > Thanks, > Henson > > 2025년 12월 18일 (목) PM 6:16, Ashutosh Bapat <[email protected]>님이 > 작성: > >> On Wed, Dec 17, 2025 at 2:28 PM Peter Eisentraut <[email protected]> >> wrote: >> > >> > On 17.12.25 06:32, Ashutosh Bapat wrote: >> > > On Mon, Dec 15, 2025 at 6:43 PM Ashutosh Bapat >> > > <[email protected]> wrote: >> > >> >> > >> Rebased patches on the latest HEAD which required me to move >> > >> graph_table.sql to another parallel group. >> > > >> > > Huh, the movement resulted in losing that test from parallel_schedule. >> > > Fixed in the attached patches. >> > >> > A couple of items: >> > >> > 1) I was running some tests that involved properties with mismatching >> > typmods, and I got an error message like >> > >> > ERROR: property "p1" data type modifier mismatch: 14 vs. 19 >> > >> > but the actual types were varchar(10) and varchar(15). So to improve >> > that, we need to run these through the typmod formatting routines, not >> > just print the raw typmod numbers. I actually just combined that with >> > the check for the type itself. Also, there was no test covering this, >> > so I added one. See attached patches. >> >> +1. The error message is better. >> >> > >> > I did another investigation about whether this level of checking is >> > necessary. I think according to the letter of the SQL standard, the >> > typmods must indeed match. It seems Oracle does not check (the example >> > mentioned above came from an Oracle source). But I think it's okay to >> > keep the check. In PostgreSQL, it is much less common to write like >> > varchar(1000). And we can always consider relaxing it later. >> >> +1. >> >> Attached patch adds a couple more test statements. >> >> > >> > 2) I had it in my notes to consider whether we should support the colon >> > syntax for label expressions. I think we might have talked about that >> > before. >> > >> > I'm now leaning toward not supporting it in the first iteration. I >> > don't know that we have fully explored possible conflicts with host >> > variable syntax in ecpg and psql and the like. Maybe avoid that for >> now. >> > >> >> I was aware of ecpg and I vaguely remember we fixed something in ECPG >> to allow : in MATCH statement. Probably following changes in >> psqlscan.l and pgc.l >> -self [,()\[\].;\:\+\-\*\/\%\^\<\>\=] >> +self [,()\[\].;\:\|\+\-\*\/\%\^\<\>\=] >> >> Those changes add | after : (and not the : itself) so maybe they are >> not about supporting : . Do you remember what those are? >> >> However, I see that : in psql can be a problem >> #create table t1 (a int, b int); >> #create property graph g1 vertex tables (t1 key (a)); >> #select * from GRAPH_TABLE (g1 MATCH (a :t1) COLUMNS (a.a)); >> a >> --- >> (0 rows) >> >> #\set t1 blank >> #select * from GRAPH_TABLE (g1 MATCH (a :t1) COLUMNS (a.a)); >> ERROR: syntax error at or near "blank" >> LINE 1: select * from GRAPH_TABLE (g1 MATCH (a blank) COLUMNS (a.a))... >> >> > There was also a bit of an inconsistency in the presentation: The >> > documentation introduced the colon as seemingly the preferred syntax, >> > but ruleutils.c dumped out the IS syntax. >> > >> > (It was also a bit curious that some test code put spaces around the >> > colon, which is not idiomatic.) >> > >> >> That might have been just me trying to type one letter less and yet >> keeping the query readable. A column between variable name and the >> label is not always noticeable. >> >> > Attached is a patch that shows how to revert the colon support. It's >> > pretty simple, and it would be easy to add it back in later, I think. >> >> I agree that it's better not to support it now. It requires more work >> and it's optional. When we come to support it, we will need thorough >> testing. >> >> I spotted some examples that use : in ddl.sgml. >> <programlisting> >> SELECT customer_name FROM GRAPH_TABLE (myshop MATCH >> (c:customer)-[:has]->(o:"order" WHERE o.ordered_when = current_date) >> COLUMNS (c.name AS customer_name)); >> </programlisting> >> >> The query demonstrates that one can use label names in a way that will >> make the pattern look like an English sentence. Replacing : with IS >> defeats that purpose. >> >> As written in that paragraph, the labels serve the purpose of exposing >> the table with a different logical view (using different label and >> property names). So we need that paragraph, but I think we should >> change the example to use IS instead of :. Attached is suggested >> minimal change, but I am not happy with it. Another possibility is we >> completely remove that paragraph; I don't think we need to discuss all >> possible usages the users will come up with. >> >> The patch changes one more instance of : by IS. But that's straight >> forward. >> >> In ddl.sgml I noticed a seemingly incomplete sentence >> A property graph is a way to represent database contents, instead of >> using >> relational structures such as tables. >> >> Represent the contents as what? I feel the complete sentence should be >> one of the following >> property graph is a way to represent database contents as a graph, >> instead of representing those as relational structures OR >> property graph is another way to represent database contents instead >> of using relational structures such as tables >> >> But I can't figure out what was originally intended. >> >> I have squashed 0002 from my earlier patchset and your 3 patches into >> 0001. >> >> 0002 has extra tests mentioned above. It also removes "TODO: dubious >> error message" from a comment. I don't see anything dubious in the >> error message. I think this patch is safe to be merged into 0001. >> >> 0003 is changed to ddl.sgml. Those need a bit more work as described >> above. >> >> -- >> Best Wishes, >> Ashutosh Bapat >> >
v20251226-0002-SQL-PGQ-Add-PROPERTY_NAMES-graph-element-function.patch
Description: Binary data
v20251226-0003-SQL-PGQ-Support-multi-pattern-path-matching-in-GRAPH.patch
Description: Binary data
v20251226-0001-SQL-PGQ-Add-LABELS-graph-element-function.patch
Description: Binary data
