Hi Ashutosh, Thanks for the new patch set.
On Thu, Aug 7, 2025 at 6:47 PM Ashutosh Bapat <ashutosh.bapat....@gmail.com> wrote: > > Hi Amit, > Thanks for your comments > > On Wed, Jul 23, 2025 at 2:08 PM Amit Langote <amitlangot...@gmail.com> wrote: > > > > Hi Ashutosh, > > > > On Tue, Jul 22, 2025 at 8:40 PM Ashutosh Bapat > > <ashutosh.bapat....@gmail.com> wrote: > > > Here's the patchset rebased on the latest master. > > > > Thanks for the update. I was going through the patch series today and > > had a few comments on the structure that might help with reviewing and > > eventually committing, whoever ends up doing that. :) > > > > I noticed that some functions, like graph_table_property_reference(), > > are introduced in one patch and then renamed or removed in a later > > one. It’s easier to review if such refactoring is avoided across > > patches -- ideally, each patch should introduce code in its final > > intended form. That also helps reduce the number of patches and makes > > each one more self-contained. > > > > One of the later patches (0004) collects several follow-up changes. > > While it fixes a real bug -- a crash when GRAPH_TABLE is used inside > > plpgsql due to a conflict with the columnref hook -- it also includes > > incidental cleanups like switching to foreach_node(), updating > > comments, and adjusting function signatures. Those would be better > > folded into the patches that introduced the relevant code, rather than > > grouped into a catch-all at the end. That keeps each patch focused and > > easier to review -- and easier to merge when committing. > > > > A general principle that might help: if you're refactoring existing > > code, a standalone preliminary patch makes sense. But if you're > > tweaking something just added in the same series, it’s usually better > > to squash that into the original patch. The rename from > > graph_table_property_reference() to transformGraphTablePropertyRef() > > may be a fair exception since it reflects a shift prompted by the bug > > fix -- but most other adjustments could be folded in without loss of > > clarity. > > > > I understand the intent to spell out each change, but if the details > > are incidental to the overall design, they don’t necessarily need to > > be split out. Explaining the reasoning in the thread is always > > helpful, but consolidating the patches once the design has settled > > makes things easier for both reviewers and committers. > > 0001 is almost the same patch that Peter posted almost a year ago. > Each of 0002 onwards patches contains logically grouped changes. I > have changed 0001 minimally (so that it continues to apply, compile > and pass regression). I think this arrangement will make it easy for > Peter to review the changes. It will help him understand what all > changed since 0001 and each change in its own context. That has led to > a lot of overlap between 0002 and 0001 which I think should be > squashed together even now. But I would like to know what Peter thinks > - what would make his review easier. This arrangement might also help > Junwang, who has reviewed some patches, to continue his incremental > review. > > I have rearranged the patches 0002 to 0014 so that the same change > isn't revised in multiple patches. Hope that helps. > > I am also attaching a single patch as well to this email, so that > newer reviewers can review the changes as a whole. > > SQL_PGQ_20250807.zip is separate patches zipped together. > SQL_PGQ_one_patch_20250807.zip is a single diff with changes from all > the patches squashed together. It's still zipped to avoid the email > being held for moderation. > > > > > Finally, attaching the patches directly, with versioned names like > > v8-000x, instead of zipping them helps. Many folks (myself included) > > will casually skip a zip file because of the small hassle of unzipping > > just to read a patch. I once postponed reading such a patch and didn’t > > get back to it for quite a while. :) > > I used to attach those as separate patches till [1], but after that > the total size of the patchset grew so much that my emails used to get > held back for moderation. So I started attaching zip files. The > increase in size is mostly due to the 0014 patch. If we think that > it's not needed or can be trimmed down, we will be able to attach the > patchset as separate attachments. > > Here's what changed between the last patchset and this > > 0001 has some TODOs, FIXMEs and XXXs. I and Junwang had fixed some of > them earlier. This patchset fixes some more. Below is my analysis of > each of them. > + > + <!-- TODO: multiple path patterns in a graph pattern (comma-separated) --> > + > + <!-- TODO: quantifiers --> > > I think we should remove these TODOs since we are not going to support > these cases in this patchset. We will add the documentation when we > implement these features. > > +/* > + * 15.2 > + * PG_DEFINED_LABEL_SETS view > + */ > + > +-- TODO > + > + > +/* > + * 15.3 > + * PG_DEFINED_LABEL_SET_LABELS view > + */ > + > +-- TODO > + > + > +/* > + * 15.4 > + * PG_EDGE_DEFINED_LABEL_SETS view > + */ > + > +-- TODO > + > +/* > + * 15.6 > + * PG_EDGE_TRIPLETS view > + */ > + > +-- TODO > + > +/* > + * 15.15 > + * PG_VERTEX_DEFINED_LABEL_SETS view > + */ > + > +-- TODO > > All these views are related to the defined label set. IIUC, defined > label set is a set of labels which is shared by multiple edges or > vertices of a property graph. Please note it's the whole set shared; > not just individual labels in the set. In that sense, I think, the > labels associated with each edge or vertex table in a property graph > form a defined label set. That way the property graph element's OID > becomes the defined label set's identifier. I am not sure if this view > has any practical use for tabular property graphs. It's not clear to > me what these views will contain. Since they are not essential for > SQL/PGQ functionality, I have not implemented them. > > + /* > + * Now check number and names. > + * > + * XXX We could provide more detail in the error messages, but that would > + * probably only be useful for some ALTER commands, because otherwise it's > + * not really clear which label definition is the wrong one, and so you'd > + * have to construct a rather verbose report to be of any use. Let's keep > + * it simple for now. > + */ > > Agree with XXX. I think the code is good enough as is. We can leave > XXX there in case we expect someone to improve it in future. Otherwise > we should remove XXX as well. > > + > + rel = table_open(PropgraphLabelPropertyRelationId, RowShareLock); > + ScanKeyInit(&key[0], > + Anum_pg_propgraph_label_property_plppropid, > + BTEqualStrategyNumber, F_OIDEQ, > + ObjectIdGetDatum(propoid)); > + scan = systable_beginscan(rel, InvalidOid /* FIXME */ , > + true, NULL, 1, key); > > I think the FIXME is to add the correct index OID. > pg_propgraph_label_property_label_prop_index contains plppropid as a > key but it's not the first column. So probably some other, yet not > defined, index is expected here? > > +ALTER PROPERTY GRAPH g3 ALTER VERTEX TABLE t3 DROP LABEL t3l3; > +ALTER PROPERTY GRAPH g3 DROP VERTEX TABLES (t2); -- fail (TODO: > dubious error message) > +ERROR: cannot drop vertex t2 of property graph g3 because other > objects depend on it > +DETAIL: edge e1 of property graph g3 depends on vertex t2 of property graph > g3 > +HINT: Use DROP ... CASCADE to drop the dependent objects too. > > I don't see what's dubious about this error message. edge e1 connects > t2 and t1. Dropping t2 would leave e1 dangling; error is preventing > it. Looks similar to how we prevent a referenced table being dropped. > Or is it expected that e1 be dropped without specifying cascade when > t2 is dropped since e1 has no existence without t2? > > Also the patchset fixes the CI failure from the previous patchset. > > The patches are reordered slightly. Each patch contains a commit > message that explains the changes in that patch. > > [1] > https://www.postgresql.org/message-id/caexhw5sm+zgvzr1428fsdhuwc-fu2-6zqr5j91klh6vzawy...@mail.gmail.com > > > > > > -- > Best Wishes, > Ashutosh Bapat I have some review comments, and hope some of them are helpful. 1. doc/src/sgml/ddl.sgml +<programlisting> +CREATE PROPERTY GRAPH myshop + VERTEX TABLES ( + products LABEL product, + customers LABEL customer, + orders LABEL order + ) + EDGE TABLES ( + order_items SOURCE orders DESTINATION products LABEL contains, + customer_orders SOURCE customers DESTINATION orders LABEL has + ); +</programlisting> order is a reserved keyword, so the following example will fail, we should not give a wrong example in our document. 2. doc/src/sgml/information_schema.sgml + <title><literal>pg_element_table_key_columns</literal></title> + + <para> + The view <literal>pg_element_key_columns</literal> identifies which columns + are part of the keys of the element tables of property graphs defined in + the current database. Only those property graphs are shown that the + current user has access to (by way of being the owner or having some + privilege). + </para> + <title><literal>pg_element_table_properties</literal></title> + + <para> + The view <literal>pg_element_table_labels</literal> shows the definitions + of the properties for the element tables of property graphs defined in the + current database. Only those property graphs are shown that the current + user has access to (by way of being the owner or having some privilege). + </para> the <title> and the <para> doesn't match. 3. doc/src/sgml/queries.sgml + <para> + A path pattern thus matches a sequence of vertices and edges. The + simplest possible path pattern is +<programlisting> +() +</programlisting> + which matches a single vertex. The next simplest pattern would be +<programlisting> +()-[]->() +</programlisting> + which matches a vertex followed by an edge followed by a vertex. The + characters <literal>()</literal> are a vertex pattern and the characters + <literal>-[]-></literal> are an edge pattern. + </para> the description seems wrong, when a user writes (), it should match all vertices in a property graph, for example: explain SELECT customer_name FROM GRAPH_TABLE (myshop MATCH () COLUMNS (1 AS customer_name)); [local] zjw@postgres:5432-73666=# explain SELECT customer_name FROM GRAPH_TABLE (myshop MATCH () COLUMNS (1 AS customer_name)); QUERY PLAN ------------------------------------------------------------------ Append (cost=0.00..89.40 rows=3960 width=4) -> Seq Scan on customers (cost=0.00..18.50 rows=850 width=4) -> Seq Scan on orders (cost=0.00..32.60 rows=2260 width=4) -> Seq Scan on products (cost=0.00..18.50 rows=850 width=4) (4 rows) 4. doc/src/sgml/ref/create_property_graph.sgml +<programlisting> +CREATE PROPERTY GRAPH g1 + VERTEX TABLES ( + v1 LABEL foo PROPERTIES (x AS a * 2) LABEL bar PROPERTIES (x AS a * 2) + ) ... +</programlisting> + but this would not: +<programlisting> +CREATE PROPERTY GRAPH g1 + VERTEX TABLES ( + v1 LABEL foo PROPERTIES (x AS a * 2) LABEL bar PROPERTIES (x AS a * 10) + ) ... +</programlisting></para> The expr after PROPERTIES should be (a * 2 AS x) 5. src/backend/catalog/sql_features.txt +G034 Path concatenation YES SQL/PGQ required Do we support path concatenation? +G070 Label expression: label disjunction NO SQL/PGQ required I think this should be a YES? 6. src/backend/commands/tablecmds.c The description RenameRelation and RemoveRelations should adapt property graph, below are diffs I suggest: /* - * Execute ALTER TABLE/INDEX/SEQUENCE/VIEW/MATERIALIZED VIEW/FOREIGN TABLE + * Execute ALTER TABLE/INDEX/SEQUENCE/VIEW/MATERIALIZED VIEW/FOREIGN TABLE/PROPERTY GRAPH * RENAME */ ObjectAddress @@ -4210,8 +4210,8 @@ RenameRelation(RenameStmt *stmt) /* * Grab an exclusive lock on the target table, index, sequence, view, - * materialized view, or foreign table, which we will NOT release until - * end of transaction. + * materialized view, foreign table or property graph, which we will NOT + * release until end of transaction. /* * RemoveRelations * Implements DROP TABLE, DROP INDEX, DROP SEQUENCE, DROP VIEW, - * DROP MATERIALIZED VIEW, DROP FOREIGN TABLE + * DROP MATERIALIZED VIEW, DROP FOREIGN TABLE, DROP PROPERTY GRAPH 7. src/backend/nodes/nodeFuncs.c exprLocation should have an arm for T_GraphPropertyRef? I suggest: @@ -1811,6 +1811,9 @@ exprLocation(const Node *expr) case T_PartitionRangeDatum: loc = ((const PartitionRangeDatum *) expr)->location; break; + case T_GraphPropertyRef: + loc = ((const GraphPropertyRef *) expr)->location; + break; 8. src/backend/rewrite/rewriteGraphTable.c patch 0002 + /* + * Label expressions from two element patterns need to be + * conjucted. Label conjuction is not supported right now typo, should be conjuncted and conjunction. 9. The line length of some code is quite long, which isn't ideal, but it's not a major issue. It can be formatted before being committed. BTW, the patch set seems not to apply to the current master. I will go another round of review of the test cases, thanks. -- Regards Junwang Zhao