I wouldn't do this with recursion; plain old iteration is your friend (yes, WITH RECURSIVE is actually iterative, not recursive...)
The algorithm goes like this: 1. Extend your graph relation to be symmetric and transitive. 2. Assign a integer group id to each node. 3. Repeatedly join the node list to the (extended) relation, updating each node's group id to be the minimum of the group ids of every node it touches. 4. Stop when the group ids stop changing. Here's some example code, using your data: DROP SCHEMA IF EXISTS test CASCADE; CREATE SCHEMA test; SET SEARCH_PATH TO test; CREATE TABLE graph(key1 TEXT, key2 TEXT); INSERT INTO graph VALUES ('a', 'x'), ('a', 'y'), ('b', 'w'), ('c', 't'), ('x', 'a'), ('y', 'a'), ('y', 'z'), ('z', 'y'), ('t', 'c'), ('w', 'b'), ('w', 'd'), ('d', 'w'); DO $$ DECLARE prev INT = 0; curr INT; BEGIN CREATE TABLE rel AS SELECT key1, key2 FROM graph UNION SELECT key2, key1 FROM graph UNION SELECT key1, key1 FROM graph UNION SELECT key2, key2 FROM graph; CREATE TABLE group_ids AS SELECT key, ROW_NUMBER() OVER (ORDER BY key) AS group_id FROM ( SELECT key1 AS key FROM graph UNION SELECT key2 FROM graph ) _; SELECT SUM(group_id) INTO curr FROM group_ids; WHILE prev != curr LOOP prev = curr; DROP TABLE IF EXISTS min_ids; CREATE TABLE min_ids AS SELECT a.key, MIN(c.group_id) AS group_id FROM group_ids a INNER JOIN rel b ON a.key = b.key1 INNER JOIN group_ids c ON b.key2 = c.key GROUP BY a.key; UPDATE group_ids SET group_id = min_ids.group_id FROM min_ids WHERE group_ids.key = min_ids.key; SELECT SUM(group_id) INTO curr FROM group_ids; END LOOP; DROP TABLE IF EXISTS rel; DROP TABLE IF EXISTS min_ids; END $$; SELECT * FROM group_ids; Hope it helps, Matthew > I have a collection of relationship rows of the form > > Table: graph > key1 varchar > key2 varchar > > A row of the form ('a','b') indicates that 'a' and 'b' are related. > The table contains many relationships between keys, forming several > disjoint sets. All relationships are bi-directional, and both > directions are present. I.e. the table contains a set of disjoint > graphs specified as node pairs. > > For example the set of values > > key1 key2 > ----- ----- > a x > a y > b w > c t > x a > y a > y z > z y > t c > w b > w d > d w > > defines three disjoint groups of connected keys: > > a x y z > c t > b w d > > What I would like to achieve is a single SQL query that returns > > group key > ----- --- > 1 a > 1 x > 1 y > 1 z > 2 c > 2 t > 3 b > 3 w > 3 d > > I don't care about preserving the node-to-node relationships, only > the group membership for each node. > > I've been playing with "WITH RECURSIVE" CTEs but haven't had any > success. I'm not really sure how to express what I want in SQL, and > it's not completely clear to me that recursive CTEs will help here. > Also I'm not sure how to generate the sequence numbers for the groups > > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general > -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general