I have two group operations. One is inside the CTE ( union select src_id, dest_id, min(dist) ), another is outside the CTE. Do you mean that even the grouping inside the CTE will be calculated only after the CTE has been calculated?
Thank you very much:) Best, Jing On Tue, Nov 5, 2013 at 5:04 AM, Albe Laurenz <laurenz.a...@wien.gv.at>wrote: > Jing Fan wrote: > > I use following command to get a shortest-path query: > > > > with recursive paths( src_id, dest_id, dist) as( > > select n1,n2,1 > > from nodes > > union > > select src_id, dest_id, min(dist) > > from ( select paths.src_id as src_id, nodes.n2 as dest_id, > paths.dist+1 as dist > > from paths, nodes > > where paths.dest_id=nodes.n1 > > and paths.src_id<>nodes.n2 > > ) as Temp > > group by src_id, dest_id > > ) > > select paths.src_id, paths.dest_id, min(dist) > > from paths > > group by 1,2; > > > > It seems that this query goes into infinite loops and finally run out of > disk space. However, I testrf > > every iteration seperately and found that it will converge after 3-4 > iterations. I wonder where is the > > problem. Could anyone help with it? The attatchment is the test data. > > The attached test data suggest different table and column names, > but I assume that you mean "edge" when you write "nodes" and > that columns "n1" and "n2" are really "src_id" and "dest_id". > > The endless loop occurs because there are loops in your > directed graph, but you only exclude circles where beginning > is equal to end. > > To quote three lines from your attachment: > INSERT INTO edge (src_id, dest_id) VALUES (1739, 6236); > INSERT INTO edge (src_id, dest_id) VALUES (6236, 1739); > INSERT INTO edge (src_id, dest_id) VALUES (3384, 6236); > > Your recursive WITH clause (CTE) will now happily produce: > src_id | dest_id | dist > --------+---------+------ > 3384 | 6236 | 1 > 3384 | 1739 | 2 > 3384 | 6236 | 3 > 3384 | 1739 | 4 > 3384 | 6236 | 5 > 3384 | 1739 | 6 > 3384 | 6236 | 7 > 3384 | 1739 | 8 > 3384 | 6236 | 9 > 3384 | 1739 | 10 > 3384 | 6236 | 11 > and so on to infinity, which is why you will eventually run > out of space. > > The grouping (and any other processing in your main query) > takes place only after the CTE has been calculated, so while > your query would in theory return the desired result, it does > so only after calculating infinitely many intermediate rows. > > One solution I can envision is to put a limit on the distance, > for example the total count of nodes. > > Yours, > Laurenz Albe >