I have a need for a Perl routine to do a topological sort (a la the Unix tsort command), and was unable to locate one in CPAN. (The closest I got was a perl version of the tsort command in the Perl Power Tools collection by Jeffrey Haemer, but is not set up as a subroutine, and does not identify the cycles when found).
Anyhow, it was easy enough to write a routine to meet my requirements, with some peaking at the GNU tsort C code, and I was considering contributing the module to CPAN. In interests of making the proposed module more useful, was interested if anyone has feedback re the interface. Still working on a good set of tests for the module and docs, so it is a good point at which to make changes in the API. I was proposing a module Algorithm::TopologicalSort. Currently consists of a single exported routine tsort, which accepts as input a list pairwise orderings, and returns as list consisting of the number of cycles found in the input, followed by a 'solution' to the topological sort. The solution consists of nodes of the inputted set in an order consistent with the given pairwise orderings. If cycles were detected, it is reflected in the cycle count, and at an appropriate place in the returned solution, instead of a single node there is a ref to an anon array consisting of the nodes comprising the cycle. E.g., my ($cyc,@ordered) = tsort 'a'=>'b','b'=>'c','c'=>'d', 'd'=>'e','c'=>'b'; would return $cyc=1 and @ordered = ( 'a', ['b','c'], 'd', 'e' ); (actually, the order of 'b' and 'c' in anon array is unspecified). The only option right now is a global $AlwaysUseAnonArrays which if set will cause even singleton nodes to be put in an anon array, so in above example will return @ordered = ( ['a'], ['b', 'c'], ['d'], ['e']) This is primiarily intended for use if the nodes are not simple scalars (I have not done anything else to make it work with other than simple scalars, but believe Perl's stringification of references will allow such to work.) The algorithm currently only does breadth first sorts, debating as to whether should add a depth first option (e.g. whether 'a'=>'e','c'=>'d', 'a'=>'b' should return 'a','c','b','d','e' or 'a','b','e','c','d'). In particular, looking for any advice/suggestions on making the input/output more Perlish. One idea was an alternate input format consisting of a list of lists, so the 1st example could be simplified to ($cyc, @ordered) = tsort [ 'a','b','c','d','e' ] , ['c','b']; with the first anon array being an abbreviated form for 'a'=>'b', 'b'=>'c', etc. If we insist on simple scalar nodes, this could even be overloaded (and/or allow a mixture of two formats, e.g., allow omission of anon array brackets around ['c','b'] in above). Any feedback appreciated. Tom Payerle Dept of Physics [EMAIL PROTECTED] University of Maryland (301) 405-6973 College Park, MD 20742-4111 Fax: (301) 314-9525