On Sat, 25 Sep 2010 21:05:13 -0700 (PDT), Xah Lee <xah...@gmail.com> wrote:
>here's a interesting toy list processing problem. > >I have a list of lists, where each sublist is labelled by >a number. I need to collect together the contents of all sublists >sharing >the same label. So if I have the list > >((0 a b) (1 c d) (2 e f) (3 g h) (1 i j) (2 k l) (4 m n) (2 o p) (4 q >r) (5 s t)) > >where the first element of each sublist is the label, I need to >produce: > >output: >((a b) (c d i j) (e f k l o p) (g h) (m n q r) (s t)) > [snip] >anyone care to give a solution in Python, Perl, javascript, or other >lang? am guessing the scheme solution can be much improved... perhaps >using some lib but that seems to show scheme is pretty weak if the lib >is non-standard. > Crossposting to Lisp, Python and Perl because the weird list of lists looks like Lisp or something else, and you mention other languages so I'm throwing this out for Perl. It appears this string you have there is actually list syntax in another language. If it is, its the job of the language to parse the data out. Why then do you want to put it into another language form? At runtime, once the data is in variables, dictated by the syntax, you can do whatever data manipulation you want (combining arrays, etc..). So, in the spirit of a preprocessor, given that the text is balanced, with proper closure, ie: ( (data) (data) ) is ok. ( data (data) ) is not ok. the below does simple text manipulation, joining like labeled sublists, without going into the runtime guts of internalizing the data itself. Internally, this is too simple. -sln ----------------- Alternate input: ( ( (0 a b) (1 c d) (2 e f ) ) (3 g h) (1 i j) (2 k l) (4 m n) (2 o p) (4 q r) (5 s t) ) ------------------ use strict; use warnings; my $input = <<EOI; ((0 a b) (1 c d) (2 e f) (3 g h) (1 i j) (2 k l) (4 m n) (2 o p) (4 q r) (5 s t)) EOI my $output = $input; my $regxout = qr/ ( (?: \( \s* [^()]+ \s* \) (\s*) )+ ) /x; $output =~ s{ $regxout } { my ( $list, $format ) = ( $1, $2 ); my ( %hseen, @order, $replace ); while ($list =~ /\(\s* (\S+) \s* (.+?) \s*\)/xsg) { if ( exists $hseen{$1} ) { $hseen{$1} .= " $2"; next; } push @order, $1; $hseen{$1} = $2; } for my $id (@order) { $replace .= "($hseen{$id}) "; } $replace =~ s/ $//; $replace . $format }xeg; print "Input -\n$input\n"; print "Output -\n$output"; __END__ Input - ((0 a b) (1 c d) (2 e f) (3 g h) (1 i j) (2 k l) (4 m n) (2 o p) (4 q r) (5 s t)) Output - ((a b) (c d i j) (e f k l o p) (g h) (m n q r) (s t)) -- http://mail.python.org/mailman/listinfo/python-list