FlashMX wrote:
> Hi,

Hello,

> I have a script that sorts the IDs with numerals first followed by
> alphas. Everything works ok except that there are some IDs in the
> input file that are duplicates. How can I "omit" any duplicate IDs
> from the output file?
>
> Below is my script and a sample input and output file generated
> from the below. Notice in the output file I have duplicate IDs for
> the following:
>
> 2, 5, 15, ab, bcm fa
>
> I only want the output file to contain the first occurrence and toss
> the rest out. How can I do this?

Use a hash.  :-)

> #!/usr/local/bin/perl
>
> require 5.000;
>
> my %tags = ();
>
> my $input = $ARGV[0];
> my $output = $ARGV[1];
>
> open (FILE, "< $input") or die "cannot open $input: $!\n";
> open (OUTPUTFILE, "> $output");

You should verify that OUTPUTFILE is valid like you do for FILE.

> chomp(my @lines = <FILE>);

Why chomp the input when you are just adding the newlines on output?

> my @chars = map {
>     my ($id) = m{<a id=(\w+)>};
>     [ $_, $id, scalar $id =~ /^\d+$/ ];
> } @lines;

my %seen;
my @chars = grep !$seen{$_->[1]}++, map {
    my ($id) = m{<a id=(\w+)>};
    [ $_, $id, scalar $id =~ /^\d+$/ ];
} @lines;

> my @sorted_chars = sort {
>     $b->[2] <=> $a->[2]
>         or
>     ($a->[2] ? $a->[1] <=> $b->[1] : $a->[1] cmp $b->[1])
>         or
>     $a->[0] cmp $b->[0]
> } @chars;
> my @result = map { $_->[0] } @sorted_chars;
> print OUTPUTFILE "$_\n" for @result;
> close OUTPUTFILE;
> close FILE;

You don't really need four different arrays:

my %seen;
print OUTPUTFILE
    map $_->[0],
    sort {
        $b->[2] <=> $a->[2]
            or
        ( $a->[2] ? $a->[1] <=> $b->[1] : $a->[1] cmp $b->[1] )
            or
        $a->[0] cmp $b->[0]
    }
    grep !$seen{$_->[1]}++,
    map {
        my ( $id ) = /<a id=(\w+)>/;
        [ $_, $id, scalar $id =~ /^\d+$/ ];
    }
    <FILE>;


John
-- 
use Perl;
program
fulfillment
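
For anyone who wants to try the %seen idiom without the original input file, here is a minimal, self-contained sketch of the same pipeline run on inline data. The six sample lines are invented for illustration and are not the poster's file, and the numeric flag is written as an explicit 1/0 (a small departure from the `scalar` match in the reply) so that the sketch stays quiet under `use warnings`.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample input (made up for this sketch): IDs "2" and "ab" repeat.
my @lines = (
    "<a id=ab>first ab\n",
    "<a id=15>fifteen\n",
    "<a id=2>first 2\n",
    "<a id=ab>second ab\n",
    "<a id=2>second 2\n",
    "<a id=b>b\n",
);

my %seen;    # counts how many times each ID has been encountered
my @result =
    map  { $_->[0] }                     # recover the original line
    sort {
        $b->[2] <=> $a->[2]              # numeric IDs before alphabetic ones
            or ( $a->[2] ? $a->[1] <=> $b->[1] : $a->[1] cmp $b->[1] )
            or $a->[0] cmp $b->[0]
    }
    grep { !$seen{ $_->[1] }++ }         # keep only the first line for each ID
    map  {
        my ($id) = /<a id=(\w+)>/;
        [ $_, $id, $id =~ /^\d+$/ ? 1 : 0 ];    # [ line, id, is-numeric flag ]
    } @lines;

print @result;
# Prints:
# <a id=2>first 2
# <a id=15>fifteen
# <a id=ab>first ab
# <a id=b>b
```

The grep runs before the sort, so "first occurrence" means first in input order: `$seen{$id}++` returns 0 (false) the first time an ID appears and a positive count afterwards, and the negation keeps only that first hit.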