Maybe this helps (single quotes for a Unix shell; under Windows cmd.exe use
double quotes around the code instead):

  perl -lne 'print if ++$D{$_} == 1' address.txt

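In case the switches are not familiar, here is roughly what that one-liner
does, written out as a script. It is only a sketch: dedupe() is a name I made
up, and the sample addresses stand in for the contents of address.txt. -n
loops over the input lines, and -l chomps each line on the way in and puts the
newline back on print, so a last line without a trailing newline is not
treated as a different address.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the given lines with duplicates dropped, keeping first occurrences.
# dedupe() is a hypothetical helper, not something from the original post.
sub dedupe {
    my %seen;
    my @out;
    for my $line (@_) {
        chomp(my $copy = $line);                  # what -l does to each input line
        push @out, $copy if ++$seen{$copy} == 1;  # same test as in the one-liner
    }
    return @out;
}

# Made-up sample data; the last entry has no trailing newline on purpose.
my @lines = ("a\@example.com\n", "b\@example.com\n", "a\@example.com");
print "$_\n" for dedupe(@lines);
```

With the sample data this prints a@example.com and b@example.com, one per line.
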
regards

2008/2/5, Rob Dixon <[EMAIL PROTECTED]>:
>
> boll wrote:
> > I'm trying to write a script to remove duplicate e-mail addresses from a
> > list.
> > I'd like some help understanding...
> > 1. Why does it remove all but one of the duplicate lines?
> > 2. How can I fix it?
> >
> > Thanks for any advice,
> > John
> > -------------------------------
> > #!/usr/bin/perl
> > use warnings;
> > use strict;
> >
> > open ALLNAMES, "emails.txt" or die "File: infile failed to open: $!\n";
> > my @allnames = <ALLNAMES>;
> >
> > my %seen = ();
> > my @unique = grep { ! $seen{ $_ }++ } @allnames;
> >
> > print "@unique";
> >
> > close ALLNAMES or die "cannot close infile";
> > -----------------------------------------
> > here's a small test file with fourteen lines, but only ten unique lines:
> >
> > [EMAIL PROTECTED]
> > [EMAIL PROTECTED]
> > [EMAIL PROTECTED]
> > [EMAIL PROTECTED]
> > [EMAIL PROTECTED]
> > [EMAIL PROTECTED]
> > [EMAIL PROTECTED]
> > [EMAIL PROTECTED]
> > [EMAIL PROTECTED]
> > [EMAIL PROTECTED]
> > [EMAIL PROTECTED]
> > [EMAIL PROTECTED]
> > [EMAIL PROTECTED]
> > [EMAIL PROTECTED]
> >
> > -------------------------------
>
> I would guess that your output includes the last line of the file when
> you don't expect it to. You are retaining the newline character at the
> end of each line. If the final line doesn't have a newline at the end it
> will appear different from the ones that do, and so will be listed in
> the output. To fix this just
>
>   my @allnames = <ALLNAMES>;
>   chomp @allnames;
>
> and then
>
>   print "$_\n" foreach @unique;
>
> at the end.
>
> HTH,
>
> Rob
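
A small sketch of the effect Rob describes, in case it helps to see it.
count_unique() and the sample lines are made up for illustration; the last
sample line deliberately has no trailing newline, so without chomp it counts
as a separate entry.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count distinct lines with the same grep/%seen idiom as the original script,
# optionally chomping first. count_unique() is a hypothetical helper.
sub count_unique {
    my ($do_chomp, @lines) = @_;
    chomp @lines if $do_chomp;     # strip trailing newlines from every element
    my %seen;
    return scalar(grep { !$seen{$_}++ } @lines);
}

# "one\n" and "one" differ only by the trailing newline.
my @lines = ("one\n", "two\n", "one");

printf "without chomp: %d\n", count_unique(0, @lines);   # prints 3
printf "with chomp:    %d\n", count_unique(1, @lines);   # prints 2
```
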