Re: removing duplicates

obdulio santana Tue, 05 Feb 2008 11:41:06 -0800

Maybe this helps (single quotes for a Unix shell; use double quotes in Windows cmd.exe):

perl -lne 'print if ++$D{$_} == 1' emails.txt
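
In case the one-liner is too terse, here is roughly the same idea written out as a script. It is only a sketch: the sub name unique_lines and the sample addresses are made up for illustration, and %count plays the role of %D. The -l switch strips the newline from each input line, which is what the chomp does here, so a last line without a trailing newline still compares equal to its earlier copy.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the given lines with duplicates dropped, keeping the first
# occurrence of each line in its original position.
sub unique_lines {
    my @lines = @_;
    chomp @lines;    # like -l on input: strip the trailing newline, if any
    my %count;       # plays the role of %D in the one-liner
    return grep { ++$count{$_} == 1 } @lines;
}

# Made-up sample data; note that the last line has no newline.
my @sample = ("a\@example.com\n", "b\@example.com\n", "a\@example.com");
print "$_\n" for unique_lines(@sample);
```

The counter is 1 only the first time a line is seen, so every later copy is skipped.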


Regards,



2008/2/5, Rob Dixon <[EMAIL PROTECTED]>:
>
> boll wrote:
> > I'm trying to write a script to remove duplicate e-mail addresses from a
> > list.
> > I'd like some help understanding...
> > 1. Why does it remove all but one of the duplicate lines?
> > 2. How can I fix it?
> >
> > Thanks for any advice,
> > John
> > -------------------------------
> > #!/usr/bin/perl
> > use warnings;
> > use strict;
> >
> > open ALLNAMES, "emails.txt" or die "File: infile failed to open: $!\n";
> > my @allnames = <ALLNAMES>;
> >
> > my %seen = ();
> > my @unique = grep { ! $seen{ $_ }++ } @allnames;
> >
> > print "@unique";
> >
> > close ALLNAMES or die "cannot close infile";
> > -----------------------------------------
> > here's a small test file with fourteen lines, but only ten unique lines:
> >
> > [EMAIL PROTECTED]
> > [EMAIL PROTECTED]
> > [EMAIL PROTECTED]
> > [EMAIL PROTECTED]
> > [EMAIL PROTECTED]
> > [EMAIL PROTECTED]
> > [EMAIL PROTECTED]
> > [EMAIL PROTECTED]
> > [EMAIL PROTECTED]
> > [EMAIL PROTECTED]
> > [EMAIL PROTECTED]
> > [EMAIL PROTECTED]
> > [EMAIL PROTECTED]
> > [EMAIL PROTECTED]
> >
> > -------------------------------
>
> I would guess that your output includes the last line of the file when
> you don't expect it to. You are retaining the newline character at the
> end of each line. If the final line doesn't have a newline at the end it
> will appear different from the ones that do, and so will be listed in
> the output. To fix this, strip the newlines right after reading the file:
>
>    my @allnames = <ALLNAMES>;
>    chomp @allnames;
>
> and then
>
>    print "$_\n" foreach @unique;
>
> at the end.
>
> HTH,
>
> Rob
>
>
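
To see the effect Rob describes, here is a small sketch that runs the same grep with and without chomp. The three addresses are made up and stand in for emails.txt; the last one repeats the first but has no trailing newline, which is my assumption about what the real file looks like.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up data standing in for emails.txt: the last line repeats the
# first but has no trailing newline.
my @allnames = ("one\@example.com\n", "two\@example.com\n", "one\@example.com");

# Without chomp, "one@example.com\n" and "one@example.com" are different
# hash keys, so the final line survives as a false "unique" entry.
my %seen_raw;
my @unique_raw = grep { !$seen_raw{$_}++ } @allnames;

# With chomp, both copies reduce to the same key and only one is kept.
chomp @allnames;
my %seen;
my @unique = grep { !$seen{$_}++ } @allnames;

print scalar(@unique_raw), " lines kept without chomp\n";   # 3
print scalar(@unique), " lines kept with chomp\n";          # 2
print "$_\n" foreach @unique;
```

Printing each element with its own "\n" also avoids the leading space that print "@unique" puts before every line after the first, since interpolating an array joins its elements with a space.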
