On Feb 5, 2008 1:18 AM, boll <[EMAIL PROTECTED]> wrote:
> I'm trying to write a script to remove duplicate e-mail addresses from a
> list.
> I'd like some help understanding...
> 1. Why does it remove all but one of the duplicate lines?
snip

Because that is what the code says to do.  It says to print any line
it hasn't seen before.  It isn't looking forward to see if the line
may exist again in the list.
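
The script itself is snipped above, so as a rough sketch of that
behaviour (the sub name first_occurrences and the example.com addresses
are made up for illustration, they are not from the original script):

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Hypothetical helper: keep the first copy of every item and drop
# later repeats. %seen records what has already been passed through.
sub first_occurrences {
    my %seen;
    return grep { !$seen{$_}++ } @_;
}

# Made-up sample data; a@example.com appears twice.
my @list = qw(a@example.com b@example.com a@example.com);

# Prints a@example.com and b@example.com, one per line.
print "$_\n" for first_occurrences(@list);
```

The repeated address still comes out once, because the first time it
is seen it is new, and only the later copies are skipped.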

snip
> 2. How can I fix it?
snip

Well, I think you should think about it for a second.  Do you really
want to throw away every line that has a duplicate?  For instance:

[EMAIL PROTECTED]
[EMAIL PROTECTED]
[EMAIL PROTECTED]

When I look at that I want

[EMAIL PROTECTED]
[EMAIL PROTECTED]

not

[EMAIL PROTECTED]

But if you really want the latter then you will need to keep a running
count of how many times you have seen each email and only print out the
ones you have seen exactly once:

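#!/usr/bin/perl

use strict;
use warnings;

# count how many times each line appears in the input
my %seen;
$seen{$_}++ while <>;

# print only the lines that appeared exactly once
# (keys returns them in no particular order)
print grep { $seen{$_} == 1 } keys %seen;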
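
If the original order of the list matters, one possible variation is to
count first and then filter the list itself rather than the hash keys.
This is only a sketch: the sub name singletons and the example.com
addresses are made up for illustration.

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Hypothetical helper: return only the items that occur exactly once,
# in the order in which they appear in the input list.
sub singletons {
    my @items = @_;
    my %count;
    $count{$_}++ for @items;                  # first pass: count everything
    return grep { $count{$_} == 1 } @items;   # second pass: keep the singles
}

# Made-up sample data; a@example.com appears twice.
my @list = qw(a@example.com b@example.com a@example.com c@example.com);

# Prints b@example.com and c@example.com, one per line.
print "$_\n" for singletons(@list);
```

This needs the whole list in memory, which the hash-only version above
also does, so the cost is about the same for a list of addresses.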
