----- Original Message -----
From: Casey West <[EMAIL PROTECTED]>
To: M.W. Koskamp <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>; cherukuwada subrahmanyam <[EMAIL PROTECTED]>;
<[EMAIL PROTECTED]>
Sent: Wednesday, May 02, 2001 6:45 PM
Subject: Re: eliminating duplicate lines in a file
> On Wed, May 02, 2001 at 07:39:03PM +0200, M.W. Koskamp wrote:
> :
> : ----- Original Message -----
> : From: Paul <[EMAIL PROTECTED]>
> : To: cherukuwada subrahmanyam <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
> : Sent: Wednesday, May 02, 2001 7:08 PM
> : Subject: Re: eliminating duplicate lines in a file
> :
> :
> : >
> : > --- cherukuwada subrahmanyam <[EMAIL PROTECTED]> wrote:
> : > > Hi,
> : > > I am reading a flat text file of 100000 lines. Each line has data of
> : > > maximum 10 characters.
> : > > I want to eliminate duplicate lines and blank lines out of that file.
> : > > i.e. something like sort -u in unix.
> : >
> : > Got plenty of memory? =o)
> : >
> : > open IN, $file or die $!;
> : > my %uniq;
> : > while(<IN>) {
> : > $uniq{$_}++;
> : > }
> : > print sort keys %uniq;
> : >
> : how about you?
> :
> : open FH, "lines.txt" or die $!;
> : my %uniq;
> : map{$uniq{$_}=1 and print $_ unless $uniq{$_} }<FH>;
> :
> : :o))
>
> While this is fun and amusing, it's not being very helpful. At least
> provide an easy to understand explanation of what your code is doing.
>
OK:
map iterates over a list, placing each element of the list in a locally
scoped $_, so the code between the curly brackets is executed once for each
element of the list.
<FH> returns a list of all the lines in the file when called in list context,
which means the whole file is read into memory before map starts.
The "unless $uniq{$_}" modifier is checked first: the code
"$uniq{$_}=1 and print $_" is executed only if $uniq{$_} does not already
contain a true value.
So for each line in the file, $uniq{$_} is set to 1 and the line is printed,
unless $uniq{$_} is already true, which means the line has been printed
before.
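
For comparison, here is a minimal sketch of the same idea written as a plain
while loop, which reads one line at a time instead of building the whole list
first. It also skips blank lines, which the original question asked for and
the one-liner does not do. The helper name unique_lines and the in-memory
sample data are my own assumptions for illustration; they are not from the
code above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# unique_lines is a hypothetical helper (not from the thread).
# It returns the non-blank lines read from $fh, in first-seen order,
# with duplicates removed and trailing newlines stripped.
sub unique_lines {
    my ($fh) = @_;
    my (%seen, @out);
    while (my $line = <$fh>) {
        chomp $line;
        next if $line =~ /^\s*$/;   # skip blank or whitespace-only lines
        next if $seen{$line}++;     # post-increment is false only the first time
        push @out, $line;
    }
    return @out;
}

# Demo on an in-memory filehandle, so the sketch runs without lines.txt.
my $data = "apple\n\nbanana\napple\n\ncherry\nbanana\n";
open my $fh, '<', \$data or die "Cannot open in-memory data: $!";
print "$_\n" for unique_lines($fh);   # prints apple, banana, cherry
close $fh;
```

Unlike "sort keys %uniq", this keeps the lines in the order they first appear.
If sorted output is wanted, as with sort -u, wrap the call in sort.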