RE: HELP!

Bob Showalter Wed, 01 May 2002 06:57:54 -0700

> -----Original Message-----
> From: Gary Stainburn [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, May 01, 2002 10:53 AM
> To: Bob Showalter
> Cc: Perl Help
> Subject: Re: HELP!
> 
> 
> On Wednesday 01 May 2002 2:26 pm, Bob Showalter wrote:
> [snip]
> > > {$array{$_}++} foreach @array;
> >
> > They are the same as far as the effect on %array. But you
> > shouldn't use map() in a void context, as it constructs a
> > result list which is then thrown away. So your iterative
> > approach is better.
> >
> > Better still IMO is:
> >
> >    { my %seen; @array = grep { !$seen{$_}++ } @array; }
> 
> So, this foreach's @array, passing $_ through grep, which looks to
> see if it's contained in %seen, returning $_ if it isn't - i.e. on
> first seeing $_. It performs the actions $seen{$_}++ in the process.
> 
> right?


Yep!

grep *evaluates* the block once for each element of the list
after the block. $_ is set to the current list element as
the block is being evaluated. grep *returns* a list consisting
of those elements in the input list for which the block
evaluates as "true".
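
As a minimal sketch of those two points (the @numbers list and the
@even name are made up purely for illustration):

```perl
use strict;
use warnings;

# Hypothetical input list, for illustration only.
my @numbers = (1, 2, 3, 4, 5, 6);

# The block is evaluated once per element, with $_ set to that element.
# grep returns the elements for which the block evaluated as true.
my @even = grep { $_ % 2 == 0 } @numbers;

print "@even\n";   # prints "2 4 6"
```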

So, for each element of the list, we evaluate:

   !$seen{$_}++

The expression $x++ increments $x but returns the value of
$x before it was incremented (unlike ++$x, which returns the
value after incrementing).
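
A quick way to see the difference for yourself (the variable names
here are just placeholders):

```perl
use strict;
use warnings;

my $x = 5;
my $post = $x++;   # $post gets 5 (the old value); $x is now 6

my $y = 5;
my $pre = ++$y;    # $pre gets 6 (the new value); $y is now 6

print "post=$post x=$x pre=$pre y=$y\n";   # prints "post=5 x=6 pre=6 y=6"
```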

So, if the first element of the list is "cool", we have
$seen{cool}++, which sets $seen{cool} to 1, but returns
0: the entry did not exist yet, and Perl treats the undefined
value as 0 before incrementing.

The next time we see "cool", $seen{cool}++ will set
$seen{cool} to 2, and return 1 (value before incrementing).

So the values returned by $seen{cool}++ through the loop will be:

   First time:  0
   Second time: 1
   Third time:  2

and so forth.

But notice that the full expression is actually !$seen{cool}++,
which applies the logical negation operator to that returned
value. The values of this expression will be:

   First time:  !0  = 1 (true)
   Second time: !1  = '' (false)
   Third time:  !2  = '' (false)

and so forth. It will only be true the *first* time any
given value of $_ is seen.
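
If you want to watch this happen, here is a small trace that runs the
expression three times for the key "cool" and records both the value
returned by the post-increment and its negation. Note that Perl's
post-increment returns 0 rather than undef for a value that is not yet
defined; it is false either way, so the logic is unaffected. The array
names are only for illustration.

```perl
use strict;
use warnings;

my %seen;
my (@old_values, @negated);

for my $pass (1 .. 3) {
    my $old = $seen{cool}++;    # value before incrementing
    push @old_values, $old;
    push @negated, (!$old ? 'true' : 'false');
}

print "@old_values\n";   # prints "0 1 2"
print "@negated\n";      # prints "true false false"
```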

Now remember, grep returns a list of the elements in the
input list for which the expression was true. If the
expression is false, grep does not include that element
in the output list. This means that when "cool" is seen
for the first time, !$seen{cool}++ will be true, so "cool"
(the element from the input list) is added to the output
list. However, when "cool" appears again, !$seen{cool}++ will
be false (see above), so "cool" is not added to the output
list a second time.

The result is that the output list contains the first instance
of any value from the input list; i.e. duplicates are discarded.
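
Putting it together on a sample list (the contents of @array are made
up for the example):

```perl
use strict;
use warnings;

# Hypothetical input with repeats, for illustration only.
my @array = qw(cool beans cool hand beans cool);

{
    my %seen;
    # Keep an element only the first time it is encountered.
    @array = grep { !$seen{$_}++ } @array;
}

print "@array\n";   # prints "cool beans hand"
```

The original order of first appearances is preserved.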

The { my %seen; ... } construct is used to create a lexical
scope for the %seen hash to live in. This avoids conflicts with
any other %seen hash lying around and ensures that %seen is
empty prior to the grep(), which is necessary for it to work
properly.
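
To illustrate why the fresh %seen matters, this sketch deliberately
reuses one %seen across two passes and then repeats the second pass
with its own lexical hash. The lists and array names are invented for
the demonstration.

```perl
use strict;
use warnings;

my %seen;   # deliberately shared between two passes (the mistake)
my @first  = grep { !$seen{$_}++ } qw(a b a);
my @second = grep { !$seen{$_}++ } qw(a c);   # "a" is wrongly dropped

# With a fresh lexical %seen for this pass, "a" survives.
my @fixed;
{
    my %seen;
    @fixed = grep { !$seen{$_}++ } qw(a c);
}

print "@first | @second | @fixed\n";   # prints "a b | c | a c"
```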

BTW, this technique is found in the FAQs:

   perldoc -q 'How can I remove duplicate elements from a list or array?'

n.b. the last line:

   "But perhaps you should have been using a hash all along, eh?"
