--- David Blevins <[EMAIL PROTECTED]> wrote:
> Here is a one-liner I just wrote to delete duplicate lines in a file.
>
> perl -ni.bak -e 'unless ($last eq $_){print $_};$last=$_;' theFile
That requires that it be sorted, doesn't it?
> Going with the TMTOWTDI credo, I was just curious if anyone knew of a
> better way.
For *S*M*A*L*L* files, I use a hash.
perl -ni.bak -e 'unless ($hit{$_}){$hit{$_}++,print $_}' theFile
That preserves the original order (keeping the first occurrence of each
line), but eats a lot of memory if the file is big, since every unique
line is held in the hash.
If you don't want to preserve the order, try "sort -u theFile"
( though that isn't Perl =o)
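For comparison, `sort -u` collapses all duplicates regardless of position, but emits the lines in sorted order rather than input order (illustrative data once more):

```shell
printf 'b\na\nb\n' > theFile
sort -u theFile    # "a" then "b": fully deduplicated, but sorted
```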