On 2006-05-19, Paul McGuire <[EMAIL PROTECTED]> wrote:
>> If the log has a lot of repeated lines in its original state then
>> running uniq twice, once up front to reduce what needs to be sorted,
>> might be quicker?
>>
>> uniq log_file | sort | uniq | wc -l
>>
>> - Pad.
>
> Why would the second running of uniq remove any additional lines that
> weren't removed in the first pass?

Because uniq only removes _adjacent_ identical lines.
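A quick demonstration of the adjacency point, using printf to stand in
for a tiny log file (assuming GNU/POSIX uniq, sort and wc):

    $ printf 'a\na\nb\na\n' | uniq
    a
    b
    a
    $ printf 'a\na\nb\na\n' | uniq | sort | uniq | wc -l
    2

The first uniq collapses only the adjacent pair of "a" lines, so a lone
"a" survives further down; the sort | uniq stage is still needed to fold
that one in and get the correct count of 2 distinct lines. The up-front
uniq just shrinks the input the sort has to handle.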
> For that matter, if this is a log file, won't every line have a timestamp,
> making duplicates extremely unlikely?

Probably.

-- 
Grant Edwards               grante at visi.com
Yow! If our behavior is strict, we do not need fun!