-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

...and Doug Hughes! I meant to say, Richard Chycoski & Doug Hughes.

/me shuffles off to take a remedial typing class

Cheers,
- --Trey

Quoth Trey Darley [06/29/2010 03:05 PM] :
> I'm a bit embarrassed that it's taken me so long to respond to this thread
> wherein I received many helpful suggestions. For the record, at least for
> my use case, Richard Chycoski's answer was the winner, which is to say that
> tail+head significantly trumps sed (0.014s versus 0.220s real time in his
> timings).
> 
> Many thanks to all but especially to Richard!
> 
> Cheers,
> --Trey
> ++----------------------------------------------------------------------------++
> Trey Darley - Brussels
> mobile: +32/494.766.080
> ++----------------------------------------------------------------------------++
> Quis custodiet ipsos custodes?
> ++----------------------------------------------------------------------------++
> 
>> When I time sed for this (using a ~250K line log file as input), I get:
>>
>> time sed -n 10000,13000p mysqld.log > /tmp/file1
>>
>> real    0m0.220s
>> user    0m0.156s
>> sys     0m0.044s
>>
>> When I use tail and head, I get:
>>
>> time ( tail -n+10000 mysqld.log | head -n 3001 > /tmp/file2 )
>>
>> real    0m0.014s
>> user    0m0.004s
>> sys     0m0.010s
>>
>> So at least as far as CPU usage is concerned, I'd go with head and tail.
>> Using sed may have more overhead because its primary function is not to
>> split files in this way.
>>
>> - Richard
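
For reference, the `head -n 3001` in the timing above comes from the inclusive line count: 13000 - 10000 + 1 = 3001. A minimal sketch of the tail+head approach as a reusable shell function follows. The name `extract_range` and its argument order are hypothetical, not something from the thread.

```shell
#!/bin/sh
# extract_range START END FILE
# Hypothetical helper: print lines START..END (inclusive) of FILE
# using only tail and head.
extract_range() {
    start=$1
    end=$2
    file=$3
    # Inclusive range, so the number of lines wanted is end - start + 1.
    count=$((end - start + 1))
    # tail -n +N starts output at line N; head then stops after COUNT lines.
    tail -n +"$start" "$file" | head -n "$count"
}
```

Called as `extract_range 10000 13000 mysqld.log > /tmp/file2`, this should be equivalent to the tail and head pipeline timed above.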
>>
>> Trey Darley wrote:
>>> Say you've got a simple ASCII text file, say, 250,000 lines long. Let's
>>> say it's a logfile. Suppose that you wanted to access an arbitrary range
>>> of lines, say, lines 10,000 - 13,000. One way of doing this is:
>>>
>>> <snip>
>>> sed -n 10000,13000p foobar.txt
>>> </snip>
>>>
>>> Trouble is, the target systems I need to exec this on are ancient and
>>> don't take very kindly to the I/O hammering this delivers. Can you
>>> suggest a better way of achieving this?
>>>
>>> As these *are* logs I'm dealing with I have already implemented rotation
>>> frequency. That helps, but I'm still facing performance issues. I'm
>>> specifically looking for input as to whether my sed-based approach can
>>> be improved.
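
One possible tweak to the sed approach, sketched under the assumption that part of the cost comes from sed reading on to end of file: `sed -n 10000,13000p` keeps scanning every remaining line after it has printed the range, whereas adding a `q` command at the last wanted line makes it stop there. The function name `extract_range_sed` is hypothetical, and whether this beats tail+head on the systems in question was not measured in the thread.

```shell
#!/bin/sh
# extract_range_sed START END FILE
# Hypothetical helper: print lines START..END (inclusive) of FILE with sed,
# quitting at line END instead of reading the rest of the file.
extract_range_sed() {
    start=$1
    end=$2
    file=$3
    # -n suppresses automatic printing; "START,ENDp" prints the range;
    # "ENDq" quits once line END has been processed.
    sed -n "${start},${end}p;${end}q" "$file"
}
```

For the case in the thread this would be called as `extract_range_sed 10000 13000 foobar.txt`.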
>>>
>>> Because of extremely limited network bandwidth, pulling the files off the
>>> wire and processing them on more studly hardware isn't an option. Also,
>>> I cannot install any binaries on these remote systems. The standard POSIX
>>> toolkit is all I've got to work with. :-/
>>>
>>> Many thanks, y'all!
>>> --Trey
>>>
>>>
>>>
>>> _______________________________________________
>>> Discuss mailing list
>>> Discuss@lopsa.org
>>> http://lopsa.org/cgi-bin/mailman/listinfo/discuss
>>> This list provided by the League of Professional System Administrators
>>>  http://lopsa.org/
>>>
>>
>>

- -- 
++----------------------------------------------------------------------------++
Kingfisher Operations
Trey Darley - Principal
landline: +1 / 404.455.1516
mobile: +32/494.766.080
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkwqxi0ACgkQQXaSM49tivATgQCfUvFyi7+2blzSlDh8uzCvK9Zg
zyUAn0vHc1SGdU/UsO6euvNQe6HYR3xQ
=+vNY
-----END PGP SIGNATURE-----