...and Doug Hughes! I meant to say, Richard Chycoski & Doug Hughes.

/me shuffles off to take a remedial typing class

Cheers,
--Trey

Quoth Trey Darley [06/29/2010 03:05 PM] :
> I'm a bit embarrassed that it's taken me so long to respond to this thread
> wherein I received many helpful suggestions. For the record, at least for
> my use case Richard Chycoski's answer was the winner, which is to say that
> tail+head significantly trumps sed.
>
> Many thanks to all but especially to Richard!
>
> Cheers,
> --Trey
> ++----------------------------------------------------------------------------++
> Trey Darley - Brussels
> mobile: +32/494.766.080
> ++----------------------------------------------------------------------------++
> Quis custodiet ipsos custodes?
> ++----------------------------------------------------------------------------++
>
>> When I time sed for this (using a ~250K line log file as input), I get:
>>
>> time sed -n 10000,13000p mysqld.log > /tmp/file1
>>
>> real 0m0.220s
>> user 0m0.156s
>> sys 0m0.044s
>>
>> When I use tail and head, I get:
>>
>> time ( tail -n+10000 mysqld.log | head -n 3001 > /tmp/file2 )
>>
>> real 0m0.014s
>> user 0m0.004s
>> sys 0m0.010s
>>
>> So at least as far as CPU usage is concerned, I'd go with head and tail.
>> Using sed may have more overhead because its prime function is not to
>> split files in this way.
>>
>> - Richard
>>
>> Trey Darley wrote:
>>> Say you've got a simple ascii text file, say, 250,000 lines long. Let's
>>> say it's a logfile. Suppose that you wanted to access an arbitrary range
>>> of lines, say, lines 10,000 - 13,000. One way of doing this is:
>>>
>>> <snip>
>>> sed -n 10000,13000p foobar.txt
>>> </snip>
>>>
>>> Trouble is, the target systems I need to exec this on are ancient and
>>> don't take very kindly to the io hammering this delivers. Can you suggest
>>> a better way of achieving this?
>>>
>>> As these *are* logs I'm dealing with I have already implemented rotation
>>> frequency. That helps, but I'm still facing performance issues. I'm
>>> specifically looking for input as to whether my sed-based approach can be
>>> improved.
>>>
>>> Because of extremely limited network bandwidth pulling the files off the
>>> wire and processing them on more studly hardware isn't an option. Also, I
>>> cannot install any binaries on these remote systems. Standard POSIX
>>> toolkit is all I've got to work with. :-/
>>>
>>> Many thanks, y'all!
>>> --Trey
>>>
>>> _______________________________________________
>>> Discuss mailing list
>>> Discuss@lopsa.org
>>> http://lopsa.org/cgi-bin/mailman/listinfo/discuss
>>> This list provided by the League of Professional System Administrators
>>> http://lopsa.org/

-- 
++----------------------------------------------------------------------------++
Kingfisher Operations
Trey Darley - Principal
landline: +1 / 404.455.1516
mobile: +32/494.766.080
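
For anyone landing on this thread later, here is a minimal sketch of the two approaches discussed above, wrapped as POSIX shell functions. The function names range_tail_head and range_sed_quit are made up for illustration and do not come from the thread. The second function is a variation nobody timed here: it tells sed to quit at the last wanted line instead of reading on to end of file, so treat any speed-up as something to measure on your own systems.

```shell
# Hypothetical helpers (names are not from the thread).
# Both print lines START..END (inclusive) of FILE using only POSIX tools.

# tail skips ahead to line START; head stops after END-START+1 lines.
range_tail_head() {
    start=$1
    end=$2
    file=$3
    count=$((end - start + 1))
    tail -n "+$start" "$file" | head -n "$count"
}

# sed prints the range, then quits at line END rather than
# scanning the remainder of the file.
range_sed_quit() {
    start=$1
    end=$2
    file=$3
    sed -n -e "${start},${end}p" -e "${end}q" "$file"
}
```

Usage would look like `range_tail_head 10000 13000 mysqld.log > /tmp/file2`, which should match the tail/head pipeline Richard timed (3001 lines, since the range is inclusive).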