Milan Jurik wrote:
> On Thu, 09. 07. 2009 at 15:21, Sean McGrath wrote:
> > With the coming ksh93 update 2 and it replacing several commands
> > like wc, tail, head, join etc., there's a need to have a benchmark
> > to measure at least before and after the ksh93 update 2 change.
> >
> > Roland and I were talking on irc last night about this. We'll need
> > to figure out a decent method of benchmarking these commands.
>
> How is it possible that Roland discovers the responsible people
> every time? :-)

Well... part of the secret is that I use a komodo dragon (preferably a
hungry one), a whip (wet, with salt) and a small egg... that way you
can get any information out of people (yes, yes, it's
cruel&&unusual) ... =:-)

> > So within the next few days we hope to work out a method for
> > benchmarking ksh93.
> > This hopefully is the start of that discussion, rather than blindly
> > writing ad hoc timing scripts..
> >
> > One way, suggested by Roland could be:
> >
> > cmd = mkdir:
> >
> > timex ksh93 -c 'rmdir "xyz" >/dev/null ; \
> >     for ((i=0 ; i < 1000 ; i++)) ; do /bin/mkdir -p "xyz" ; done'
> >
> > that would benchmark the on disk mkdir. To use the builtin ksh93's mkdir,
> > just remove the '/bin/'
> >
> > timex ksh93 -c 'rmdir "xyz" >/dev/null ; \
> >     for ((i=0 ; i < 1000 ; i++)) ; do mkdir -p "xyz" ; done'
>
> Do not test it as ksh93 command, but through the wrapper. So not ksh93
> -c 'tail', but /usr/bin/tail. That is the real impact.

Erm... that's not 100% correct. The test matrix should look like this:

  [ old-version, new-version, ksh93-builtin-command ] * [ C-locale, multibyte-locale ]

Explanation of terms:
- "old-version" means the old versions of the commands
- "new-version" means the new versions of the commands
- "ksh93-builtin-command" means running the loop within a ksh93 shell
  using plain command names [1] [2]
- "C-locale" means something like
  $ LC_ALL=C ./test-script #
- "multibyte-locale" means something like
  $ LC_ALL=en_US.UTF-8 ./test-script #
- this is needed since the tools sometimes have different codepaths for
  single-byte locales (like "C") and multibyte locales (like
  "en_US.UTF-8" or "ja_JP.PCK")

[1]=(this is important to measure the impact for OpenSolaris/Indiana
where the default system shell is ksh93 (e.g. /usr/bin/sh, /sbin/sh,
/usr/bin/ksh, /usr/bin/ksh93 are all ksh93))
[2]=Note that a POSIX-conformant shell (like ksh93) will only use
builtin commands if you use the command name (e.g. "mkdir") and not the
full path (e.g.
"/usr/bin/mkdir"). Or better: Using the full path makes sure the shell always uses the non-builtin command from /usr/bin/ > > Another method, using the above example could be to see how many times > > mkdir got called in a given time period. > > The same amount of commands is good enough. Probably several times. > > > Other than basic benchmarking the environment too can be measured, i.e. > > the locale can have an impact, e.g. LC_ALL=C and LC_ALL=en_US.UTF-8 > > +1 Right - see test matrix above... > > So too to be looked at is the datasize used with commands, eg > > tail -X on a large or small file. Small being about 256k or so and > > large being at least 1GB. > > +1 > > File bigger than RAM should be good. BTW: Some notes: - "tail" _may_ now be a bit slower since it no longer uses |mmap()| (which was one of the root causes for crashes (e.g. if the underlying file shrinks while "tail" reads it)) - some commands like "join" should be faster now since it uses |mmap()| (but we have an option to turn this behaviour off to avoid running into the issue described with "tail" above) - command startup time may be slightly higher since we now depend on two more libraries (e.g. libcmd, libast) which need to be looked-up&&loaded. This should be a bit compensated by the detail that the AST tools are tuned more for large amounts of data - please use tmpfs (e.g. /tmp) for reading/writing from/to files to avoid getting noise from the disk I/O system > > For starters is there a definite list of those command we'd want to > > look at ? i.e. those being replaced by ksh93. > > I think the the list is definitive and you can find it here (in Notes): > > http://www.opensolaris.org/os/project/ksh93-integration/downloads/2009-07-02/ > > Optimal thing would be to test not only those which are replaced now, > but also those which are already replaced and updated by this update. > > Only usr/bin/print is new command, so we do not need to test it. 
>
> For testing all internal ksh93 commands, I would say no for now. It can
> be separate project, to do complete ksh93 benchmarking. But we should
> concentrate on update 2 for now.

Well, it's not meant as a "ksh93 benchmark" (since it doesn't cover any
special shell features like string processing, math, array operations
etc.) - the idea was to figure out the impact on OpenSolaris/Indiana
where the use of builtin commands in the default system shell has direct
impact on system performance (e.g. at _least_ fewer |fork()|+|exec()|
calls).

----

Bye,
Roland

-- 
  __ .  . __
 (o.\ \/ /.o) roland.ma...@nrubsig.org
  \__\/\/__/  MPEG specialist, C&&JAVA&&Sun&&Unix programmer
  /O /==\ O\  TEL +49 641 3992797
 (;O/ \/ \O;)
_______________________________________________
perf-discuss mailing list
perf-discuss@opensolaris.org