Milan Jurik wrote:
> On Thu, 09. 07. 2009 at 15:21, Sean McGrath wrote:
> >  With the coming ksh93 update 2 and it replacing several commands
> >   like wc, tail, head, join etc.  There's a need to have a benchmark
> >   to measure at least before and after ksh93 update 2 change.
> >
> >  Roland and I were talking on irc last night about this.  We'll need
> >   to figure out a decent method of benchmarking these commands.
> 
> How is it possible that Roland discovers the responsible people
> every time? :-)

Well... part of the secret is that I use a komodo dragon (preferably a
hungry one), a whip (wet, with salt) and a small egg... that way you
can get any information out of people (yes, yes, it's cruel&&unusual)
... =:-)

> >  So within the next few days we hope to work out a method for
> >   benchmarking ksh93.
> >  This hopefully is a start of that discussion, rather than blindly writing
> >   ad hoc timing scripts.
> >
> >  One way, suggested by Roland could be:
> >
> >    cmd = mkdir:
> >
> >     timex ksh93 -c 'rmdir "xyz" 2>/dev/null ; \
> >         for ((i=0 ; i < 1000 ; i++)) ; do /bin/mkdir -p "xyz" ; done'
> >
> >    that would benchmark the on-disk mkdir. To use ksh93's builtin mkdir,
> >    just remove the '/bin/'
> >
> >     timex ksh93 -c 'rmdir "xyz" 2>/dev/null ; \
> >         for ((i=0 ; i < 1000 ; i++)) ; do mkdir -p "xyz" ; done'
> 
> Do not test it as a ksh93 command, but through the wrapper. So not ksh93
> -c 'tail', but /usr/bin/tail. That is the real impact.

Erm... that's not 100% correct. The test matrix should look like this:
[ old-version, new-version, ksh93-builtin-command ] * [ C-locale,
multibyte-locale ]

Explanation of terms:
- "old-version" means the old versions of the commands
- "new-version" means the new versions of the commands
- "ksh93-builtin-command" means running the loop within a ksh93 shell
using plain command names [1] [2]
- "C-locale" means something like $ LC_ALL=C ./test-script #
- "multibyte-locale" means something like
$ LC_ALL=en_US.UTF-8 ./test-script #
This is needed since the tools sometimes have different codepaths for
single-byte locales (like "C") and multibyte locales (like
"en_US.UTF-8" or "ja_JP.PCK")

[1]=(this is important to measure the impact for OpenSolaris/Indiana
where the default system shell is ksh93 (e.g. /usr/bin/sh, /sbin/sh,
/usr/bin/ksh, /usr/bin/ksh93 are all ksh93))
[2]=Note that a POSIX-conformant shell (like ksh93) will only use
builtin commands if you use the command name (e.g. "mkdir") and not the
full path (e.g. "/usr/bin/mkdir"). Or better: Using the full path makes
sure the shell always uses the non-builtin command from /usr/bin/
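
A minimal sketch of how the matrix above could be enumerated, one
command line per cell. OLD_BIN, NEW_BIN and CMD_PREFIX are hypothetical
names (they do not appear in this thread), and ./test-script is only the
placeholder used above. An empty prefix stands for "plain command name",
which per [2] is what lets ksh93 pick its builtin.

```shell
#!/bin/sh
# Hypothetical locations of the old and new command binaries.
OLD_BIN=${OLD_BIN:-/tmp/old/usr/bin}
NEW_BIN=${NEW_BIN:-/usr/bin}

# Map a matrix variant to the prefix the test script would put in front
# of each command name. Empty prefix = plain name = builtin candidate.
prefix_for() {
    case "$1" in
        old-version)           printf '%s/\n' "$OLD_BIN" ;;
        new-version)           printf '%s/\n' "$NEW_BIN" ;;
        ksh93-builtin-command) printf '\n' ;;
        *)                     return 1 ;;
    esac
}

# Print one command line per cell of the 3 x 2 matrix.
matrix() {
    for variant in old-version new-version ksh93-builtin-command ; do
        for locale in C en_US.UTF-8 ; do
            prefix=$(prefix_for "$variant") || return 1
            printf 'LC_ALL=%s CMD_PREFIX=%s ./test-script  # %s\n' \
                "$locale" "$prefix" "$variant"
        done
    done
}

matrix
```

The script only prints the six command lines; how ./test-script would
consume CMD_PREFIX is left open here.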

> >   Another method, using the above example could be to see how many times
> >   mkdir got called in a given time period.
> 
> The same amount of commands is good enough. Probably several times.
> 
> >   Other than basic benchmarking the environment too can be measured, i.e.
> >    the locale can have an impact, e.g. LC_ALL=C and LC_ALL=en_US.UTF-8
> 
> +1

Right - see test matrix above...

> >   Also to be looked at is the data size used with commands, e.g.
> >    tail -X on a large or small file.  Small being about 256k or so and
> >    large being at least 1GB.
> 
> +1
> 
> File bigger than RAM should be good.

BTW: Some notes:
- "tail" _may_ now be a bit slower since it no longer uses |mmap()|
(which was one of the root causes for crashes (e.g. if the underlying
file shrinks while "tail" reads it))
- some commands like "join" should be faster now since they use |mmap()|
(but we have an option to turn this behaviour off to avoid running into
the issue described with "tail" above)
- command startup time may be slightly higher since we now depend on two
more libraries (libcmd and libast) which need to be looked up and loaded.
This should be partly compensated by the fact that the AST tools are
tuned more for large amounts of data
- please use tmpfs (e.g. /tmp) for reading/writing from/to files to
avoid getting noise from the disk I/O system
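
For the file-based tests, a small helper along these lines could
generate the input files on tmpfs. This is only a sketch: make_testfile
is a hypothetical name, the 64-byte line length is an arbitrary choice,
and whether /tmp really is tmpfs depends on the system. The sizes in the
usage comments follow the "small is about 256k, large is at least 1GB"
suggestion quoted above.

```shell
#!/bin/sh
# make_testfile FILE SIZE_KB
# Fill FILE with SIZE_KB kilobytes of identical 64-byte text lines
# (63 characters plus newline, so 16 lines per kilobyte).
make_testfile() {
    file=$1
    size_kb=$2
    line=0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_
    yes "$line" | head -n "$((size_kb * 16))" > "$file"
}

# Usage (not run here):
#   make_testfile /tmp/small.txt 256        # about 256k
#   make_testfile /tmp/large.txt 1048576    # 1GB
#   LC_ALL=C timex tail -n 1000 /tmp/large.txt >/dev/null
```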

> >  For starters is there a definite list of those commands we'd want to
> >   look at ? i.e. those being replaced by ksh93.
> 
> I think the list is definitive and you can find it here (in Notes):
> 
> http://www.opensolaris.org/os/project/ksh93-integration/downloads/2009-07-02/
> 
> The optimal thing would be to test not only those which are replaced
> now, but also those which are already replaced and updated by this
> update.
> 
> Only /usr/bin/print is a new command, so we do not need to test it.
> 
> For testing all internal ksh93 commands, I would say no for now. It can
> be a separate project, to do complete ksh93 benchmarking. But we should
> concentrate on update 2 for now.

Well, it's not meant as a "ksh93 benchmark" (since it doesn't cover any
special shell features like string processing, math, array operations
etc.) - the idea was to figure out the impact on OpenSolaris/Indiana
where the use of builtin commands in the default system shell has a direct
impact on system performance (e.g. at _least_ fewer |fork()|+|exec()|
calls).
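
To make the |fork()|+|exec()| point measurable, a loop helper like the
following could be timed once with a plain command name and once with a
full path. It is a sketch only: run_n is a hypothetical name, and the
timing lines in the comments assume a shell with a "time" keyword (ksh93
has one) and /tmp/xyz as a scratch directory.

```shell
#!/bin/sh
# run_n COUNT CMD [ARGS...]
# Run CMD COUNT times in the current shell; stop at the first failure.
# Because CMD is looked up by the calling shell, a plain name may resolve
# to a builtin, while a full path runs the external binary.
run_n() {
    run_n_count=$1
    shift
    run_n_i=0
    while [ "$run_n_i" -lt "$run_n_count" ] ; do
        "$@" || return 1
        run_n_i=$((run_n_i + 1))
    done
}

# Usage (not run here):
#   time run_n 1000 mkdir -p /tmp/xyz            # plain name
#   time run_n 1000 /usr/bin/mkdir -p /tmp/xyz   # full path
```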

----

Bye,
Roland

-- 
  __ .  . __
 (o.\ \/ /.o) roland.ma...@nrubsig.org
  \__\/\/__/  MPEG specialist, C&&JAVA&&Sun&&Unix programmer
  /O /==\ O\  TEL +49 641 3992797
 (;O/ \/ \O;)
_______________________________________________
perf-discuss mailing list
perf-discuss@opensolaris.org
