In the message dated: Tue, 18 Nov 2014 21:42:16 -0800,
The pithy ruminations from Tracy Reed on 
<Re: [lopsa-tech] What programs do sysadmins write?> were:
=> On Tue, Nov 18, 2014 at 06:50:24PM PST, Mark McCullough spake thusly: >

[SNIP!]

=> 
=> > It's also much faster than bash.
=> 
=> I can't think of a single time in over 20 years of using bash that the
=> execution speed of bash code has made the slightest difference.

HPC schedulers (Sun GridEngine and the many derivatives since Oracle took
over, perhaps others like Torque, Maui) allow system admins to write
programs, typically scripts, to verify jobs before they are submitted
to the cluster. These Job Submission Verifiers (JSVs) are a real-world
example of the tremendous performance difference between shell scripts
(*sh languages) and other languages (perl, python, tcl, go).

A JSV is called for each job submitted by a user, and typically checks
the job parameters, supplies default scheduler parameters, and adjusts
some user-selected options. In our HPC environment, it's not uncommon
for a user to submit 5,000 to 20,000 jobs in a loop, and we've got a small
HPC cluster compared to other places. Each of those jobs must be verified
by the JSV before the scheduler places it in the queue. The scheduler has
a default timeout for the JSV of 20 seconds -- it must return (verifying,
modifying, or rejecting the proposed job) in that time, or the job will
not be accepted.
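
To make the three outcomes concrete, here is a minimal sketch of the kind
of per-job logic a JSV performs. It is not the Grid Engine JSV interface:
the function name, parameter names, and limits below are all hypothetical,
chosen only to show the check / default / adjust steps described above.

```python
# Hypothetical site policy; none of these names come from Grid Engine itself.
DEFAULT_RUNTIME_SECONDS = 3600
MAX_SLOTS = 64


def verify_job(params):
    """Return (verdict, params); verdict is 'accept', 'modify', or 'reject'."""
    checked = dict(params)  # work on a copy, leave the caller's dict alone
    modified = False

    # Check the job parameters: reject a job with no command to run.
    if not checked.get("command"):
        return "reject", checked

    # Supply a default scheduler parameter the user left out.
    if "runtime_seconds" not in checked:
        checked["runtime_seconds"] = DEFAULT_RUNTIME_SECONDS
        modified = True

    # Adjust a user-selected option that exceeds the (assumed) site limit.
    if checked.get("slots", 1) > MAX_SLOTS:
        checked["slots"] = MAX_SLOTS
        modified = True

    return ("modify" if modified else "accept"), checked


if __name__ == "__main__":
    print(verify_job({"command": "sim.sh", "slots": 128}))
```

The work per job is tiny; the point is that it runs once for every one of
those thousands of submissions, so any fixed per-call overhead adds up.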

With a JSV written in bash, the performance under heavy load was so poor
that the 'verifier' step would often time out, preventing jobs from being
submitted to the cluster.

Switching the JSV from bash to perl, with no particular effort in
optimizing the perl, resulted in a speedup of ~20x. This had a noticeable
effect, reducing the load on the cluster and eliminating the timeouts.
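
I didn't profile where the bash JSV spent its time, so treat this as an
assumption: one likely contributor is that shell scripts start an external
process (sed, awk, etc.) for small text-handling steps that perl or python
do in-process. A rough sketch for measuring that kind of gap on your own
machine follows. The function names are made up for illustration, and the
child process here is a Python interpreter, which is heavier to start than
sed or awk, so the numbers only show the shape of the difference.

```python
import subprocess
import sys
import time


def time_in_process(n):
    """Time n small string operations done inside this interpreter."""
    start = time.perf_counter()
    for i in range(n):
        _ = f"job-{i}".upper()
    return time.perf_counter() - start


def time_external(n):
    """Time n launches of a trivial child process (stand-in for sed/awk)."""
    start = time.perf_counter()
    for _ in range(n):
        subprocess.run([sys.executable, "-c", "pass"], check=True)
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 20
    print(f"in-process: {time_in_process(n):.6f}s for {n} operations")
    print(f"external:   {time_external(n):.6f}s for {n} process launches")
```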

See:

https://blogs.oracle.com/templedf/entry/performance_considerations_for_jsv_scripts

http://gridengine.eu/index.php/grid-engine-internals/194-client-side-jsv-performance-comparison-and-the-winner-is-go-golang-2014-03-15

This might be a cautionary tale about the use of *sh scripts as 'daemons',
rather than one-time processes.



In answer to the original query, based on what I've written in the past
year that's been formal enough to document, commit to our internal version
control repository, etc., I've written about 35 scripts in bash (calling
sed & awk frequently) & perl, and a little Tcl/Tk. If I've [re]written
a bash script a couple of times, or it's gotten to be more than a few
screenfuls of code, that's a signal that perhaps it'd be better in perl or
something else. About once or twice a year I'll write a wrapper in C,
almost always to run setuid, supplying fixed or sanitized inputs to an
existing program.
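
The "fixed or sanitized inputs" part of such a wrapper can be sketched as
below. This shows only the input handling, not the setuid part: Linux
generally ignores the setuid bit on interpreted scripts, which is one
reason the real wrappers are in C. The target path, option name, and
allowed-argument pattern are placeholders, not taken from any real wrapper.

```python
import re

# Hypothetical target program and argument policy, for illustration only.
TARGET = "/usr/local/sbin/example-tool"
SAFE_NAME = re.compile(r"[a-z][a-z0-9_-]{0,31}")


def build_command(user_arg):
    """Return a fixed argv for TARGET, or raise ValueError on unsafe input."""
    # fullmatch requires the whole string to fit the allowlist pattern.
    if not SAFE_NAME.fullmatch(user_arg):
        raise ValueError(f"refusing unsafe argument: {user_arg!r}")
    # Options are fixed by the wrapper; only the validated name is
    # supplied by the user.
    return [TARGET, "--name", user_arg]


def clean_environment():
    """Return a small fixed environment instead of the caller's own."""
    return {"PATH": "/usr/bin:/bin", "LANG": "C"}


if __name__ == "__main__":
    print(build_command("alice"), clean_environment())
```

A real wrapper would then replace itself with the target, for example via
os.execve(TARGET, argv, env), rather than passing anything through a shell.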

My software is usually 10 to 250 LOC (or ~3x more lines when comments,
formatting, whitespace, etc. are included).

About 3/4 of my code is intended for system administrator use only,
whether from the command-line or via cron, and about 1/4 is for end-users.

Mark

=> 
=> -- Tracy Reed
=> 
_______________________________________________
Tech mailing list
Tech@lists.lopsa.org
https://lists.lopsa.org/cgi-bin/mailman/listinfo/tech
This list provided by the League of Professional System Administrators
 http://lopsa.org/