Comments made inline...
On 10/21/06, Ralf Wildenhues <[EMAIL PROTECTED]> wrote:
> Well. In fact I have a working patch on my disk that separates link mode into another scriptlet, libtool-link. It speeds up script execution on my GNU/Linux by about 20%. I can post it if people desire, but I'm against applying it before 2.0, because I don't think the improvement is worth possible issues now.
Ah, good thinking. I was expecting that there could be some gain if the script being executed is more specific to the desired task and thus more compact.
> > I understand that you are strongly against using forks
> > where the equivalent can be coded without it.
>
> Not so. Forking in order to invoke faster tools or to reduce the complexity order can speed up things significantly, as you also pointed out. We are, however, against any deliberate changes that do not provably improve things.
That is where it is difficult. Three forks might be a few milliseconds slower than doing the work in bash when the loop has only a handful of iterations, but when the iteration count is large, the three forks will be faster. Therefore, no one can flatly state that the revised method is faster than the traditional method: in some cases it is dramatically faster, in others mildly slower. The question is whether we can reach consensus that making the code slightly slower for the simplest cases and vastly faster for the complex cases is better than leaving it as it is, where it is fastest for the simplest cases and vastly slower for the complex ones. Do we leave it optimized for the best-case scenario, or optimize it for the worst-case scenario?
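To make the tradeoff concrete, here is a rough sketch of the two styles for rewriting a list of object names (the variable names are invented for illustration, and the expected timings are only my expectation, not measurements):

    # pure-shell version: no forks at all, but the work (and the
    # "snowballing" $newlist variable) grows with every iteration
    newlist=
    for f in $objlist; do
      newlist="$newlist ${f%.lo}.o"
    done

    # pipeline version: roughly three forks total, regardless of how
    # many objects are in $objlist
    newlist=`echo "$objlist" | tr ' ' '\n' | sed 's/\.lo$/.o/' | tr '\n' ' '`

For a two-object list the first form probably wins by a few milliseconds; for a few thousand objects the second form should win by a wide margin.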
> > I would like to know more about how costly forks are
> > on the platforms where it is seriously slow, please.
> > 2432 invocations of /bin/sed during execution of libtool
> > has got to slow it like a sucking chest wound.
>
> GNU/Linux on current hardware can do these in a fraction of a second.
Yes... that is why an average cost of three forks to construct each variable list is obviously better on glibc/Linux than complex, slow-executing shell code that may run an unknown number of sed invocations inside a loop.
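That loop-of-seds pattern versus a single pass looks roughly like this (a sketch with invented variable names, not code lifted from ltmain):

    # one fork per element: every object spawns its own sed just to
    # strip the directory part
    bases=
    for obj in $objs; do
      base=`echo "$obj" | sed 's%^.*/%%'`
      bases="$bases $base"
    done

    # constant number of forks: one sed pass over the whole list
    bases=`echo "$objs" | tr ' ' '\n' | sed 's%^.*/%%' | tr '\n' ' '`

The first form is where counts like 2432 sed invocations come from; the second keeps the fork count flat no matter how long the link line gets.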
> > Here is my conjecture...
>
> We need code, not conjectures. Expect to have it ripped apart or rejected for portability issues, and to be asked for test cases for any changes that are not obviously semantics-preserving. But it is really not possible to judge optimization issues based on talk.
In this case an hour of discussion may be worth more than ten hours of coding. The revised code would be concise and, despite the forks, would execute fast. I would not want anyone, not even myself, to put forth the effort to write the code if there is no chance of having it accepted. The speed gain from converting some of the complex shell code to piped coreutils commands is so obvious when working with large variables or lists that I can only assume there has been strong opposition that has prevented it from being done already. I hope that statement is not misinterpreted as arrogant. When processing data I tend to do so in pipes, so I am accustomed to the excellent speed gains that external commands piped together give me over for-do loops. For example, the shell code for identifying matching words in a string with a case statement is clever, but slow compared to using sort | uniq -d when doing a significant number of comparisons. Obviously the consensus until now has been to keep using the shell code, but after looking at a trace of a libtool execution I am trying to figure out why it should stay that way.
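As a sketch of that particular comparison (hypothetical list names; assuming neither list repeats a word within itself), finding the words common to two lists looks like this in the two styles:

    # case-statement idiom: scan all of $list2 once for every word in
    # $list1, entirely in the shell
    matches=
    for word in $list1; do
      case " $list2 " in
        *" $word "*) matches="$matches $word" ;;
      esac
    done

    # pipeline idiom: three or four forks total; words appearing in
    # both lists come out of uniq -d, one per line
    matches=`{ echo "$list1"; echo "$list2"; } | tr ' ' '\n' | sort | uniq -d`

For a handful of words the case loop is probably quicker; for the hundreds of deplib entries a big link line can carry, the pipeline should win easily.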
> > Do you suppose anyone but me would agree that optimizing libtool
> > for best performance on large tasks is better than having libtool
> > optimized for best performance on simple tasks and horrifically
> > slow performance on large tasks?
>
> Definitely, yes.
Ah, good. Then we both agree that the forks from the use of sed, grep, sort, and uniq are acceptable costs for making the execution speed on the more complex tasks equivalent to the execution speed on the simple tasks.
> > libtool is currently optimized for very small and easy tasks.
>
> Not necessarily. I'd rather say most parts have never seen much optimization.
I do not want to be harsh based on a first glance. However, some if statements sit inside loops where the outcome of the test is the same on every iteration. That code is clearly not optimized; otherwise the if statement would be on the outside, with separate loops in the "then" and "else" branches. But such trivial optimizations would not yield the performance increase of eliminating variable snowballing and whittling, eliminating the use of case statements to find duplicates, and eliminating the complex, slow-executing bash code in places where an average of three forks could accomplish the same task. So I can understand if statements inside loops being overlooked. I would not suggest revising and optimizing libtool if the effort did not gain at least an order of magnitude of speed. After all, what is the point if there is no visibly measurable improvement? I want to see lengthy invocations of libtool scroll so fast that I cannot begin to read them, even on a 233 MHz Pentium. :)
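For what it's worth, the loop-invariant if pattern I mean is this kind of thing (an invented example with a made-up flag, not a specific spot in ltmain.sh):

    # before: the test is re-evaluated on every pass, even though
    # $rewrite_names (hypothetical) cannot change inside the loop
    newobjs=
    for obj in $objs; do
      if test "$rewrite_names" = yes; then
        newobjs="$newobjs ${obj%.lo}.o"
      else
        newobjs="$newobjs $obj"
      fi
    done

    # hoisted: decide once, then run a dedicated loop for each branch
    newobjs=
    if test "$rewrite_names" = yes; then
      for obj in $objs; do
        newobjs="$newobjs ${obj%.lo}.o"
      done
    else
      for obj in $objs; do
        newobjs="$newobjs $obj"
      done
    fi

It is a small win on its own, which is why I would rather spend the effort on the fork-based rewrites above.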