I've found an odd performance issue that I cannot explain.  I'm using
socketpairs to communicate with multiple rfork(RFPROC) processes.
Initially, I used a separate socketpair to communicate requests to each
process, with locking in the parent to synchronize access to each client.
I then realized that by switching to a single shared socketpair, I could
avoid creating one per process, and perhaps also improve performance by
allowing more requests to be dispatched than there were processes to
handle them.  Whenever a worker process finished one request, it could
immediately start the next, without having to wait for the parent to
receive the previous response and set up a new request.
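
Roughly, the shared-socketpair setup looks like the sketch below.  It is
not the attached commtest.c - NWORKERS, NREQS, and the bare-int
"requests" are placeholders for illustration - but it has the same
shape: every worker blocks in read() on the same descriptor, and the
kernel decides which one gets each request.  Note that with RFPROC
alone the descriptor table is shared, so nobody closes a descriptor
until the parent is done with it.

#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <err.h>
#include <unistd.h>

#define NWORKERS 4
#define NREQS    100000

int
main(void)
{
    int fds[2], i, req;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) == -1)
        err(1, "socketpair");

    for (i = 0; i < NWORKERS; i++) {
        switch (rfork(RFPROC)) {    /* no RFFDG: fd table stays shared */
        case -1:
            err(1, "rfork");
        case 0:
            /*
             * Worker: all children block in read() on the same
             * descriptor; the kernel picks which one gets each
             * fixed-size request.
             */
            while (read(fds[1], &req, sizeof(req)) == sizeof(req))
                ;   /* "handle" the request */
            _exit(0);
        }
    }

    /*
     * Parent: pump requests down the one pipe, then close the
     * write end.  With the shared table this closes it for the
     * workers too, so they read EOF once the buffer drains.
     */
    for (i = 0; i < NREQS; i++)
        if (write(fds[0], &i, sizeof(i)) != sizeof(i))
            err(1, "write");
    close(fds[0]);
    while (wait(NULL) > 0)
        ;
    return (0);
}

The fixed-size reads keep the stream framing trivial; a real request
structure would need the same care about message boundaries when
several processes read the same stream socket.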

Unfortunately, I've found that having a group of processes reading from a
group of socketpairs has better performance than having them all read from
a single socketpair.  I've been unable to determine why.  I've reduced
the problem to a simple program, included as an attachment (sorry about
that).  The results of two runs of the program:

ganja% time ./commtest --single
./commtest --single  0.00s user 0.66s system 15% cpu 4.132 total
ganja% time ./commtest --multi
./commtest --multi  0.00s user 0.46s system 68% cpu 0.675 total
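
For contrast, the --multi arrangement gives each worker its own
socketpair, with the parent dealing requests out round-robin.  Again,
this is an illustrative sketch rather than the real commtest.c:

#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <err.h>
#include <unistd.h>

#define NWORKERS 4
#define NREQS    100000

int
main(void)
{
    int fds[NWORKERS][2], i, req;

    for (i = 0; i < NWORKERS; i++) {
        if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds[i]) == -1)
            err(1, "socketpair");
        switch (rfork(RFPROC)) {
        case -1:
            err(1, "rfork");
        case 0:
            /*
             * Worker: the only reader on its own socketpair, so
             * there is no contention with its siblings.
             */
            while (read(fds[i][1], &req, sizeof(req)) == sizeof(req))
                ;   /* "handle" the request */
            _exit(0);
        }
    }

    /*
     * Parent: deal the requests out round-robin, then close every
     * write end so each worker reads EOF and exits.
     */
    for (i = 0; i < NREQS; i++)
        if (write(fds[i % NWORKERS][0], &i, sizeof(i)) != sizeof(i))
            err(1, "write");
    for (i = 0; i < NWORKERS; i++)
        close(fds[i][0]);
    while (wait(NULL) > 0)
        ;
    return (0);
}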

Note that in the --single case, the system time rises a bit - but the
wallclock time rises a _lot_.  At first I thought this was a variant of
the "thundering herd" problem, but the CPU times don't seem to bear
that out.

Any ideas?  Running under 3.2-RELEASE on an SMP machine, though I saw the
same results on 3.4-RELEASE.

Thanks,
scott

commtest.c
