Re: IO::Multiplex enhancement

Douglas Webb Sat, 01 Jun 2002 14:38:13 -0700

Cool... I've had to go further, though.

Using the version with my patch, I'm able to easily handle 1500 hits/sec spread over 30 already-connected sockets.
I've got about 50 hits/sec coming from each of those sockets (a limitation I've hit too... I've got a usleep call between each write to the socket, but it doesn't seem able to sleep less than 0.02 seconds.)
This is on a 40-cpu Sun box running Solaris 2.8. I'm not sure of the model number, but it's typical of my companies production server farm.

Here's the problem... I need 1000 sockets sending 1 hit/sec, not 30 sockets sending 50 hits/sec. In the former case, the cpu usage skyrockets, presumably because IO::Multiplex is using select instead of poll. So, I went ahead and switched to IO::Poll, after upgrading to perl 5.6.1, and like my previous change I found that not a whole lot had to be modified.

At this point I'm still performance testing my changes. I need to look at the timeout function, and how it's implemented. There is a loop over every filehandle to check for timeouts that need to be executed, and that loop appears to be consuming about a third of the cpu time. That's really bad, since I've only got a single timeout set on my listen socket. I'm pretty confident that I can find a more efficient way to deal with timeouts; I should have that done next week. I'll send you another diff when I get it.

Doug.

Rob Brown wrote:

Perl modules dudes:

We are having big problems with this LIRAZ guy who still will
not release control over IO::Multiplex because he cannot be
contacted.  He is not involved with this version of IO::Multiplex
either.  Can you reassign cpan author from LIRAZ to BBB for me?
Or how can we fix this situation?


Douglas:

Looks fantastic.  I'm reviewing your patch now.  I've only been
able to get around 50 connections/sec because of some large CPU
overhead.  I'm starting to maintain it a little more, but I'm
not the original author.  Bruce is.  I think your patch will help.
You should see it in version 1.04.  I'll let him know, too.

Thanks,

Rob Brown
CPANID: BBB

On Thu, 30 May 2002, Douglas Webb wrote:

I've been working on a project for my company which requires a daemon
process to take in ~1 request/second from 1000 other processes on the
same machine, and send those 1000 requests/sec through some parsing and
then on to Oracle.

The architecture I decided on was to have a single select-based process
that would watch 1000 sockets for input, and to pass that input to 20 or
so sockets which are connected to forked-off subprocesses. The
subprocesses would handle the parsing and DBI stuff. This lets me spend
a lot more time dealing with Oracle than I otherwise could.

I looked in Network Programming with Perl, and figured that somebody
must have already put the examples in there onto CPAN. That's how I
found IO::Multiplex.

Overall, I'm happy with IO::Multiplex; the testing I've done so far
tells me that it'll be able to handle my needs reliably. However, I
discovered that it uses way more cpu time than it needs to, due to the
use of Tie::RefHash.

That module lets you use a reference as a hash key, but pretend that
it's still a reference. Normally, the reference is turned into a string,
and the string is used as the key.
When I ran my program through Devel::DProf, I found that a huge amount
of time was being spent in FETCH, NEXTKEY, EXISTS in the Tie::RefHash
module.

After searching through IO::Multiplex, I found that there were only a
couple of places where the hash keys are being used as references (to
filehandles.) The rest of the accesses just used them as keys, so the
Tie::RefHash was a waste. I added a new key to $self, _handles, to map
from the stringified filehandle reference to the real filehandle
reference, and I made a few other changes to use the new key only where
needed. This gave me about a 20x reduction in the cpu usage under the
same request/sec load.

Here's the diff for my changes; there aren't many of them.
I don't think there are any drawbacks to my approach.

Let me know if you have any questions.
Doug.


bash-2.03$ diff -C 1 Multiplex.orig.pm Multiplex.doug.pm
*** Multiplex.orig.pm   Mon Feb  4 13:14:03 2002
--- Multiplex.doug.pm   Thu May 30 11:04:02 2002
***************
*** 161,163 ****
      use IO::Multiplex;
-     use Tie::RefHash;

--- 161,162 ----
***************
*** 271,273 ****
  use Data::Dumper;
- use Tie::RefHash;
  use Carp qw(cluck);
--- 270,271 ----
***************
*** 301,304 ****
                         _fhs         => {},
                         _listen      => {}  } => $package;
-     tie %{$self->{_fhs}}, "Tie::RefHash";
      $self;
--- 299,302 ----
                         _fhs         => {},
+                        _handles     => {},
                         _listen      => {}  } => $package;
      $self;
***************
*** 353,354 ****
--- 351,353 ----
      $self->{_fhs}{$fh}{fileno} = fileno($fh);
+     $self->{_handles}{$fh} = $fh;
      tie *$fh, "MVModule::MVmux::Handle", $self, $fh;
***************
*** 374,375 ****
--- 373,375 ----
      delete $self->{_fhs}{$fh};
+     delete $self->{_handles}{$fh};
      untie *$fh;
***************
*** 518,520 ****

!     grep(!$self->{_fhs}{$_}{listen}, keys %{$self->{_fhs}});
  }
--- 518,520 ----

!     grep(!$self->{_fhs}{$_}{listen}, values %{$self->{_handles}});
  }
***************
*** 563,565 ****

!         foreach my $fh (keys %{$self->{_fhs}}) {
              # Avoid creating a permanent empty hash ref for "$fh"
--- 563,565 ----

!         foreach my $fh (values %{$self->{_handles}}) {
              # Avoid creating a permanent empty hash ref for "$fh"
***************
*** 797,798 ****
--- 797,799 ----
      delete $self->{_fhs}{$fh};
+     delete $self->{_handles}{$fh};
      untie *$fh;
***************
*** 800,802 ****
      $obj->mux_close($self, $fh) if $obj && $obj->can("mux_close");
-     delete $self->{_fhs}{$fh};
  }
--- 801,802 ----

Re: IO::Multiplex enhancement

Reply via email to