Hi,
I've been able to reduce the problem to a minimal case (I think). It has
been quite arduous work... And I assure you that the machines that
were "hanging" the connections were doing VERY STRANGE THINGS :S
The client seemed to be cutting the connection just after the DATA
command (as the logs had revealed), without sending any message lines
(the "spooling to disk" message never appeared). Simply sending DATA and
cutting the connection did not reproduce the hung connection. After a
couple of hours trying to reproduce it, an idea struck me: maybe I should
not send the complete DATA. So here is the script to hang the connection:
#!/usr/bin/perl
use strict;
use warnings;
use Net::SMTP::TLS;

my $to = $ARGV[0] or die "Usage: $0 email";

my $mailer = Net::SMTP::TLS->new(
    'HOST_TO_HANG',
    Hello => 'localhost',
    NoTLS => 0,    # with NoTLS => 1 the hang does not occur
);
$mailer->mail('');
$mailer->to('[EMAIL PROTECTED]');
# Write a bare DATA with no terminator, then drop the connection.
$mailer->{sock}->write('DATA');
$mailer->{sock}->close();
print "do a top on your SMTP server\n";
Note that I write DATA without sending the terminator. (It's a shame
Test::SMTP does not support STARTTLS... I'm working on it...).
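
For comparison, here is a minimal sketch of what a complete DATA phase looks like on the wire (as I understand RFC 5321) next to what the script above actually writes. The header and body text are made-up placeholders, and the variable names are only illustrative.

```perl
use strict;
use warnings;

my $CRLF = "\r\n";

# What a well-behaved client sends after RCPT TO:
my @complete = (
    "DATA$CRLF",                  # command line; the server answers 354
    "Subject: test$CRLF$CRLF",    # headers, then an empty line (placeholder)
    "body$CRLF",                  # message body (placeholder)
    ".$CRLF",                     # end-of-data marker: a line with a lone dot
);

# What the test script writes before closing the socket:
my $truncated = 'DATA';           # no CRLF, no body, no end-of-data marker

my $complete_ok  = join('', @complete) =~ /\r\n\.\r\n\z/ ? 1 : 0;
my $truncated_ok = $truncated =~ /\r\n\z/ ? 1 : 0;

print "complete stream ends with CRLF.CRLF: $complete_ok\n";
print "truncated write ends with CRLF: $truncated_ok\n";
```

So from the server's point of view the command line itself is never finished before the socket goes away.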
After that I set out to find which plugin was doing the harm... If you
set NoTLS => 1 in the test script, the process does not hang. So
it seems to be tls... But is it only TLS? I deactivated all the other
plugins and ran the test: it doesn't hang. So the cause must be an
interaction between plugins...
Finally the set of plugins was reduced to tls + custom_plugin.
custom_plugin is a plugin that we use to do POP-before-SMTP
authentication. It connects to a MySQL db to see if the connecting IP is
in the relay list... I've shaved the plugin down to the bare minimum
needed to get the process to hang... It just does a DBI->connect.
The config used is:
tls /.../xxx.pem /.../xxx.pem /.../rootca.crt
dbi_connector
relay_all
rcpt_ok
relay_all just does:
sub hook_rcpt {
    return (OK);
}
rcpt_ok is the standard one.
dbi_connector is:
#!perl -w
sub hook_rcpt {
    require DBI or die "Can't load DBI";
    my ($self, $transaction, $recipient, %param) = @_;
    # Connect and never use the handle; this alone is enough to trigger the hang.
    my $dbh = DBI->connect(
        'DBI:mysql:database=xxx;host=localhost;port=3306',
        'xxx', 'xxx'
    ) or $self->log(LOGDEBUG, 'Could not connect ' . $DBI::errstr);
    return (DECLINED);
}
It doesn't matter whether the DBI connect succeeds or fails.
Just run the test script and... you get a qp child doing lots of writes
per second to a broken pipe...
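
To illustrate the symptom only: a minimal sketch of what repeated writes to a broken pipe look like from Perl. It is not taken from qpsmtpd. It assumes that SIGPIPE is ignored in the process and that the writer never checks the result, which is just a guess at how such a loop could keep spinning.

```perl
use strict;
use warnings;
use Errno qw(EPIPE);

# Assumption: SIGPIPE is ignored, so a failed write returns an error
# instead of killing the process.
local $SIG{PIPE} = 'IGNORE';

pipe(my $reader, my $writer) or die "pipe failed: $!";
close $reader;    # simulate the client going away

my $failures = 0;
for my $attempt (1 .. 5) {
    my $written = syswrite($writer, "250 ok\r\n");
    if (!defined $written && $! == EPIPE) {
        $failures++;    # a careful writer would stop retrying here
    }
}
print "writes that failed with EPIPE: $failures\n";
```

Every write fails the same way, so a loop that retries without looking at the error never makes progress and never exits.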
I've tried with sqlite3, and with not connecting to a db at all, and it
doesn't happen... so it looks like it could have to do with DBD::mysql and
how it cleans up?
We are using QP version 0.40 on Debian Etch. Perl modules are standard
Etch ones.
I've been trying to profile the code, to see the call path... but I'm
having problems... Any pointers on how to debug or to profile the code
on a spawned qp child?
Can somebody try to reproduce this? Are there any pointers on using DBI
in your plugins (maybe the first one is: DON'T :p)?
Isn't it strange that it's only reproducible with a non-terminated DATA?
Any thoughts? Any ideas?
Thanks in advance,
Jose Luis Martinez
CAPSiDE
[EMAIL PROTECTED]