Responses are below.. I'm doing some testing with this new version and will let you know how it goes.. Things are a bit crazy over here so it may take a few days to get you something useful back.
I was excited to see someone playing with my plugin so I worked some tonight on integrating the changes in the modified plugin that Ed posted. I've attached the very latest iteration with *some* of the changes, and a diff showing which I've applied. I have some questions on the remaining differences, if anyone (especially Ed) has the time and inclination to discuss then great, otherwise I'll probably just use my best judgement and post another iteration with all the changes integrated that make sense to me. Quoting the remaining diff of what I have *not* yet integrated: -require 'Qpsmtpd/tld_lists.pl'; +# use Data::Dumper; +require '/home/smtpd/qpsmtpd/plugins/lib/tld_lists.pl'; I assume this change is specific to your local installation I was using Data::Dumper to do some testing with how the $zones was being defined. We have a subscription with SURBL and was getting some really weird results when using our defined hostname for queries. +$Data::Dumper::Terse = 1; # don't output names where feasible +$Data::Dumper::Indent = 0; our ( %firstlevel_tlds, %uribl_secondlevel_tlds, %uribl_thirdlevel_tlds, %surbl_secondlevel_tlds, %surbl_thirdlevel_tlds ); @@ -234,6 +238,8 @@ for ( [ 2 => 'sc' ], [ 4 => 'ws' ], [ $z->{label} = "$z->{label} ($descr)"; delete $z->{masks}; } +#print "Dump #2"; +#print Dumper($zones); I assume this Data::Dumper stuff is leftover debug cruft - push @lists, { label => "$list->{label} ($extra)", + push @lists, { label => "$list->{label}", action => $z->{masks}->{$_}->{action} || $z->{action} }; This code is drastically changed now, I'm pretty sure in the latest iteration this change isn't needed any longer Testing this now and will let you know what I find out. sub scan_mime_body { my ( $self, $part ) = @_; + $self->log(LOGDEBUG, "uribl: Scanning Mime Body"); scan_mime_body() is now a recursive sub (I think it was a bug that it wasn't before...), so this logging could get out of hand. How about: @@ -770,8 +770,13 @@ $self->{check_headers} = 'all' ? $txn->header->tags : split /,/, $self->{check_headers} if $self->{check_headers}; - $txn->notes('mime_body') ? $self->scan_mime_body( $txn->notes('mime_body') ) - : $self->scan_body( $txn ); + if ( $txn->notes('mime_body') ) { + $self->log(LOGDEBUG, "uribl: Scanning MIME Body"); + $self->scan_mime_body( $txn->notes('mime_body') ) + } else { + $self->log(LOGDEBUG, "uribl: Scanning Plain Text Body"); + $self->scan_body( $txn ); + } if ( ! $self->{started_queries} and ! $self->{shortened} ) { $self->log(LOGINFO, "uribl: No URIs found in mail"); return 0; - return unless $part->effective_type ~~ [qw( text/plain text/html )]; + return unless $part->effective_type =~ [qw( text/plain text/html )]; + $self->log(LOGDEBUG, "uribl: Scanning Mime Body - text-html"); I don't think you really wanted '=~ [...]'; '~~ [...]' is a perl 5.10 smart match, I'm not really sure *what* '=~ [...]' would come to. Perhaps for the sake of < 5.10, it should lose the smart match and become something like: my $type = $part->effective_type; return if $type ne 'text/plain' and $type ne 'text/html'; $self->log(LOGDEBUG, "uribl: Scanning MIME $type part"); Hrmm.. I don't remember changing this bit, or it may have been a typo. @@ -489,7 +497,27 @@ sub scan_body { $self->log(LOGDEBUG, "uribl: Scanning Plain Text Body"); $txn->body_resetpos; my $line; - $self->find_uris($line) while ( $line = $txn->body_getline ); + my @qp_continuations; + + while ( $line = $txn->body_getline ) { + chomp $line; + if ( $line =~ /(.*)=$/ ) { + push @qp_continuations, $1; + next; + } elsif ( @qp_continuations ) { + $line = join('', @qp_continuations, $1); + @qp_continuations = (); + } + + $self->find_uris($line); + } + + if ( @qp_continuations ) { + $self->log(LOGINFO, "uribl: WARNING: scan_body exiting with line continuations left. Bad Email?"); + $line = join('', @qp_continuations, $line); + @qp_continuations = (); + $self->find_uris($line); + } } Just to clarify -- is this needed for scan_body() to even work? Is it tested? (we have hardly tested scan_body() since we basically always have a mime_body) This allows for quoted-printable body scanning where continuing lines end with an = sign. This is tested on our side and does work. For us this is used way more often than mim-boby. @@ -576,20 +604,19 @@ sub find_uris { # used by async plugin ) //xo ) { # parameterized redirect - my $addr = $1; + $addr = $1; next if $addr =~ /@/; - my $rev; I understand what you're doing here, trying to re-use the existing $addr and $rev rather than using new localized variables... I'd suggest keeping the localized variables though, so that if anything is ever added after this for loop, $addr and $rev aren't tainted. Perhaps the localized vars should be renamed to, say, $redir_addr and $redir_rev for the sake of clarity? - my $extra_host; if ( my $host = hostname( $addr ) ) { $addr = $host; # dash and underscore are allowed anywhere in subdomains, but the dash seems to # also be commonly used as a delimiter in parameterized redirects - $extra_host = hostname( $host ) if $host =~ s/^-.+-//; + my $extra_host = hostname( $host ) if $host =~ s/^-.+-//; Notice that 'my $extra_host' here is scoped within the 'if' block, and will immediately go out of scope... } else { ( $addr, $rev ) = ip( $addr ); } next unless $addr; my ( $path ) = $p =~ /^\/(.+)([hH][tT][tT][pP])?/; + my $extra_host; I think you probably want to leave the $extra_host declaration where it was, even if you get rid of the $addr and $rev localization. As it stands, $extra_host will never be anything but undef for all intents and purposes. I think this was your code.. Don't remember changing it. + # Check if we should run or not + my $options = $txn->notes('plug_options'); + return DECLINED if $self->qp->connection->notes('whitelisthost'); + return DECLINED if $txn->notes('whitelistsender'); Is this to do with your local installation, or is this standard QP stuff that I've lost touch with? This is part of our local implementation / changes that give us per-domain settings + per domain + per user whitelisting. - $self->log(LOGINFO, sprintf( "lookups: %2d zone, %2d A, %2d NS, %2d shortener" - . " matches: %2d time: %.6f", - ( map { $self->{queries_completed}->{$_} } - qw( uri a ns shortener ) ), - scalar @{ $self->{matches} || [] }, - time() - $self->{start_time} )); +# $self->log(LOGINFO, sprintf( "lookups: %2d zone, %2d A, %2d NS, %2d shortener" +# . " matches: %2d time: %.6f", +# ( map { $self->{queries_completed}->{$_} } +# qw( uri a ns shortener ) ), +# scalar @{ $self->{matches} || [] }, +# time() - $self->{start_time} )); Perhaps this should be LOGDEBUG? Umm.. I think you had it as LOGINFO and I just didn't change it. We run LOGDEBUG 24x7 so doesn't matter to me. -Jared