> Having a common API for "per-recipient" things have long been on the todo
> list.
>
> We've talked about it a couple times before; but the
> requirements/needs/wishes never sunk in deep enough in my head to do
> anything about it.  So: How do you use this?
>
> My thoughts all along have been to just have some way to make a "find user
> data" plugin that'll fetch a per-recipient blob of information that other
> plugins can, uh, plug into.
>
> The big stumbling block is what the common infrastructure for dealing with
> diverting needs of the recipients should be.  What would we need in terms
> of per-recipient header rewriting, post-DATA processing etc?

It is kind of a big scary can of worms, in large part because AFAICT a
number of shops, including my own, have grown their own solutions, often
leaving things just as forked as we are.  It seems that everybody's doing
it in different, incompatible ways which each shop is fully pleased with
and would find it difficult to move away from, and really each method
seems valid enough.

It may be workable to try to support multiple different treatments, or it
may be necessary to choose one and go with it.  The latter would mean that
some existing users might just remain forked, but if you made the right
choices then probably new users would never bother to move away.

There was a time when this was more at the forefront of my mind and I was
interested enough to 'put together a proposal'.  I sent some ideas to the
list at the time, but now I'm not so sure we even follow those ideas in our
own fork; we've moved beyond, and yet what we have now is still a work in
progress.  So to 'do it right the first time' some new thinking would have
to take place.

But FWIW, here's some about what we do now:

- Qpsmtpd::Address has a config() method which passes through to plugins
  that implement 'hook_user_config'.  Note that I used to call this
  'hook_rcpt_config', but really it's quite valid to call
  $txn->sender->config(...) when you've determined that this is an outbound
  message and you want to know something about the sender's preferences
  instead.  The way I implemented this is a bit of a mess.  At any rate I
  think I submitted a patch to do this sort of thing, but I might not have
  managed to stick with the review process long enough to answer everyone's
  concerns/questions.  Or maybe it was accepted!  I can't really remember ;)

- Qpsmtpd::DSN is heavily modified so that it can be used to store
  information in Qpsmtpd::Address (which itself has methods to manipulate
  such things) about what we have decided we want to do with each recipient
  and why we have chosen to do it; e.g. ( Rejected => 'Spam' ).

- Qpsmtpd::Transaction has been enhanced so that recipients() returns only
  the recipients that we have (thus far) decided we're going to be
  delivering to, and all_recipients() and rejected_recipients() have been
  added to give information about recipients that we have decided not to
  accept.  Keeping these lists around is useful for logging, but may also
  be necessary depending on how one decides to deal with the issue of
  "we've reached the end of DATA and want to reject this recipient and
  accept this one based on their different preferences, but we can only
  give one response".  Obviously we respond with '250' in this case; as for
  the 'rejected' recipients, some choose to drop them silently; some have a
  'quarantine' method in place and choose to quarantine for these
  recipients; some choose to bounce (this last method is the most reviled,
  but what can I say?  I don't want to do it, but my lead developer does.
  And the situation is pretty rare anyhow).

  The content of these methods in our forked code might shed a little
  light on how they work with the DSN data stored in each recipient:

    sub recipients {
        my $self = shift;
        @_ and $self->{_recipients} = [@_];
        return () unless $self->{_recipients};
        # no DSN yet, or a DSN that says we're keeping this recipient
        return grep {
            !$_->dsn
              or $_->dsn->action ~~ [qw( Accepted Delivered Queued Quarantined )]
        } @{ $self->{_recipients} };
    }

    sub rejected_recipients {
        my $self = shift;
        @_ and $self->{_rejected_recipients} = [@_];
        return () unless $self->{_recipients};
        # parenthesized because ! binds tighter than ~~
        # (note 'Accepted' is not in this list, unlike in recipients())
        return grep {
            $_->dsn
              and !( $_->dsn->action ~~ [qw( Delivered Queued Quarantined )] )
        } @{ $self->{_recipients} };
    }

    sub all_recipients {
        my $self = shift;
        return () unless $self->{_recipients};
        return @{ $self->{_recipients} };
    }

- All of our per-recip-pref-aware post-data scanning plugins loop through
  each to-be-accepted recipient and determine what we want to do for each
  recipient.  Then a single plugin afterward handles the results and
  responds to the client.
  So basically, an empty $txn->recipients() becomes a short-circuit for
  whatever post-data plugins are left in the mix.  We actually don't do
  this with the DSN objects; we use a separate 'class' note to denote that
  something is 'spam', 'clean', 'whitelisted', etc.  Then the last plugin
  sets the DSNs for each recipient.  Maybe we should have figured out a way
  to do it with the DSN objects directly, idunno.

- Each recipient object can optionally have its own $rcpt->notes('header')
  object (though this probably ought to be a header() method, really) and
  its own body object.  In our case, we actually parse *all* MIME data with
  MIME::Parser, so this body object is a $rcpt->notes('mime_body'), but
  really this should probably be something more generic that has the same
  accessors as transaction bodies; perhaps there should even be a
  Qpsmtpd::Body, and $txn->body_* should become $txn->body->* ?  Some were
  just talking previously about the possibility of having MIME parsing as
  an option, so it may actually be worthwhile to officially have an
  $rcpt->mime_body (or Qpsmtpd::Body::mime()?) as well, which can be used
  if MIME has been parsed.

- In postfix-queue (we haven't modified any of the other queue plugins, but
  if this was the method chosen all of them would have to be modified), we
  loop through every recipient and queue separately for *each one*.  This
  has become necessary for our own product, unfortunately, since we have
  unique headers for each recipient -- something that may very well wind up
  happening if you do things like adding a header with the SA score for
  each recipient, and every recipient has different whitelists and other
  preferences.  We used to have grouping though, and that would be easy
  enough to do -- queue separately for every recipient that has its own
  header or body object set, and then queue the rest that fell back to
  $txn->body and $txn->header all at once.

  One can imagine the increased overhead involved in queueing each
  recipient separately.
  The postfix queue is bigger, yes, but we have had other unexpected
  results, especially in instances where we are queueing to a remote
  postfix, even over the local network.  We have actually had to add a
  limit_recipients plugin that takes the size specified at MAIL FROM
  (lacking that, it assumes something like 5 or 10 MB to be safe) and makes
  sure that we defer all recipients after we have gotten to the point where
  we would be transmitting over 200MB to Postfix.  Before we did this, it
  was quite possible that we could sit around for so long queueing to
  postfix, before we knew that every recipient was queued, that the client
  gave up on us and then of course tried again later, and we wound up with
  duplicated messages.  Sheesh!  We haven't seen this since we tried to
  implement live delivery, which by the way we gave the heck up on long
  ago.

That's all I can think of right now regarding what we're doing ourselves.

Another big question is how to deal with people who don't really need
per-recip handling and all the trouble it comes with.  One thing I proposed
at one point was that $txn->recipients() could just return a single
meta-recipient, so that plugins could be written assuming per-recip was
enabled, and if it was turned off they would just run once through the
loop.  This does seem silly now though; what if we did

    $_->dsn( Rejected => "We don't like this recipient" )
        for grep { $_->address =~ /bob/ } $txn->recipients;

With a single meta-recipient there would be no real address for that grep
to match.

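
Going back to the limit_recipients arithmetic for a second, here is a
minimal standalone sketch of the idea.  This is not our actual plugin: the
names split_by_budget, DEFAULT_SIZE and MAX_TRANSFER are made up for
illustration, and a real plugin would make this decision one RCPT at a time
rather than over a whole list.  It only shows the budget check, assuming
each accepted recipient costs one full copy of the message.

```perl
use strict;
use warnings;

# Hypothetical constants, chosen to match the figures mentioned above.
use constant DEFAULT_SIZE => 10 * 1024 * 1024;     # assumed if MAIL FROM gave no SIZE
use constant MAX_TRANSFER => 200 * 1024 * 1024;    # most we want to send to Postfix

# Hypothetical helper: split recipients into those that fit in the byte
# budget and those that should be deferred, in the order they arrived.
sub split_by_budget {
    my ( $declared_size, @rcpts ) = @_;
    my $size  = $declared_size || DEFAULT_SIZE;
    my $total = 0;
    my ( @accept, @defer );
    for my $rcpt (@rcpts) {
        if ( $total + $size > MAX_TRANSFER ) {
            push @defer, $rcpt;    # would push us over the budget
        }
        else {
            $total += $size;       # one more full copy of the message
            push @accept, $rcpt;
        }
    }
    return ( \@accept, \@defer );
}

# Example: a 50MB message addressed to six recipients.
my ( $accept, $defer ) =
  split_by_budget( 50 * 1024 * 1024, map { "user$_\@example.com" } 1 .. 6 );
printf "accept %d, defer %d\n", scalar @$accept, scalar @$defer;
```

Deferred recipients would get a temporary failure, so a well-behaved client
retries them later in a separate, smaller transaction.
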
Maybe the thing to do is just go ahead and make the stock plugins aware of
the possibility of either setting and deal with it themselves; e.g.:

    unless ( $self->qp->config('per_recip_is_on') ) {
        my $uid = 0;    # global
        return DENY, "We hate you" if $self->is_spam( $txn, $uid );
        return DECLINED;
    }

    for my $rcpt ( $txn->recipients ) {
        # only scan things that previous plugins haven't figured out yet,
        # or that previous plugins want to quarantine
        # (we'll try to get a more definite answer for those)
        next if ( $rcpt->class // 'quarantine' ) ne 'quarantine';
        next unless $rcpt->config('enable_spam_scanning');
        $rcpt->class('spam') if $self->is_spam( $txn, $rcpt->notes('uid') );
    }

That $txn->recipients loop, by the way, is pretty much how we do things in
our forked plugins.  Then the very last post-data plugin does something
like this:

    for my $rcpt ( $txn->recipients ) {
        # exactly one disposition per recipient; clean is the fallback
        my $class = $rcpt->class // '';
        if    ( $class eq 'spam' )       { $rcpt->dsn( Rejected    => 'Spam' ) }
        elsif ( $class eq 'quarantine' ) { $rcpt->dsn( Quarantined => 'Spam' ) }
        else                             { $rcpt->dsn( Queued      => 'Clean' ) }
    }

    if ( $txn->recipients ) {
        # there are some recips left to queue or quarantine;
        # the response returned is something like (OK, '250 Queued!')
        return Qpsmtpd::DSN->new( Queued => 'Clean' )->_response;
    }
    else {
        # no recipients left to be accepted, let's reject,
        # showing off another Qpsmtpd::DSN method
        return DENY, ( $txn->rejected_recipients )[0]->dsn->smtp_text;
    }

What's troubling to me is that the existence of a per-recip and
non-per-recip mode with fundamental differences in how we handle things
seems so very much like the existing 'async daemon' and 'everything else'
targets.  Done wrong, we could end up with four total targets to write
plugins for, and more 'wait, does this work with per-recipient mode?  i
don't know i don't use that yet but i hear it's awesome' kinda like we have
with async, or worse, a plugins/per_recip folder.  But per-recip is not
_that_ huge a difference as async vs. blocking, so maybe it's much ado
about nothing.
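
For anyone who wants to poke at the "classify per recipient, then give one
response" flow outside of qpsmtpd, here is a rough standalone sketch.
MockRcpt and finalize() are invented for this example and are not part of
qpsmtpd or of our fork; the DSN is reduced to a plain [action, reason]
pair, and the response to a bare string.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a recipient object with class() and dsn()
# accessors; the real objects carry much more than this.
package MockRcpt;
sub new   { my ( $class, %args ) = @_; return bless {%args}, $class }
sub class { my $s = shift; $s->{class} = shift if @_; return $s->{class} }
sub dsn   { my $s = shift; $s->{dsn} = [@_] if @_; return $s->{dsn} }

package main;

# Give every recipient exactly one disposition, then choose the single
# response that the end of DATA allows us to send.
sub finalize {
    my @rcpts = @_;
    for my $rcpt (@rcpts) {
        my $class = $rcpt->class // 'clean';
        if    ( $class eq 'spam' )       { $rcpt->dsn( Rejected    => 'Spam' ) }
        elsif ( $class eq 'quarantine' ) { $rcpt->dsn( Quarantined => 'Spam' ) }
        else                             { $rcpt->dsn( Queued      => 'Clean' ) }
    }

    # Accept the message if at least one recipient is still being kept.
    return '250 Queued' if grep { $_->dsn->[0] ne 'Rejected' } @rcpts;

    # Otherwise reject, reusing the first recipient's reason.
    return '550 ' . ( @rcpts ? $rcpts[0]->dsn->[1] : 'No recipients' );
}

my @rcpts = (
    MockRcpt->new( address => 'alice@example.com', class => 'spam' ),
    MockRcpt->new( address => 'bob@example.com' ),
);
print finalize(@rcpts), "\n";
```

The point of the if/elsif/else chain is that a recipient marked 'spam'
keeps its Rejected disposition instead of having it overwritten by a later
assignment.
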
Perhaps it would even be good to use plugin inheritance, as (I just
discovered recently) many async plugins do, to avoid actually duplicating
that much code and allow the 'basic' plugins to be that much simpler...

Hope That Helps ;)

-Jared