per-user usage metering

Ricardo Signes Wed, 08 Jun 2011 11:19:54 -0700

Hi, Postfix.  Long-time fan, first time poster.

I need to keep track of per-user use of our SASL-authenticated outbound relay,
and to reject mail from users who are exceeding their allowed usage.  The
records of their usage need to be accessible to me elsewhere over extended
durations, although their specific format isn't a huge concern.


There is an existing system in place for this, but it's got a serious race
condition in it, and I'm not 100% sure that my idea to deal with the problem is
a great one.

Right now, users authenticate with SASL, and that's fine.

The mail then goes through a unix socket policy service via
smtpd_sender_restrictions.  This looks up the account (based on the
sasl_username) and then checks their recent usage in a usage database.  If they
are over usage, it returns a 450.  If they are not over usage, it signals
success by prepending a header.  Mail with that header is routed to another
transport by header_checks.
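
For readers unfamiliar with policy delegation, here is a minimal sketch of the
exchange described above, not the code of the existing service.  Postfix sends
name=value lines ended by a blank line and expects an "action=..." reply.  The
header name X-Usage-Checked, the recent_usage dict, and the "at or over the
limit" threshold are assumptions made for illustration.

```python
def parse_policy_request(raw: str) -> dict:
    """Parse one policy request: name=value lines ended by a blank line."""
    attrs = {}
    for line in raw.splitlines():
        if not line:
            break  # a blank line terminates the request
        name, sep, value = line.partition("=")
        if sep:
            attrs[name] = value
    return attrs


def policy_reply(attrs: dict, recent_usage: dict, limit: int) -> str:
    """Reply 450 when the user is at or over the limit, else PREPEND a header."""
    user = attrs.get("sasl_username", "")
    if not user:
        # Not SASL-authenticated: leave the decision to other restrictions.
        return "action=DUNNO\n\n"
    if recent_usage.get(user, 0) >= limit:  # assumed threshold semantics
        return "action=450 4.7.1 Usage limit exceeded, try again later\n\n"
    # Hypothetical header name; header_checks would match it to pick the
    # content-checking transport.
    return "action=PREPEND X-Usage-Checked: yes\n\n"
```

Note that the reply depends only on what is already in the usage database at
the moment of the query, which is where the race described below comes from.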

This other transport is responsible for performing a content spam check.  If
the message is spam, it is sent to an uninteresting destination.  If it is not,
the message (size, recipients, spam-check score, etc.) is recorded in the usage
database and the message is re-injected to its final destination.

The race condition is simple:  the smtpd can accept a lot of mail before the
logging transport can write to the usage database, meaning users can bypass the
usage limits.

My first moronic attempt to fix this was to move some of the logging to the
policy service, and to communicate the record id via the added header to the
logging transport, so it could update the record with the spam check score.  I
had forgotten that the policy service was being queried once *per recipient*,
which had the obvious problem that each message was logged multiple times.  I
didn't want to try coordinating based on instance id (incrementing the
recipient count each time, etc.) -- and anyway, there is another problem:  the
mail might pass all the recipient restrictions and then fail during DATA.

My current thinking is this:

  1. a fast, idempotent policy service will check usage at rcpt time so that
     we can avoid accepting DATA if the user is over quota; it will signal
     acceptance with "OK"

  2. a policy check in smtpd_end_of_data_restrictions will log the recipient
     count, size, etc.; it will signal acceptance by PREPENDing the record
     identifier

  3. the logging transport will still exist, and will do the content checks
     and update the record with the spam score
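
The three steps above could be sketched roughly as follows.  This is a sketch
under stated assumptions, not a finished design: the table layout, the header
name X-Usage-Record, and counting usage as a plain sum of recipients (with no
time window) are all invented for illustration.  The protocol_state, instance,
recipient_count, and size attributes are the ones Postfix passes to policy
services.

```python
import sqlite3


def open_usage_db(path=":memory:"):
    """Open the usage database (hypothetical schema, one row per message)."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS usage ("
        " id INTEGER PRIMARY KEY, instance TEXT UNIQUE, username TEXT,"
        " recipients INTEGER, size INTEGER, spam_score REAL)"
    )
    return db


def handle(db, attrs, limit):
    """Return the policy action for one request, keyed on protocol_state."""
    user = attrs.get("sasl_username", "")
    state = attrs.get("protocol_state", "")
    if not user:
        return "DUNNO"
    if state == "RCPT":
        # Step 1: read-only, so repeating it per recipient changes nothing.
        (used,) = db.execute(
            "SELECT COALESCE(SUM(recipients), 0) FROM usage WHERE username = ?",
            (user,),
        ).fetchone()
        return "450 4.7.1 Usage limit exceeded" if used >= limit else "OK"
    if state == "END-OF-MESSAGE":
        # Step 2: runs once per message; the UNIQUE instance column makes a
        # repeated query for the same message a no-op.
        db.execute(
            "INSERT OR IGNORE INTO usage (instance, username, recipients, size)"
            " VALUES (?, ?, ?, ?)",
            (attrs["instance"], user,
             int(attrs.get("recipient_count", "0")),
             int(attrs.get("size", "0"))),
        )
        db.commit()
        (record_id,) = db.execute(
            "SELECT id FROM usage WHERE instance = ?", (attrs["instance"],)
        ).fetchone()
        return f"PREPEND X-Usage-Record: {record_id}"
    return "DUNNO"


def record_spam_score(db, record_id, score):
    """Step 3: the logging transport fills in the score after content checks."""
    db.execute("UPDATE usage SET spam_score = ? WHERE id = ?", (score, record_id))
    db.commit()
```

Because the usage row is written before smtpd finishes accepting the message,
the next RCPT check already sees it, which narrows the window left open by
logging only in the transport.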

I'm not sure how worried I should be about end_of_data logging messages that,
for some reason, never reach the logging transport.  If that is a real concern,
I may mark the records as "pending," with the logging transport marking them
"accepted," and another job purging pending records regularly.
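
A minimal sketch of that pending/accepted lifecycle, assuming a hypothetical
records table with a status column and a creation timestamp; the function
names and the max_age cutoff are illustrative only.

```python
import sqlite3


def open_db():
    """Create an in-memory records table (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE records ("
        " id INTEGER PRIMARY KEY, username TEXT,"
        " status TEXT NOT NULL DEFAULT 'pending',"
        " created REAL NOT NULL, spam_score REAL)"
    )
    return db


def log_pending(db, username, now):
    """Called at end of data: insert a pending record and return its id."""
    cur = db.execute(
        "INSERT INTO records (username, created) VALUES (?, ?)", (username, now)
    )
    db.commit()
    return cur.lastrowid


def mark_accepted(db, record_id, spam_score):
    """Called by the logging transport once the content check is done."""
    db.execute(
        "UPDATE records SET status = 'accepted', spam_score = ? WHERE id = ?",
        (spam_score, record_id),
    )
    db.commit()


def purge_stale_pending(db, now, max_age):
    """Periodic job: delete pending records older than max_age seconds."""
    cur = db.execute(
        "DELETE FROM records WHERE status = 'pending' AND created < ?",
        (now - max_age,),
    )
    db.commit()
    return cur.rowcount
```

One design point to weigh: pending records probably need to count toward usage
in the rcpt-time check, since ignoring them until the transport confirms them
would reopen the original race.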

Does this make sense?  Is it a terrible idea?  Is this all already covered by
some simple interface I have yet to discover?

-- 
rjbs
