--- Begin Message ---
Package: tracker.debian.org
Severity: wishlist
Hi,
Please find attached the 3 commit patches proposed to solve the
following Trello task:
-----
The current design involves forking a new process
(./manage.py tracker_dispatch) for each incoming email. This is
problematic on multiple levels:
- we ran out of memory on the test machine when 200 mails were delivered
  in the same second... (and exim can't be configured to rate-limit
  this)
- mails can get lost/bounced back if the process fails for some reason
So we want to change this so that mails are delivered to a local
Maildir and we have a daemon watching this directory (possibly with
inotify so that we have no delay) and processing mails with a
configurable number of workers.
-----
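To make the idea concrete, here is a minimal sketch of the queue-plus-workers
design described in the task, written against the Python 3 standard library
only. It is not the code from the patches below: the names handle_one and
drain are made up for illustration, the "processing" is a placeholder that
only reads the subject, and a thread pool stands in for the multiprocessing
pool and the inotify watcher that the patches actually use.

```python
import mailbox
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor
from email.message import Message


def handle_one(maildir_path, key):
    """Process one queued mail and delete it only once that succeeded."""
    box = mailbox.Maildir(maildir_path, factory=None)  # one handle per task
    message = box.get(key)
    if message is None:  # already processed or removed by someone else
        return None
    subject = message["Subject"]  # placeholder for the real processing
    box.discard(key)
    return subject


def drain(maildir_path, max_workers=2):
    """Process every queued mail using at most max_workers concurrent tasks."""
    keys = mailbox.Maildir(maildir_path, factory=None).keys()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(lambda key: handle_one(maildir_path, key), keys)
        return sorted(subject for subject in results if subject)


# Usage with a throwaway maildir holding five mails:
demo_path = os.path.join(tempfile.mkdtemp(), "Maildir")
demo_box = mailbox.Maildir(demo_path, factory=None)
for number in range(5):
    mail = Message()
    mail["Subject"] = "mail %d" % number
    mail.set_payload("body")
    demo_box.add(mail)
processed = drain(demo_path, max_workers=2)
```

Because a mail is discarded only after its processing returned, a failing
task leaves the mail in the Maildir instead of losing it, and the pool size
bounds how many mails are handled at once regardless of the delivery rate.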
These commits can also be found or tested on branch "maildir_daemon"
of the git.domainepublic.net/distro-tracker repo.
Should you have any remarks, please tell me.
Best,
Joseph
From eef1c8832ef9978bafc6d0de266ba65f61405283 Mon Sep 17 00:00:00 2001
From: Joseph Herlant <herla...@gmail.com>
Date: Mon, 14 Jul 2014 01:10:58 +0200
Subject: [PATCH 1/3] Class that provides base functions for using a Maildir
This class provides the functions needed to replace the tracker_control,
tracker_receive_news and tracker_dispatch piping system with a unified system
that uses a Maildir. It is more generic and can be used as a processing queue,
which avoids the out-of-memory issues observed in tests with the piping
mechanism.
Some new settings have been added for this purpose:
- DISTRO_TRACKER_MAILDIR_PATH, the path of the maildir where mails are
  received.
- DISTRO_TRACKER_NEWSMAIL_SUFFIXES, the list of suffixes that the local part
  of a news address can end with.
---
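For reviewers, a minimal sketch of the recipient lookup order documented in
get_recipients (envelope headers first, then To, Cc and Bcc). This is an
alternative formulation under my own assumptions, not the code in the diff:
it relies on email.utils.getaddresses, which also copes with display names
that contain a comma, whereas splitting the header value on ',' would not.
The function name addresses_from and the example.org addresses are made up
for illustration.

```python
from email.message import Message
from email.utils import getaddresses

ENVELOPE_HEADERS = ("Envelope-to", "X-Envelope-To", "X-Original-To")
FALLBACK_HEADERS = ("To", "Cc", "Bcc")


def addresses_from(message, header_names):
    """Return the bare addresses found in the given headers, in order."""
    values = []
    for name in header_names:
        # Header lookup on email.message.Message is case-insensitive.
        values.extend(message.get_all(name, []))
    return [address for _name, address in getaddresses(values) if address]


def get_recipients(message):
    """Prefer envelope headers; fall back to To, Cc and Bcc otherwise."""
    return (addresses_from(message, ENVELOPE_HEADERS)
            or addresses_from(message, FALLBACK_HEADERS))


# Usage: a display name containing a comma is parsed as one address.
demo = Message()
demo["To"] = '"Doe, John" <john@example.org>, pkg@example.org'
demo["Cc"] = "cc@example.org"
without_envelope = get_recipients(demo)

demo["Envelope-to"] = "control@example.org"
with_envelope = get_recipients(demo)
```
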
data/.gitignore | 1 +
distro_tracker/mail/maildir_manager.py | 205 ++++++++++
distro_tracker/mail/tests/tests_maildir_manager.py | 451 +++++++++++++++++++++
distro_tracker/project/settings/defaults.py | 9 +
4 files changed, 666 insertions(+)
create mode 100644 distro_tracker/mail/maildir_manager.py
create mode 100644 distro_tracker/mail/tests/tests_maildir_manager.py
diff --git a/data/.gitignore b/data/.gitignore
index 1cc33c7..36e933c 100644
--- a/data/.gitignore
+++ b/data/.gitignore
@@ -1 +1,2 @@
distro-tracker.sqlite
+Maildir
diff --git a/distro_tracker/mail/maildir_manager.py b/distro_tracker/mail/maildir_manager.py
new file mode 100644
index 0000000..a3fec28
--- /dev/null
+++ b/distro_tracker/mail/maildir_manager.py
@@ -0,0 +1,205 @@
+# Copyright 2014 The Distro Tracker Developers
+# See the COPYRIGHT file at the top-level directory of this distribution and
+# at http://deb.li/DTAuthors
+#
+# This file is part of Distro Tracker. It is subject to the license terms
+# in the LICENSE file found in the top-level directory of this
+# distribution and at http://deb.li/DTLicense. No part of Distro Tracker,
+# including this file, may be copied, modified, propagated, or distributed
+# except according to the terms contained in the LICENSE file.
+"""
+Retrieves mails from the maildir and hands them over to the matching mail
+processing function.
+"""
+from __future__ import unicode_literals
+from django.conf import settings
+
+from distro_tracker.mail.control import process as control_process
+from distro_tracker.mail.dispatch import process as dispatch_process
+from distro_tracker.mail.mail_news import process as news_process
+
+import logging
+from mailbox import Maildir
+from email.message import Message
+from email.utils import parseaddr
+import os
+from rfc822 import Message as rfc822_Message
+
+DISTRO_TRACKER_CONTROL_EMAIL = settings.DISTRO_TRACKER_CONTROL_EMAIL
+DISTRO_TRACKER_MAILDIR_PATH = settings.DISTRO_TRACKER_MAILDIR_PATH
+DISTRO_TRACKER_NEWSMAIL_SUFFIXES = settings.DISTRO_TRACKER_NEWSMAIL_SUFFIXES
+logger = logging.getLogger(__name__)
+
+class MaildirManager(object):
+
+ def __init__(self):
+ """
+ Setting the maildir object first.
+ """
+ self.maildir = Maildir(DISTRO_TRACKER_MAILDIR_PATH, factory=None)
+ self.reset_message()
+ self.msg_filename = None
+
+ def get_emails_from_header(self, header):
+ """
+ This helper returns the list of recipients from a given header.
+ The header matching will be case insensitive.
+ """
+ recipients = []
+ # We lower the keys of the headers to ensure the case insensitivity
+ lower_headers = {key.lower():key for key in self.message.keys()}
+ if header.lower() in lower_headers.keys():
+ # Here we get the real header label back
+ header_label = lower_headers[header.lower()]
+ recipients = [parseaddr(item)[1]
+ for item in self.message[header_label].split(',')]
+ return recipients
+
+ def get_recipients(self):
+ """
+ This method gets the recipients of the current message. It first looks
+ at the 'Envelope-to', 'X-Envelope-To' and 'X-Original-To' headers (in
+ this order).
+ If one of those is defined, it returns its content as a list.
+ If none of those are defined, it returns the list of mail addresses from
+ the To, Cc and Bcc headers.
+ """
+ recipients = self.get_emails_from_header('Envelope-to')
+ recipients.extend(self.get_emails_from_header('X-Envelope-To'))
+ recipients.extend(self.get_emails_from_header('X-Original-To'))
+
+ # Getting 'To', 'Cc', and 'Bcc' only if the previous fields are empty
+ if recipients == []:
+ recipients = self.get_emails_from_header('To')
+ recipients.extend(self.get_emails_from_header('Cc'))
+ recipients.extend(self.get_emails_from_header('Bcc'))
+
+ return recipients
+
+ def is_news_recipient(self, recipient):
+ """
+ This method tests if a recipient has the suffix of a news email
+ address using DISTRO_TRACKER_NEWSMAIL_SUFFIXES.
+ Returns True if so, False if not.
+ """
+ local_part = recipient.rsplit('@', 1)[0]
+ for suffix in DISTRO_TRACKER_NEWSMAIL_SUFFIXES:
+ if local_part.endswith(suffix.lower()):
+ return True
+ return False
+
+ def get_mail_key_from_filename(self, filename, folder='new'):
+ """
+ Returns the key of the mail matching the given file name, if it exists.
+ """
+ mail_key = None
+ self.maildir._refresh()
+ for k in self.maildir._toc:
+ if self.maildir._toc[k] == "{0}/{1}".format(folder,filename):
+ mail_key = k
+ break
+ return mail_key
+
+
+ def retrieve_mail_content(self, mail_file_name=None, mail_key=None):
+ """
+ Returns the message corresponding to the given id from the maildir
+ object or None if no message exists with this id.
+
+ It ensures that if a message is retrieved, it will be as an
+ ``email.message.Message`` class instance.
+
+ Returns `None` if the file name is not found.
+ """
+ # First we need to look up the mail key if it was not provided
+ if mail_key is None and mail_file_name is not None:
+ mail_key = self.get_mail_key_from_filename(mail_file_name)
+
+ self.message = self.maildir.get(mail_key, default=None)
+
+ if self.message is None:
+ logger.debug("Unable to find mail file {0}".format(mail_file_name))
+ self.message = None
+ else:
+ self.msg_filename = mail_file_name
+
+ def reset_message(self):
+ """
+ Resets the self.message object.
+ """
+ self.message = Message()
+ self.msg_filename = None
+
+ def delete_mail(self, filename=None, mail_key=None):
+ """
+ Deletes a mail, if it exists, identified by its file name or by its
+ maildir key (as returned by ``Maildir.add``).
+ Resets the current self.message if it is the mail being deleted.
+ If neither is provided, the file name of the current self.message is used.
+
+ This method supports `self.message` being either an
+ `email.message.Message` or an `rfc822.Message` class instance.
+ """
+ if mail_key is None:
+ if isinstance(self.message, rfc822_Message):
+ self.msg_filename = os.path.basename(self.message.fp._file.name)
+ else:
+ if self.message.get_filename() is not None:
+ self.msg_filename = self.message.get_filename()
+
+ # If no parameter is set, trying to set it.
+ if filename is None:
+ filename = self.msg_filename
+
+ # We need to work with the mail key
+ mail_key = self.get_mail_key_from_filename(filename)
+
+ if mail_key is None:
+ logger.debug("Unable to find mail file {0} for deletion".format(filename))
+ else:
+ self.maildir.discard(mail_key)
+
+ # If the deleted file is the one of the current self.message,
+ # reset the self.message
+ if self.msg_filename is not None and self.msg_filename == filename:
+ self.reset_message()
+
+ def process_mail_error(self, exception, mail_file_name=None, mail_key=None):
+ logger.error(
+ "Exception occurred while trying to process {0} ({1}): {2}".format(
+ mail_file_name, mail_key, exception)
+ )
+
+ def process_mail(self, mail_file_name=None, mail_key=None):
+ """
+ First gets the mail in the Maildir from the given filename,
+ then sends the mail to the corresponding process method.
+ """
+ if mail_key is None:
+ self.retrieve_mail_content(mail_file_name=mail_file_name)
+ else:
+ self.retrieve_mail_content(mail_key=mail_key)
+ if self.message is None:
+ return
+
+ recipients = self.get_recipients()
+ flat_mail = self.message.__str__()
+ try:
+ for recipient in recipients:
+ recipient = recipient.lower()
+ if recipient == DISTRO_TRACKER_CONTROL_EMAIL.lower():
+ # Processes the given mail through the control process
+ control_process(bytes(flat_mail))
+ elif self.is_news_recipient(recipient):
+ # Processes the given mail through the mail_news process
+ news_process(bytes(flat_mail))
+ else:
+ # Processes the given mail through the dispatch process
+ dispatch_process(bytes(flat_mail), recipient)
+ # Deleting message when processed correctly
+ self.delete_mail(filename=mail_file_name, mail_key=mail_key)
+ except Exception as ex:
+ # Whenever an exception occurs in the mail processing, log it and keep
+ # the mail in the maildir.
+ self.process_mail_error(ex, mail_file_name, mail_key)
+
diff --git a/distro_tracker/mail/tests/tests_maildir_manager.py b/distro_tracker/mail/tests/tests_maildir_manager.py
new file mode 100644
index 0000000..e3bddad
--- /dev/null
+++ b/distro_tracker/mail/tests/tests_maildir_manager.py
@@ -0,0 +1,451 @@
+# -*- coding: utf-8 -*-
+
+# Copyright 2014 The Distro Tracker Developers
+# See the COPYRIGHT file at the top-level directory of this distribution and
+# at http://deb.li/DTAuthors
+#
+# This file is part of Distro Tracker. It is subject to the license terms
+# in the LICENSE file found in the top-level directory of this
+# distribution and at http://deb.li/DTLicense. No part of Distro Tracker,
+# including this file, may be copied, modified, propagated, or distributed
+# except according to the terms contained in the LICENSE file.
+"""
+Tests for :mod:`distro_tracker.mail.maildir_manager`.
+"""
+
+from __future__ import unicode_literals
+from django.conf import settings
+from django.core import mail
+from django.test import TestCase
+from django.utils import timezone
+from django.utils.six.moves import mock
+
+from distro_tracker.mail.maildir_manager import MaildirManager
+
+from distro_tracker.core.models import SourcePackageName
+from distro_tracker.core.models import SourcePackage
+from distro_tracker.core.models import News
+from distro_tracker.core.models import Subscription
+from distro_tracker.core.utils import verp
+from distro_tracker.mail.models import UserEmailBounceStats
+
+import mailbox
+from email.message import Message
+from email.utils import make_msgid
+
+DISTRO_TRACKER_CONTROL_EMAIL = settings.DISTRO_TRACKER_CONTROL_EMAIL
+DISTRO_TRACKER_CONTACT_EMAIL = settings.DISTRO_TRACKER_CONTACT_EMAIL
+DISTRO_TRACKER_FQDN = settings.DISTRO_TRACKER_FQDN
+DISTRO_TRACKER_MAILDIR_PATH = settings.DISTRO_TRACKER_MAILDIR_PATH
+
+class MaildirManagerTest(TestCase):
+
+ def setUp(self):
+ """
+ In the setup we set some default values.
+ """
+ self.maildir = mailbox.Maildir(DISTRO_TRACKER_MAILDIR_PATH)
+ self.original_mail_count = self.maildir.__len__()
+ # Initializing a new dummy package
+ self.package_name = SourcePackageName.objects.create(
+ name='dummy-package')
+ self.package = SourcePackage.objects.create(
+ source_package_name=self.package_name,
+ version='1.0.0')
+ # Setting message default header
+ self.message = Message()
+ self.set_default_headers()
+ # Initializing an instance of the MaildirManager
+ self.manager = MaildirManager()
+ # This array stores the id of the messages created during the tests
+ self.generated_mail_ids = []
+
+ def tearDown(self):
+ """
+ Discarding the messages created during the tests.
+ """
+ for msgid in self.generated_mail_ids:
+ self.manager.delete_mail(mail_key=msgid)
+
+ def set_default_headers(self):
+ """
+ Helper method which adds the default headers for each test message.
+ """
+ self.set_header('From', 'John Doe <john....@unknown.com>')
+ self.set_header('To',
+ '{package}@{distro_tracker_fqdn}'.format(
+ package=self.package_name,
+ distro_tracker_fqdn=DISTRO_TRACKER_FQDN
+ )
+ )
+ self.set_header('Subject', 'Commands')
+ self.set_header('Message-ID', make_msgid())
+
+ def set_header(self, header_name, header_value):
+ """
+ Sets a header of the test message to the given value.
+ If the header previously existed in the message, it is overwritten.
+
+ :param header_name: The name of the header to be set
+ :param header_value: The new value of the header to be set.
+ """
+ if header_name in self.message:
+ del self.message[header_name]
+ self.message.add_header(header_name, header_value)
+
+ def add_email_to_maildir(self, body, headers={}, encoding='ASCII'):
+ """
+ This helper adds the given mail message to the configured maildir
+ without using SMTP or sending any real mail.
+ """
+ self.message.multipart = False
+ for header_name in headers.keys():
+ self.set_header(header_name, headers[header_name])
+ self.message.set_payload(body)
+ self.message.set_charset(encoding)
+ msgid = self.maildir.add(self.message)
+ self.generated_mail_ids.append(msgid)
+ return msgid
+
+ def assert_header_equal(self, header_name, header_value,
+ response_number=-1):
+ """
+ Helper method which asserts that a particular response's
+ header value is equal to an expected value.
+
+ :param header_name: The name of the header to be tested
+ :param header_value: The expected value of the header
+ :param response_number: The index number of the response message.
+ Standard Python indexing applies, which means that -1 means the
+ last sent message.
+ """
+ out_mail = mail.outbox[response_number].message()
+ self.assertEqual(out_mail[header_name], header_value)
+
+ def test_adding_email_to_maildir(self):
+ """
+ Testing the add_email_to_maildir helper method.
+ """
+ msgid = self.add_email_to_maildir(
+ body='We do not care about the body content',
+ headers={
+ 'Subject':'Mail from test_adding_email_to_maildir method',
+ },
+ )
+ final_mail_count = self.maildir.__len__()
+
+ self.assertEqual(final_mail_count, self.original_mail_count + 1)
+
+ def test_delete_mail_noreset_message(self):
+ """
+ Testing to delete a mail without setting the manager message object.
+ Just passing the message id to the method.
+ """
+ msgid = self.add_email_to_maildir('Some mail content')
+ final_mail_count = self.maildir.__len__()
+ self.assertEqual(final_mail_count, self.original_mail_count + 1)
+
+ self.manager.delete_mail(mail_key=msgid)
+ final_mail_count = self.maildir.__len__()
+ self.assertEqual(final_mail_count, self.original_mail_count)
+
+ def test_delete_mail_reset_message(self):
+ """
+ Testing to delete a mail after setting the message object.
+ """
+ msgid = self.add_email_to_maildir('Some mail content')
+ self.manager.message = self.maildir.get(msgid)
+ intermediate_mail_count = self.maildir.__len__()
+ self.assertEqual(intermediate_mail_count, self.original_mail_count + 1)
+
+ self.manager.delete_mail()
+ # Checking the message has been discarded
+ final_mail_count = self.maildir.__len__()
+ self.assertEqual(final_mail_count, self.original_mail_count)
+ # Checking that self.message has been reset.
+ self.assertIsInstance(self.manager.message, Message)
+ self.assertIsNone(self.manager.message.get_filename())
+
+ def test_reset_message(self):
+ """
+ Tests that `reset_message()` resets the message object to a blank
+ new one.
+ """
+ self.manager.message.add_header('Subject', 'Commands')
+ self.assertEqual(self.manager.message.get('Subject'), 'Commands')
+
+ self.manager.reset_message()
+ self.assertIsInstance(self.manager.message, Message)
+ self.assertIsNone(self.manager.message.get('Subject'))
+
+ def test_retrieve_mail_content(self):
+ """
+ Tests that the retrieve_mail_content method correctly sets the
+ message object if a proper file is given.
+ """
+ msgid = self.add_email_to_maildir('Some mail content')
+ self.manager.retrieve_mail_content(mail_key=msgid)
+
+ self.assertIsInstance(self.manager.message, Message)
+ self.assertEqual(self.manager.message.get_payload(), 'Some mail content')
+
+ def test_retrieve_mail_content_not_exists(self):
+ """
+ Tests that the retrieve_mail_content method sets the message object
+ to `None` if an incorrect file name is given.
+ """
+ msgid = make_msgid()
+ self.manager.retrieve_mail_content(mail_key=msgid)
+
+ self.assertIsNone(self.manager.message)
+
+ @mock.patch('distro_tracker.mail.maildir_manager.logger.error')
+ def test_process_mail_error(self, mocked_method):
+ """
+ Checks that the `process_mail_error` method calls the logger.error.
+ """
+ self.manager.process_mail_error(Exception('My exception'), 'my_file')
+ self.assertTrue(mocked_method.called)
+
+ def test_get_emails_from_header(self):
+ """
+ This method tests that the get_emails_from_header method returns
+ the correct array.
+ """
+ recipients = ['Pkg 1 <packag...@domain.com>', 'packag...@domain.com']
+ self.set_header('DummyHEADER', ', '.join(recipients))
+ self.manager.message = self.message
+ ret = self.manager.get_emails_from_header('dummyHeAder')
+ self.assertEqual(ret, ['packag...@domain.com', 'packag...@domain.com'])
+
+ def _test_get_recipients_generic(self, header):
+ """
+ This method will test that adding a given header will return the
+ correct list of recipients using the get_recipients method.
+ """
+ test_mail = 'test_{0}@unknown.com'.format(header)
+ self.set_header(header, test_mail)
+ self.manager.message = self.message
+ returned_recipients = self.manager.get_recipients()
+ self.assertEqual(returned_recipients, [test_mail])
+
+ def test_get_recipients_envelope_to(self):
+ """
+ This method tests that the get_recipients method returns the
+ content of the Envelope-to field if present.
+ """
+ self._test_get_recipients_generic('Envelope-to')
+
+ def test_get_recipients_x_envelope_to(self):
+ """
+ This method tests that the get_recipients method returns the
+ content of the X-Envelope-To field if present.
+ """
+ self._test_get_recipients_generic('X-Envelope-to')
+
+ def test_get_recipients_x_original_to(self):
+ """
+ This method tests that the get_recipients method returns the
+ content of the X-Original-To field if present.
+ """
+ self._test_get_recipients_generic('X-Original-To')
+
+ def test_get_recipients_to_cc_bcc(self):
+ """
+ This method tests that the get_recipients method returns the
+ correct array if the 'Envelope-to', 'X-Envelope-To' and 'X-Original-To'
+ headers are not defined.
+ """
+ to_header = 't...@unknown.com'
+ cc_header = 'c...@unknown.com'
+ bcc_header = 'b...@unknown.com'
+ self.set_header('To', to_header)
+ self.set_header('Cc', cc_header)
+ self.set_header('Bcc', bcc_header)
+ self.manager.message = self.message
+ returned_recipients = self.manager.get_recipients()
+
+ self.assertEqual(returned_recipients,
+ [to_header, cc_header, bcc_header])
+
+ def test_is_news_recipient(self):
+ is_true = self.manager.is_news_recipient('package_n...@unknown.com')
+ is_false = self.manager.is_news_recipient('package_old...@unknown.com')
+
+ self.assertTrue(is_true)
+ self.assertFalse(is_false)
+
+ @mock.patch('distro_tracker.mail.maildir_manager.MaildirManager.process_mail_error')
+ def test_process_mail_simple_control_command(self, mocked_method):
+ """
+ This method tests that a simple mail coming in the control maildir
+ is processed through the control engine.
+ It checks that the mail has not been processed through the
+ process_mail_error method and has been deleted after process.
+ """
+ msg_id = self.add_email_to_maildir(
+ body='#command\n thanks',
+ headers={'To':DISTRO_TRACKER_CONTROL_EMAIL,},
+ )
+
+ self.assertEqual(self.maildir.__len__(), self.original_mail_count + 1)
+ self.manager.process_mail(mail_key=msg_id)
+
+ self.assertFalse(mocked_method.called)
+ self.assertEqual(self.maildir.__len__(), self.original_mail_count)
+ self.assertEqual(len(mail.outbox), 1)
+ self.assert_header_equal('Subject', 'Re: Commands')
+ self.assert_header_equal('X-Loop', DISTRO_TRACKER_CONTROL_EMAIL)
+ self.assert_header_equal('To', 'John Doe <john....@unknown.com>')
+ self.assert_header_equal('From', DISTRO_TRACKER_CONTACT_EMAIL)
+
+
+ def test_process_mail_simple_news_command(self):
+ """
+ This method tests that a simple mail coming in the contact maildir
+ is processed through the news mail engine.
+ """
+ subject = 'Some message'
+ content = 'Some message content'
+ msg_id = self.add_email_to_maildir(
+ body=content,
+ headers={
+ 'Subject':subject,
+ 'To':'dummy-package_news@' + DISTRO_TRACKER_FQDN,
+ 'X-Distro-Tracker-Package':self.package.name,
+ }
+ )
+
+
+ self.manager.process_mail(mail_key=msg_id)
+
+ # A news item is created
+ self.assertEqual(1, News.objects.count())
+ news = News.objects.all()[0]
+ # The title of the news is set correctly.
+ self.assertEqual(subject, news.title)
+ self.assertIn(content, news.content)
+ # The content type is set to render email messages
+ self.assertEqual(news.content_type, 'message/rfc822')
+
+ def test_process_mail_simple_package_command(self):
+ """
+ This method tests that a simple mail coming in the package maildir
+ is processed through the dispatch engine.
+ It also checks that processing utf-8 content is supported.
+ """
+ user_email = 'John Doe <john....@unknown.com>'
+ # Subscribing user to package
+ Subscription.objects.create_for(
+ package_name=self.package.name,
+ email=user_email,
+ active=True)
+ # Sending news mail
+ msg_id = self.add_email_to_maildir(
+ body='üößšđžčć한글',
+ headers={
+ 'Subject':'Some subject',
+ 'From':user_email,
+ 'X-Distro-Tracker-Approved':'1',
+ },
+ encoding='utf-8')
+
+ # Processing mail
+ self.manager.process_mail(mail_key=msg_id)
+
+ msg = mail.outbox[0]
+ # No exception thrown trying to get the entire message's content as bytes
+ content = msg.message().as_string()
+ # The content is actually bytes
+ self.assertTrue(isinstance(content, bytes))
+ # Checks that the message is correctly forwarded to subscriber
+ self.assertIn(user_email, (message.to[0] for message in mail.outbox))
+
+ def test_process_mail_bounce_recorded(self):
+ """
+ Tests that a received bounce is recorded.
+ """
+ bounce_address = 'bounces+{date}@{distro_tracker_fqdn}'.format(
+ date=timezone.now().date().strftime('%Y%m%d'),
+ distro_tracker_fqdn=DISTRO_TRACKER_FQDN)
+ dest='u...@domain.com'
+
+ Subscription.objects.create_for(
+ package_name='dummy-package',
+ email=dest,
+ active=True)
+ # self.user = EmailUserBounceStats.objects.get(user_email__email=dest)
+ self.user = UserEmailBounceStats.objects.get(email=dest)
+ msg_id = self.add_email_to_maildir(
+ body="Don't care",
+ headers={
+ 'Subject':'bounce',
+ 'Envelope-to':verp.encode(bounce_address, self.user.email)
+ },
+ )
+
+ # Make sure the user has no prior bounce stats
+ self.assertEqual(self.user.bouncestats_set.count(), 0)
+ self.manager.process_mail(mail_key=msg_id)
+
+ bounce_stats = self.user.bouncestats_set.all()
+ self.assertEqual(bounce_stats.count(), 1)
+ self.assertEqual(bounce_stats[0].date, timezone.now().date())
+ self.assertEqual(bounce_stats[0].mails_bounced, 1)
+ self.assertEqual(self.user.emailsettings.subscription_set.count(), 1)
+
+
+ @mock.patch('distro_tracker.mail.maildir_manager.control_process')
+ def _test_process_mail_control_for_case(self, test_email, mocked_method):
+ """
+ Tests that the process_mail method calls the control process whatever
+ the case of the recipient address is.
+ """
+ msg_id = self.add_email_to_maildir(
+ body='#command\n thanks',
+ headers={'To':test_email,},
+ )
+
+ self.manager.process_mail(mail_key=msg_id)
+
+ self.assertTrue(mocked_method.called)
+
+ def test_process_control_mail_lowercase(self):
+ """
+ Tests that the process_mail method calls the control process when
+ control email is given in lower case.
+ """
+ self._test_process_mail_control_for_case(
+ DISTRO_TRACKER_CONTROL_EMAIL.lower())
+
+ def test_process_control_mail_uppercase(self):
+ """
+ Tests that the process_mail method calls the control process when
+ control email is given in upper case.
+ """
+ self._test_process_mail_control_for_case(
+ DISTRO_TRACKER_CONTROL_EMAIL.upper())
+
+ @mock.patch('distro_tracker.mail.maildir_manager.news_process')
+ def test_case_for_process_mail_news(self, mocked_method):
+ """
+ Tests that a mail from a non standard case for news is still processed
+ through the news mail engine.
+ """
+ subject = 'Some message'
+ content = 'Some message content'
+ suffix = settings.DISTRO_TRACKER_NEWSMAIL_SUFFIXES[0]
+ msg_id = self.add_email_to_maildir(
+ body=content,
+ headers={
+ 'Subject':subject,
+ 'To':'dummy-package{suffix}@{fqdn}'.format(suffix=suffix.title(),
+ fqdn=DISTRO_TRACKER_FQDN),
+ 'X-Distro-Tracker-Package':self.package.name,
+ }
+ )
+
+
+ self.manager.process_mail(mail_key=msg_id)
+
+ self.assertTrue(mocked_method.called)
diff --git a/distro_tracker/project/settings/defaults.py b/distro_tracker/project/settings/defaults.py
index 1136414..d056055 100644
--- a/distro_tracker/project/settings/defaults.py
+++ b/distro_tracker/project/settings/defaults.py
@@ -366,6 +366,15 @@ DISTRO_TRACKER_MAX_ALLOWED_ERRORS_CONTROL_COMMANDS = 5
#: The number of days a command confirmation key should be valid.
DISTRO_TRACKER_CONFIRMATION_EXPIRATION_DAYS = 3
+#: The maildir where all the mails are received (control server mails,
+#: package news mails, bounces mails, and other package-related mails)
+DISTRO_TRACKER_MAILDIR_PATH = os.path.join(DISTRO_TRACKER_BASE_PATH, 'data', 'Maildir')
+#: Possible email address suffixes that cause a mail to be processed as a
+#: news mail when using the tracker_maildir_watcher management command
+DISTRO_TRACKER_NEWSMAIL_SUFFIXES = ('_news',)
+#: The maximum number of processes to spawn for fetching mails from the maildir
+DISTRO_TRACKER_MAILDIR_MAX_WORKERS = 10
+
#: The maximum number of news to include in the news panel of a package page
DISTRO_TRACKER_NEWS_PANEL_LIMIT = 30
--
2.0.0
From 77765dea0fe08521c4073254493c5b431fbb7cf4 Mon Sep 17 00:00:00 2001
From: Joseph Herlant <herla...@gmail.com>
Date: Mon, 14 Jul 2014 01:12:35 +0200
Subject: [PATCH 2/3] Adding a daemon as management command to watch a Maildir
This new management command watches the given Maildir for new mails and
processes each mail asynchronously through a multiprocessing pool.
The size of this pool is capped by a setting to avoid the out-of-memory
issues seen when receiving a lot of mails with the tracker_dispatch,
tracker_control and tracker_receive_news management commands.
---
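A note on file names versus keys, since the watcher hands file names to the
manager: in a Maildir, the key of a mail is its file name up to the optional
":2,<flags>" info part. The sketch below is only an illustration using the
public mailbox API of the Python 3 standard library (the patch itself goes
through the private _refresh/_toc members); key_from_path is a made-up name.
It accepts either a bare file name or a full path, such as the one an
inotify event reports.

```python
import mailbox
import os
import tempfile
from email.message import Message


def key_from_path(box, path):
    """Map a file path or bare file name from new/ or cur/ to its key."""
    file_name = os.path.basename(path)
    key = file_name.split(box.colon)[0]  # drop the ":2,<flags>" part, if any
    return key if key in box else None


# Usage with a throwaway maildir holding a single mail:
demo_root = os.path.join(tempfile.mkdtemp(), "Maildir")
demo_box = mailbox.Maildir(demo_root, factory=None)
demo_mail = Message()
demo_mail["Subject"] = "hello"
demo_key = demo_box.add(demo_mail)

full_path = os.path.join(demo_root, "new", demo_key)
from_full_path = key_from_path(demo_box, full_path)
from_file_name = key_from_path(demo_box, demo_key)
unknown = key_from_path(demo_box, "no-such-file")
```
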
.../management/commands/tracker_maildir_watcher.py | 110 +++++++++++++++++++++
.../mail/tests/tests_maildir_management_command.py | 66 +++++++++++++
2 files changed, 176 insertions(+)
create mode 100644 distro_tracker/mail/management/commands/tracker_maildir_watcher.py
create mode 100644 distro_tracker/mail/tests/tests_maildir_management_command.py
diff --git a/distro_tracker/mail/management/commands/tracker_maildir_watcher.py b/distro_tracker/mail/management/commands/tracker_maildir_watcher.py
new file mode 100644
index 0000000..723d39c
--- /dev/null
+++ b/distro_tracker/mail/management/commands/tracker_maildir_watcher.py
@@ -0,0 +1,110 @@
+# Copyright 2014 The Distro Tracker Developers
+# See the COPYRIGHT file at the top-level directory of this distribution and
+# at http://deb.li/DTAuthors
+#
+# This file is part of Distro Tracker. It is subject to the license terms
+# in the LICENSE file found in the top-level directory of this
+# distribution and at http://deb.li/DTLicense. No part of Distro Tracker,
+# including this file, may be copied, modified, propagated, or distributed
+# except according to the terms contained in the LICENSE file.
+"""
+Implements the management command which will watch the given maildir for
+incoming mails and spawn processes that will handle the new mails.
+This command will be used as a daemon which will manage the mails received in
+the DISTRO_TRACKER_MAILDIR_PATH destination.
+
+This process makes sure that the number of workers handling the maildir
+does not exceed the configured value.
+"""
+from django.conf import settings
+from django.core.management.base import BaseCommand
+
+from distro_tracker.mail.maildir_manager import MaildirManager
+
+import logging
+from mailbox import Maildir
+from multiprocessing import Pool
+import os
+import pyinotify
+import signal
+import sys
+
+logger = logging.getLogger(__name__)
+
+DISTRO_TRACKER_BASE_PATH = settings.DISTRO_TRACKER_BASE_PATH
+DISTRO_TRACKER_MAILDIR_PATH = settings.DISTRO_TRACKER_MAILDIR_PATH
+MAILDIR_MAX_WORKERS = settings.DISTRO_TRACKER_MAILDIR_MAX_WORKERS
+
+class Command(BaseCommand):
+ """
+ A Django management command used to invoke the maildir manager daemon.
+ """
+
+ def handle(self, *args, **kwargs):
+ logger.info('Starting maildir watcher daemon')
+ # Initializing an instance of the class that will handle the
+ # management of the workers
+ handler = WorkersManager()
+ for sig in [signal.SIGTERM, signal.SIGINT, signal.SIGQUIT]:
+ signal.signal(sig, handler.handle_sigterm)
+
+ logger.info('Processing existing mails')
+ mdir_path_new = os.path.join(DISTRO_TRACKER_MAILDIR_PATH,'new')
+ mdir = Maildir(DISTRO_TRACKER_MAILDIR_PATH)
+ for f in os.listdir(mdir_path_new):
+ # Don't manage subdirectories
+ if os.path.isfile(os.path.join(mdir_path_new,f)):
+ handler.feed_worker(f)
+
+ logger.info('Launching Maildir watcher')
+ wm = pyinotify.WatchManager()
+ notifier = pyinotify.Notifier(wm, default_proc_fun=handler)
+ mask = pyinotify.IN_MOVED_TO | pyinotify.IN_CREATE
+ wm.add_watch(mdir_path_new, mask)
+ notifier.loop()
+
+
+class WorkersManager(pyinotify.ProcessEvent):
+ """
+ This class will manage the behavior of the daemon when detecting a new mail
+ arrived in the watched Maildir.
+ """
+
+ def my_init(self):
+ """
+ Standard constructor addon for pyinotify ProcessEvent class.
+ """
+ # Initializing a multiprocessing Pool of workers
+ self.workers = Pool(processes=MAILDIR_MAX_WORKERS)
+
+ def handle_sigterm(self, signum = None, frame = None):
+ """
+ Method that manages the closing of the queues while catching SIG*.
+ """
+ logger.info("Closing process pool due to {0} signal.".format(signum))
+ self.workers.close()
+ self.workers.join()
+ sys.exit(0)
+
+ def process_default(self, event):
+ """
+ Trigger function for the default events in pyinotify. Used here to
+ handle both IN_MOVED_TO and IN_CREATE events.
+ """
+ logger.debug("Notification for {0} for {1}".format(
+ event.maskname, event.name))
+ self.feed_worker(event.name)
+
+ def feed_worker(self, file_name):
+ """
+ Asks the process pool to process the given file asynchronously.
+ """
+ self.workers.apply_async(worker_main, [file_name])
+
+def worker_main(file_name):
+ """
+ Function used by the process of the pool to launch the processing of mails.
+ """
+ mgr = MaildirManager()
+ mgr.process_mail(mail_file_name=file_name)
+
diff --git a/distro_tracker/mail/tests/tests_maildir_management_command.py b/distro_tracker/mail/tests/tests_maildir_management_command.py
new file mode 100644
index 0000000..b32f887
--- /dev/null
+++ b/distro_tracker/mail/tests/tests_maildir_management_command.py
@@ -0,0 +1,66 @@
+# Copyright 2014 The Distro Tracker Developers
+# See the COPYRIGHT file at the top-level directory of this distribution and
+# at http://deb.li/DTAuthors
+#
+# This file is part of Distro Tracker. It is subject to the license terms
+# in the LICENSE file found in the top-level directory of this
+# distribution and at http://deb.li/DTLicense. No part of Distro Tracker,
+# including this file, may be copied, modified, propagated, or distributed
+# except according to the terms contained in the LICENSE file.
+"""
+Tests for :mod:`distro_tracker.mail.management.commands.tracker_maildir_watcher`
+"""
+
+from django.conf import settings
+from django.utils.six.moves import mock
+from django.test import SimpleTestCase
+
+from distro_tracker.mail.management.commands.tracker_maildir_watcher import (
+ Command as MaildirWatcherCommand)
+from distro_tracker.mail.management.commands.tracker_maildir_watcher import WorkersManager
+
+from mailbox import Maildir
+from email.message import Message
+from email.utils import make_msgid
+
+class MaildirWatcherCommandTest(SimpleTestCase):
+ """
+ Tests for the tracker_maildir_watcher management command:
+ :mod:`distro_tracker.mail.management.commands.tracker_maildir_watcher`
+ """
+ def test_handle_sets_notifier(self):
+ """
+ This function tests that the handle function sets a pyinotify trigger
+ on call and that the notifier will launch the watch process.
+ """
+ with mock.patch('pyinotify.WatchManager.add_watch') as mock_watcher:
+ with mock.patch('pyinotify.Notifier.loop') as mock_notifier:
+ cmd = MaildirWatcherCommand()
+ cmd.handle()
+
+ self.assertTrue(mock_watcher.called)
+ self.assertTrue(mock_notifier.called)
+
+ def test_handle_calls_feed_notifier_for_new_mails(self):
+ """
+ This function tests that when the daemon is started with new mails in
+ the maildir, the feed_worker method is called.
+ """
+ mdir = Maildir(settings.DISTRO_TRACKER_MAILDIR_PATH)
+ msg = Message()
+ msg.add_header('From', 'John Doe <john....@unknown.com>')
+ msg.add_header('To', 'dont.c...@unknown.com>')
+ msgid = make_msgid()
+ msg.add_header('Message-ID', msgid)
+ # Maildir.add() returns the key needed to remove the message afterwards
+ key = mdir.add(msg)
+
+ with mock.patch('distro_tracker.mail.management.commands.tracker_maildir_watcher.WorkersManager.feed_worker') as mock_worker:
+ # This function needs to be mocked to avoid starting the daemon
+ with mock.patch('pyinotify.Notifier.loop') as mock_notifier:
+ cmd = MaildirWatcherCommand()
+ cmd.handle()
+
+ self.assertTrue(mock_worker.called)
+
+ mdir.discard(key)
+
--
2.0.0
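
Note on the second test of patch 2/3: it relies on mails that are already
sitting in the Maildir's new/ directory being handed to feed_worker when the
daemon starts, since inotify only reports files that arrive after the watch
is set. A minimal sketch of such a startup scan, using only the standard
library, could look like the following. The names pending_mail_paths,
feed_pending and demo are hypothetical and not taken from the patch; the
patch's own handle() may do this differently.

```python
import mailbox
import os
import shutil
import tempfile
from email.message import Message


def pending_mail_paths(maildir_path):
    """Return the full paths of the files waiting in <maildir_path>/new."""
    new_dir = os.path.join(maildir_path, 'new')
    return sorted(
        os.path.join(new_dir, name)
        for name in os.listdir(new_dir)
        if os.path.isfile(os.path.join(new_dir, name))
    )


def feed_pending(maildir_path, feed):
    """Call feed(path) for each pending mail and return how many were fed.

    `feed` stands in for something like WorkersManager.feed_worker.
    """
    paths = pending_mail_paths(maildir_path)
    for path in paths:
        feed(path)
    return len(paths)


def demo():
    """Create a throwaway Maildir, add one mail and feed it to a list."""
    root = tempfile.mkdtemp()
    try:
        path = os.path.join(root, 'Maildir')
        mdir = mailbox.Maildir(path, create=True)
        msg = Message()
        msg['Subject'] = 'pending mail'
        mdir.add(msg)  # a plain Message is delivered to the new/ subdirectory
        fed = []
        count = feed_pending(path, fed.append)
        # Report the count and the subdirectory each fed file came from
        return count, [os.path.basename(os.path.dirname(p)) for p in fed]
    finally:
        shutil.rmtree(root)
```

Passing the callback as an argument keeps the scan independent of pyinotify
and of the process pool, so it can be exercised without starting the daemon.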
From 0ee30f0246f0f4988209a4b85b91b954091f6ff7 Mon Sep 17 00:00:00 2001
From: Joseph Herlant <herla...@gmail.com>
Date: Mon, 14 Jul 2014 01:13:27 +0200
Subject: [PATCH 3/3] Documentation about the new Maildir feature of the
mailbot
The exim4 configuration has been done and tested, but the postfix configuration
still needs to be tested and completed.
---
docs/setup/mailbot.rst | 122 +++++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 119 insertions(+), 3 deletions(-)
diff --git a/docs/setup/mailbot.rst b/docs/setup/mailbot.rst
index 44bcbc6..b1c0e45 100644
--- a/docs/setup/mailbot.rst
+++ b/docs/setup/mailbot.rst
@@ -27,9 +27,125 @@ choosing. You should modify the following values in
.. note::
These emails are allowed to be on different domains.
+
+The next step, if you are using the mail management via generic Maildir, is
+to modify the following values in ``distro_tracker/project/settings/local.py``:
+
+* DISTRO_TRACKER_MAILDIR_PATH
+
+ The Maildir where the control, package news and other package related mails
+ are received.
+
+
+Mail management via a Maildir (recommended)
+-------------------------------------------
+
+In order to have the mails from the Maildir properly processed, the
+``tracker_maildir_watcher`` daemon should be started using the following
+management command:
+:mod:`tracker_maildir_watcher <distro_tracker.mail.management.commands.tracker_maildir_watcher>`
+
+This command handles the mails arriving in the ``new`` subdirectory of the
+Maildir directory previously defined. If this directory does not exist, the
+Maildir will not be created and the daemon will not work properly.
+
+.. note::
+
+ If you go ahead with this mail management method, you should ensure that
+ the daemon has read and write access to the configured Maildir.
+
+Prerequisites
+^^^^^^^^^^^^^
+
+This module requires the ``pyinotify`` and ``multiprocessing`` modules to run.
+
+Exim4 configuration
+^^^^^^^^^^^^^^^^^^^
+
+Mails received at the ``DISTRO_TRACKER_CONTROL_EMAIL`` address, as well as
+bounces, news and other package related mails, should be redirected to the
+``DISTRO_TRACKER_MAILDIR_PATH`` Maildir. To configure this, you can use this
+router and transport as a simple example::
+
+ distro_tracker_router:
+ debug_print = "R: Distro Tracker catchall router for $local_part@$domain"
+ driver = accept
+ transport = distro_tracker_transport
+
+ distro_tracker_transport:
+ debug_print = "T: Distro Tracker transport for the catchall Maildir"
+ driver = appendfile
+ directory = /home/dtracker/distro-tracker/data/Maildir
+ user = dtracker
+ group = mail
+ create_directory
+ envelope_to_add
+ maildir_format
+ directory_mode = 0700
+ mode_fail_narrower = false
+
+.. note::
+
+ The router should be placed last in the routers section of the exim
+ configuration file.
+
+ It is advisable to use the envelope_to_add option to ensure that the real
+ recipient of the mail (even if it's cc or bcc) is correctly identified
+ by the daemon. The fields 'Envelope-to', 'X-Envelope-To' and 'X-Original-To'
+ will be the first to be checked by the daemon when looking for the
+ recipient's email.
+
+
+Postfix configuration
+^^^^^^^^^^^^^^^^^^^^^
+
+.. note::
+
+ This configuration is to be defined. It would be advisable to have the
+ 'X-Original-To' in the final headers list (should be added automatically by
+ postfix, but it's still to be verified). The following configuration is a
+ non-tested draft that needs to be completed to include the redirection of the
+ catch-all to a maildir.
+
+Assuming the following configuration::
+
+ DISTRO_TRACKER_CONTACT_EMAIL = owner@distro_tracker.debian.net
+ DISTRO_TRACKER_CONTROL_EMAIL = control@distro_tracker.debian.net
+ DISTRO_TRACKER_FQDN = distro_tracker.debian.net
+
+The file ``/etc/postfix/virtual`` would be::
+
+ distro_tracker.debian.net not-important-ignored
+ postmaster@distro_tracker.debian.net postmaster@localhost
+ owner@distro_tracker.debian.net dtracker-owner@localhost
+ # Catchall for package emails
+ @distro_tracker.debian.net dtracker-dispatch@localhost
+
+The ``/etc/aliases`` file should then include the following lines::
+ dtracker-owner: some-admin-user
+
+Then, the ``main.cf`` file should be edited to include::
+
+ virtual_alias_maps = hash:/etc/postfix/virtual
+
+.. note::
+
+ Be sure to run ``newaliases`` and ``postmap`` after editing ``/etc/aliases``
+ and ``/etc/postfix/virtual``.
+
+This way, all messages sent to the owner are delivered to the local user
+``some-admin-user``, while messages sent to the control address, messages
+that should be turned into news items and messages sent to any other address
+on the given domain should be redirected to a single Maildir handled by
+the daemon.
+
+
+Mail management through pipes (deprecated)
+------------------------------------------
+
Management commands
--------------------
+^^^^^^^^^^^^^^^^^^^
In order to have the received email messages properly processed they need to
be passed to the following management commands:
@@ -44,7 +160,7 @@ means that the system's MTA needs to be setup to forward appropriate mails to
the appropriate command.
Exim4
------
+^^^^^
Mails received to ``DISTRO_TRACKER_CONTROL_EMAIL`` address should be piped to the
``control_process`` command. A way to set this up in Exim would be to create a
@@ -96,7 +212,7 @@ are not recognized. Such router and transport could be::
This router should be placed last in the exim configuration file.
Postfix
--------
+^^^^^^^
To configure Postfix to forward email messages to appropriate commands you need
to first create a file with virtual aliases for the relevant email addresses.
--
2.0.0
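
Note on the recipient lookup described in patch 3/3: the documentation says
that 'Envelope-to', 'X-Envelope-To' and 'X-Original-To' are the first fields
checked when looking for the recipient's email. A minimal sketch of that
lookup order, using only the standard library, might look like this. The
function name find_recipient and the fallback to the 'To' header are
assumptions made for illustration; they are not taken from the patches.

```python
from email import message_from_string
from email.utils import parseaddr

# Order taken from the note in docs/setup/mailbot.rst
ENVELOPE_HEADERS = ('Envelope-to', 'X-Envelope-To', 'X-Original-To')


def find_recipient(msg, fallback_headers=('To',)):
    """Return the first usable recipient address, or None if there is none.

    The envelope headers are tried first; `fallback_headers` is a
    hypothetical extension for mails that carry none of them.
    """
    for header in ENVELOPE_HEADERS + tuple(fallback_headers):
        value = msg.get(header)  # header lookup is case-insensitive
        if value:
            address = parseaddr(value)[1]
            if address:
                return address
    return None


def demo():
    """Show that an envelope header wins over the visible To header."""
    raw = (
        "From: sender@example.org\n"
        "To: Some List <list@example.org>\n"
        "Envelope-to: package@example.org\n"
        "\n"
        "body\n"
    )
    return find_recipient(message_from_string(raw))
```

With envelope_to_add set in the exim transport, a mail that reached the
tracker through cc or bcc still carries the address it was actually
delivered to, which is why the envelope headers are consulted before 'To'.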
--- End Message ---