On Mon, Feb 15, 2010 at 2:47 PM, BJ Swope <bigbluesw...@gmail.com> wrote:
> On Mon, Feb 15, 2010 at 2:31 PM, Stephen Hansen <apt.shan...@gmail.com> wrote:
>> On Mon, Feb 15, 2010 at 10:53 AM, BJ Swope <bigbluesw...@gmail.com> wrote:
>>>
>>>   File "/usr/lib/python2.5/email/_parseaddr.py", line 142, in mktime_tz
>>>     if data[9] is None:
>>> TypeError: 'NoneType' object is unsubscriptable
>>>
>>> I'm parsing a bunch of spam and using the date field from the spams
>>> for a date-time stamp.
>>>
>>> I've fixed the lib on my box to place the call inside a try/except
>>> clause to catch the exception now, but it seems the module has a bug
>>> in it.
>>
>> While there may or may not be a bug in the library, I don't think it's where
>> you're fixing. Just because an exception occurs in a function doesn't mean
>> that function is broken: it's documented as accepting a 10 item tuple, only.
>> Functions in the stdlib generally -should- throw exceptions on invalid
>> input.
>> Someone's passing None into it, which it's not allowed to do. So -that's-
>> where the bug probably is, I think. (Although it may not be the immediate
>> caller of mktime_tz; it could be happening higher up on the stack)
>> Advice: Always post complete tracebacks to c.p.l/python-list :)
>> --S
>
>
> From the module:
>
> def mktime_tz(data):
>     """Turn a 10-tuple as returned by parsedate_tz() into a UTC timestamp."""
>     if data[9] is None:
>         # No zone info, so localtime is better assumption than GMT
>         return time.mktime(data[:8] + (-1,))
>     else:
>         t = time.mktime(data[:8] + (0,))
>         return t - data[9] - time.timezone
>
>
> It appears that the module is trying to accommodate the potential
> missing TZ data because poorly written emails are missing the TZ data.
>
> I discarded all the crontab emails that had the full traceback in
> them.  I took out the try/except clause in the hopes that I'll get
> another exception soon.
>
> If I do I'll post the entire exception traceback.
>

Speak of the devil and demons appear...

/logs/python/imap_fetcher/spam_serv1.bigbluenetworks.com.py

Parsing of emails for spam at serv1.bigbluenetworks.com failed.

Traceback (most recent call last):
  File "/logs/python/imap_fetcher/spam_serv1.bigbluenetworks.com.py", line 81, in <module>
    clean_stale_mail()
  File "/logs/python/imap_fetcher/spam_serv1.bigbluenetworks.com.py", line 24, in clean_stale_mail
    utc_msg_date = email.utils.mktime_tz(msg_date2)
  File "/usr/lib/python2.5/email/_parseaddr.py", line 142, in mktime_tz
    if data[9] is None:
TypeError: 'NoneType' object is unsubscriptable


def clean_stale_mail():
    msg_date1= the_email.get('Date')
    msg_date2 = email.utils.parsedate_tz(msg_date1)
    try:
        utc_msg_date = email.utils.mktime_tz(msg_date2)
    except OverflowError:
        M.store(msg_id, '+FLAGS.SILENT', '\\Deleted')
        return
    utc_stale_date = time.time() - (86000*stale_days)
    if utc_msg_date <= utc_stale_date:
        M.store(msg_id, '+FLAGS.SILENT', '\\Deleted')


try:
    #M = imaplib.IMAP4(HOST)
    M = imaplib.IMAP4_SSL(HOST)
    M.login(USER, PASSWD)
    M.select(MAILBOX)

    response, msg_list = M.search(None, 'ALL')  # response is IMAP response, msg_list is list of Message IDs

    for msg_id in msg_list[0].split():  #msg_list[0] is a space separated string of message ids
        response, message = M.fetch(msg_id, '(RFC822)')  # response is the IMAP response, message is an RFC822 message

        msg = email.FeedParser.FeedParser( )
        msg.feed(message[0][1])
        the_email = msg.close()

        clean_stale_mail()

        if the_email.is_multipart():
            for part in the_email.walk():
                if part.get_content_type() == 'text/html':
                    decoded_part = part.get_payload(decode=1)
                    soup_parse(decoded_part)
        elif the_email.get_content_type() == 'text/html':
            decoded_part = the_email.get_payload(decode=1)
            soup_parse(decoded_part)
        elif the_email.get_content_type() == 'text/plain':
            msg_payload = the_email.get_payload()
            manual_parse(msg_payload)
        else:
            continue
            #print msg_id, "Didn't match any defined content types..."
            #print the_email

    M.expunge()
    M.close()
    M.logout()
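
For what it's worth, the traceback is consistent with parsedate_tz() returning None for a Date header it cannot parse, and that None then being handed straight to mktime_tz(). Below is a minimal sketch of a guard, assuming that is what is happening here. The helper name message_timestamp is made up for illustration, and it is written in Python 3 syntax rather than the 2.5 used above. It is one possible way to handle it, not a fix to the library.

```python
import email.utils


def message_timestamp(date_header):
    """Return a UTC timestamp for a Date header, or None if it is unusable.

    Hypothetical helper: the name and the "return None" convention are
    assumptions for this sketch, not part of the email package.
    """
    if not date_header or not date_header.strip():
        # Header is missing, empty, or whitespace only: nothing to parse.
        return None
    parsed = email.utils.parsedate_tz(date_header)
    if parsed is None:
        # parsedate_tz() returns None when it cannot parse the date string.
        return None
    try:
        return email.utils.mktime_tz(parsed)
    except (OverflowError, ValueError):
        # Parsed, but the fields are out of range for a timestamp.
        return None
```

With something like this, clean_stale_mail() could check for None and decide what to do with the message (delete it, skip it, log it) instead of dying on the TypeError.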