I'm doing some data normalisation: data from a Web site is extracted with BeautifulSoup and cleaned up with a regex, then the current year (the tm_year attribute of a struct_time from the time module) is inserted, and the pieces are concatenated with string.join() and fed to time.strptime().
Here's some code:

    timeinput = re.split('[\s:-]', rawtime)
    print timeinput  #trace statement
    print year  #trace statement
    t = timeinput.insert(2, year)
    print t  #trace statement
    t1 = string.join(t, '')
    timeobject = time.strptime(t1, "%d %b %Y %H %M")

year is a Unicode string; so is the data in rawtime (BeautifulSoup gives you Unicode, dammit). And here's the output:

    [u'29', u'May', u'01', u'00']   (OK, so the regex is working)
    2008   (OK, so the year is a year)
    None   (...but what's this?)
    Traceback (most recent call last):
      File "bothv2.py", line 71, in <module>
        t1 = string.join(t, '')
      File "/usr/lib/python2.5/string.py", line 316, in join
        return sep.join(words)
    TypeError
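
The None in the trace is most likely because list.insert() changes the list in place and returns None, so t ends up as None and the join fails. A minimal sketch of one way around it follows. It assumes rawtime looks like u'29 May 01:00' (a guess reconstructed from the split output above, not the real scraped value), and it joins with a space instead of an empty string so that the result lines up with the spaces in the strptime format.

```python
import re
import time

# Assumed sample values, reconstructed from the trace output above.
rawtime = u'29 May 01:00'
year = u'2008'

# Split on whitespace, colons and hyphens, as in the original code.
timeinput = re.split(r'[\s:-]', rawtime)

# insert() mutates timeinput and returns None, so call it on its own
# line and keep using timeinput rather than assigning the result.
timeinput.insert(2, year)

# Join with a space so the string matches the "%d %b %Y %H %M" format.
t1 = u' '.join(timeinput)
timeobject = time.strptime(t1, "%d %b %Y %H %M")
```

The str.join() method is used here instead of string.join(); either should do, as long as the list itself (not the return value of insert()) is what gets joined. Note that %b matches month abbreviations for the current locale, so 'May' is assumed to be parsed under an English or C locale.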