On Sun, Oct 27, 2013 at 03:35:40PM +1100, Chris Angelico wrote:
> On Sun, Oct 27, 2013 at 2:32 PM, Steven D'Aprano
> <steve+comp.lang.pyt...@pearwood.info> wrote:
> > If anyone wants to modify the script to determine the ratio of posters,
> > rather than posts, using GG, be my guest.
>
> And if anyone does, do please post the result on-list.

Taking a different tack, since I happen to have a complete[1] local archive
of python-list going back a few years ... here's a quick and dirty script to
count unique senders and Google Groups users for this year:

- - -
import os
from email.parser import HeaderParser

LIST = "python-list@python.org"
MAILDIR = "/path/to/mail/archive/cur"
YEAR = "2013"

parser = HeaderParser()
found = set()
gg_users = 0

for filename in os.listdir(MAILDIR):
    with open(os.path.join(MAILDIR, filename)) as message:
        headers = parser.parse(message)
    sender = headers.get("from", "")
    dest = headers.get("to", "")
    date = headers.get("date", "")
    if (LIST not in dest) or (YEAR not in date) or (sender in found):
        continue
    found.add(sender)
    if "groups-ab...@google.com" in headers.get("complaints-to", ""):
        gg_users += 1
        print("GG user:")
        print(sender)

print("Senders: %d" % len(found))
print("GG users: %d" % gg_users)
print("---")
- - -

It's obviously not very robust, but I reckon it's good enough to get an idea
what's going on. The results:

Senders: 1701
GG users: 879

... so just over 50%.

If anyone wants the complete output, just let me know and I'll email it
privately.

 -[]z.

[1] except for spam filtered out by Gmail.

-- 
Zero Piraeus: ad referendum
http://etiol.net/pubkey.asc
-- 
https://mail.python.org/mailman/listinfo/python-list