Just came across this.

MarkMail is a free service for searching mailing list archives, with huge advantages over traditional search engines <http://markmail.org/docs/faq.xqy#markmailperks>. It is powered by MarkLogic Server: each email is stored internally as an XML document and accessed using XQuery. All searches, faceted navigation, analytic calculations, and HTML page renderings are performed by a small MarkLogic Server cluster running against millions of messages.

http://markmail.org/
Dorai

On Thu, Jul 17, 2008 at 8:02 AM, Jeff Rush <[EMAIL PROTECTED]> wrote:

> Anand Balachandran Pillai wrote:
>
>> Hi Pypers,
>>
>> Is there any open source tool for analyzing Mailman archives?
>> I want to analyze our Mailman archives and then find out the following
>> information.
>>
>> - Total number of messages
>> - Total number of threads (conversations)
>> - Total number of unique posters
>> - Maximum size of a thread
>> - Top 5 posters
>> - Top 5 threads (in terms of size)
>>
>> Are you aware of any tool (preferably Python) which does this? The
>> tool should be client-side, taking the URL to the Mailman archives
>> page as the only input.
>>
>> If there is nothing like this, perhaps I could think of writing one. It
>> would be useful I guess...
>>
>
> I'm not aware of any such tool but it would be quite useful. If you
> produce a library for obtaining the data, I would then hook it into
> rrdtool (round-robin database) and produce graphs of traffic on various
> mailing lists. This would help identify growth rates, when to split a list,
> dying lists, etc., which can help others to manage better. Having a "top 5
> posters" and "top 5 threads" would be useful on the front page of many
> usergroup websites to encourage others to join in.
>
> I would agree that it should be client-side since not all archive sites
> would update Mailman just to use it. It also should cache data and not
> re-fetch "finished" (i.e. prior months) list archives it has already
> analyzed. It should not, of course, keep a complete copy of the archive,
> just a summary, by interval of time like month. Keep the data in SQLite or
> shelve, to keep database needs lightweight for easier integration with
> anyone's choice of web engine.
>
> "Mailwatcher" is born?
>
> -Jeff
>
> _______________________________________________
> BangPypers mailing list
> BangPypers@python.org
> http://mail.python.org/mailman/listinfo/bangpypers
>

--
Dorai Thodla (http://www.thodla.com)
Thinking about Technology Innovation and Learning
My DailyLog (http://dorai.tumblr.com/) - Stuff worth remembering
US: 650-206-2688, India: 98408 89258
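
For anyone who wants to try the statistics Anand lists in the quoted message (messages, threads, unique posters, largest thread, top 5 posters, top 5 threads), here is a rough sketch in Python using only the standard library. It is not an existing tool. The names `thread_key` and `summarize` are made up for illustration, and it assumes a month of the archive has already been saved locally in mbox format; fetching from the archive URL and the per-month caching Jeff describes are left out. Threads are approximated from the References and In-Reply-To headers, which may not match how the archive itself groups messages.

```python
import mailbox
import re
from collections import Counter
from email.utils import parseaddr

MSG_ID = re.compile(r"<[^>]+>")


def thread_key(msg):
    """Guess which thread a message belongs to.

    Assumption: the first Message-ID in References (or, failing that,
    In-Reply-To) names the thread root. A message with neither header
    is treated as the start of its own thread.
    """
    for header in ("References", "In-Reply-To"):
        ids = MSG_ID.findall(str(msg.get(header) or ""))
        if ids:
            return ids[0]
    return str(msg.get("Message-ID") or "").strip()


def summarize(messages, top=5):
    """Count messages, threads and posters in an iterable of messages."""
    posters = Counter()
    threads = Counter()
    total = 0
    for msg in messages:
        total += 1
        # Keep only the address part of "Name <user@host>".
        _, addr = parseaddr(str(msg.get("From") or ""))
        posters[addr.lower()] += 1
        # Messages with no usable IDs each count as their own thread.
        key = thread_key(msg) or ("no-id", total)
        threads[key] += 1
    return {
        "messages": total,
        "threads": len(threads),
        "unique_posters": len(posters),
        "max_thread_size": max(threads.values(), default=0),
        "top_posters": posters.most_common(top),
        "top_threads": threads.most_common(top),
    }


if __name__ == "__main__":
    # "2008-July.mbox" is a hypothetical local file name.
    stats = summarize(mailbox.mbox("2008-July.mbox"))
    for name, value in stats.items():
        print(name, value)
```

Because the result is a small dictionary per month, it could be stored in SQLite or shelve as a summary, as suggested in the quoted message, without keeping a copy of the archive itself. Public archives sometimes obscure sender addresses, so the poster counts may need extra cleanup on real data.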