Hello,

I only got access to the VM with the big Debian mailing list archives yesterday, so this week I worked only with my small archives.

I spent the week measuring how fast my current implementation is, especially at generating threads/trees of emails. My conclusion is that it is not fast enough even for smaller archives (about 5000 emails): generating the tree from the SQLite database took about 3-5 seconds. The biggest problem is joining emails that have no In-Reply-To header to a thread with a similar or identical subject.

So I started writing new threading code (and changing the format of the database tables) to get fast access to all emails in a specific thread, at the cost of slower insertion of new emails. I'm also changing the algorithm that generates the tree of emails from the DAG stored in the database (built from the In-Reply-To and References headers). I decided to use a slightly modified topological sort, starting from the bottom of the DAG, to deal with missing emails in the threading view.
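To illustrate the idea of turning the reference DAG into a tree, here is a minimal sketch. It is not the actual implementation: the function name `thread_tree`, the `refs` input format, and the rule of attaching each email to its deepest surviving ancestor are all assumptions made for the example. It uses plain Kahn's algorithm started from the emails nobody replied to, and it simply drops references to emails that are missing from the archive.

```python
from collections import deque


def thread_tree(refs):
    """Pick one parent per email from a DAG of header references.

    refs maps a message id to the list of message ids named in its
    References / In-Reply-To headers (hypothetical input format).
    Returns a dict mapping every known message id to its parent id,
    or to None when the email becomes the root of a thread.
    """
    known = set(refs)
    # Drop references to emails missing from the archive, plus
    # duplicates and self-references.
    parents = {
        m: [p for p in dict.fromkeys(ps) if p in known and p != m]
        for m, ps in refs.items()
    }

    # Count, for every email, how many known emails refer to it.
    pending = {m: 0 for m in refs}
    for ps in parents.values():
        for p in ps:
            pending[p] += 1

    # Kahn's algorithm from the bottom: start with emails that
    # nobody refers to, so replies are emitted before their ancestors.
    queue = deque(m for m in refs if pending[m] == 0)
    order = []
    while queue:
        m = queue.popleft()
        order.append(m)
        for p in parents[m]:
            pending[p] -= 1
            if pending[p] == 0:
                queue.append(p)

    # Walk the order backwards (ancestors first) and attach each
    # email to the deepest ancestor that has already been placed.
    depth, tree = {}, {}
    for m in reversed(order):
        placed = [p for p in parents[m] if p in depth]
        tree[m] = max(placed, key=depth.get) if placed else None
        depth[m] = depth[tree[m]] + 1 if placed else 0

    # Emails caught in a reference cycle (broken headers) are never
    # emitted above; treat them as thread roots.
    for m in refs:
        tree.setdefault(m, None)
    return tree


example = {
    "a": [],
    "b": ["a"],
    "d": ["a", "b", "c"],  # "c" is missing from the archive
    "e": ["x"],            # "x" is missing from the archive
}
print(thread_tree(example))
```

In this example "d" still lands under "b" although its direct parent "c" is missing, because the References header usually repeats the earlier ancestors. Joining emails by subject is a separate step and is not covered by this sketch.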
-- Pali Rohár [email protected]
_______________________________________________ Soc-coordination mailing list [email protected] http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/soc-coordination
