Based on the forwarded message below, I think you need to inform list subscribers that the list is to be publicly archived at a place where their addresses will be in full view of crawlers, in particular of email address collectors that sell the addresses to spammers and whatnot.
In contrast, mail-archive.com does everything it can to 1) hide email addresses and 2) disallow crawlers.

In general, the decent thing to do is to discuss this with existing subscribers first, before you channel the lyx lists to newsgroups and mail archives. In particular, let them know the risks as far as spamming and viruses are concerned. I do not see any messages on the devel or users' list about this archiving at marc.theaimsgroup.com. (My grep for theaims turned up nothing in the list archives.)

Also, the lyx webpage may want to discuss the risks of subscribing to the lyx lists.

Let me know the result of the discussion, and I will act accordingly.

Mate

----- Forwarded message from Hank Leininger <[EMAIL PROTECTED]> -----

Date: Tue, 18 Jun 2002 02:06:01 -0400 (EDT)
From: Hank Leininger <[EMAIL PROTECTED]>
Subject: Re: [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED]
To: <[EMAIL PROTECTED]>
cc: John Levon <[EMAIL PROTECTED]>, <[EMAIL PROTECTED]>, <[EMAIL PROTECTED]>, <[EMAIL PROTECTED]>

On Mon, 17 Jun 2002 [EMAIL PROTECTED] wrote:

> On Sun, Jun 16, 2002 at 03:08:00PM -0400, Hank Leininger wrote:
[snip]
> > Could you guys disable the X-No-Archive setting for the lyx lists?
>
> I'll address this tomorrow.  What I do for the existing archiving at
> mail-archive.com is I set up a sublist which sends mail only to the
> webarchive, and I remove the header X-No-Archive only for the sublist.

Ah, cool. OK. Currently our archives are subscribed as 'lyx-*@progressive-comp.com' (should be the only @progressive-comp.com address on each list). I guess you could move those subscriptions to the non-header'ed sublist?

> What do I need to do to have the sublists send mail to your archive?
> Do you guys do something against email and other address collector
> software?  Do you hide addresses?

Good question. We don't do a whole lot.
Since most of our list archives are unofficial, we've never done any address-munging--for one, people could then complain that they were being quoted w/o proper attribution--and sue for copyright violations. (This has actually happened, though not to us.) For list-owners I think they can safely set whatever policy they want--by joining and posting to a list a participant agrees to explicit and implicit list terms. But for third parties it's stickier :-P

Also, MARC currently knows about >890,000 authors (from about 9.2 million emails). Since we do make authors searchable (no wildcards though), and trackable (click on an author and see whatever else they've posted that we've got), chopping addresses before the @ would cause lots and lots of collisions and therefore misattributions (the John Wilson problem[1]).

But anyway, as for what we do do: the interface tries to be harvester-unfriendly. Any view which lists multiple people (lists of messages, lists of threads, lists of authors, lists of search results, etc) does not display full addresses, only the comment, lhs, whatever. Individual messages have the full sender (and any email addresses within the body are unmolested), though. However, message IDs and author IDs are not sequential[2], so one can't simply iterate through ++'ing each time harvesting addresses. We also monitor request volume/rate per source, and throttle (or simply block) people running unfriendly robots (this is not yet as automated as it should be).

Sure, you could write something a bit more intelligent which walks the site correctly and slowly enough, but at that point I think there are far, far easier ways to harvest addresses elsewhere. If that did start to happen it should still be a noticeable blip in the monthly usage stats, so it'd be apparent this was no longer enough.

Of course, what brought this all up is: we honor X-No-Archive's, so if an individual chooses to use them, we'll silently skip over their messages.
Until someone replies and quotes them and includes their entire mail including .sig without any trimming :(

Let me know if that makes you feel reasonably comfortable; I'd understand if it didn't.

Thanks,

Hank Leininger <[EMAIL PROTECTED]>
E407 AEF4 761E D39C D401 D4F4 22F8 EF11 861A A6F1

[1] http://marc.theaimsgroup.com/?l=cryptography&m=102269473203419&w=2

[2] With the exception of bulk-added mails; message IDs are sequential for batches of mail inserted together :( Author numbers are still not, however. I plan to move to md5sum-derived external keys for most things, which will address this along with many other things. But no promises, or even wild guesses about when that will happen.

----- End forwarded message -----

--
---
Mate Wierdl | Dept. of Math. Sciences | University of Memphis
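
For readers who want to see the two mechanisms from the forwarded message in concrete form, here is a minimal sketch in Python: skipping messages that carry an X-No-Archive header, and deriving non-sequential keys with MD5. This is not MARC's actual code. The function names, the decision to treat only the value "yes" as an opt-out, and the choice of hashing the Message-ID are all assumptions made for illustration.

```python
import hashlib
from email import message_from_string


def should_archive(raw_message: str) -> bool:
    """Return False when the message asks not to be archived.

    Assumption: only a header value of "yes" (any case) counts as an opt-out.
    """
    msg = message_from_string(raw_message)
    # Header lookup in the email package is case-insensitive.
    value = msg.get("X-No-Archive", "")
    return str(value).strip().lower() != "yes"


def external_key(message_id: str) -> str:
    """Derive a non-sequential key from a Message-ID.

    Assumption: the Message-ID is the hashed input; the forwarded message
    only says the keys would be "md5sum-derived".
    """
    return hashlib.md5(message_id.encode("utf-8")).hexdigest()
```

As the forwarded message already notes, a header check like this cannot help once someone else quotes an opted-out message in full, because the reply carries its own headers.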