Eric Wong wrote:
> Bob Proulx <b...@proulx.com> wrote:
> > Eric Wong wrote:
> > > OK, so I'm following half the recommendations
> > >
> > > The ones I'm going against are:
> > >
> > > generic_nonmember_action=hold (I want Accept)
> > > default_member_moderation=yes (I want no)
> >
> > May I try to convince you otherwise? Because there are good reasons
> > for the recommended settings.
>
> Not unless the maximum delay can be minutes. In other words,
> similar to what greylisting gets without any human interaction.

The initial contact delay is the hill being defended? On a mailing
list that may have many interactions over time. You and I might be
discussing some topic. Say the topic of mailing list operations. :-)
We may send many messages back and forth on the mailing list. This
might go on for years and years over many topics. Each of those
happens fast and efficiently. And it is not the continuing problem of
spam to the mailing list that is a problem. That spam is okay. But it
is the very first initial contact email message delay that is the
showstopper? It's beyond the pale?

How about SMTP time greylisting? I would gather from this discussion
so far that SMTP greylisting, which is exactly the same and creates a
delay upon the initial contact, would be a showstopper too then?
Greylisting at SMTP time would also be beyond the pale?

I am sorry but IMNHO it is the day to day operations that are much
more important to optimize and make efficient. Because those are
things that happen repeatedly, day after day. One time startup costs
should not be too onerous, but may have some cost in order to have
benefit. Like greylisting. But it is the repeated operations that I
think should be targeted for optimization. And that is the normal day
to day use of the mailing lists without having them filled with spam.

> > > So, should I remove listhel...@gnu.org from moderators?
> > > I still want automated spam filters such as SpamAssassin, though.
> >
> > The listhelper anti-spam SpamAssassin et al cancel-bot depends upon
> > the hold actions. If messages do not get held then it has no ability
> > to filter spam. That's fundamental to how it works with Mailman.
>
> That's unfortunate. I'm not familiar with Mailman, but can't
> the MTA feed the message through spam filters before Mailman
> ever sees it?

It's interesting that you mention that. Because for years and years
the frontend anti-spam was poor. Very poor.
And this is not a reflection upon the current FSF staff who have
inherited the present situation. But that is the traditional
situation. For a very long time the frontend anti-spam has been very
poor. And therefore we have been implementing the anti-spam portion
mostly in the Mailman interface where it is possible for volunteers to
interact with the system.

There has been discussion of how to improve the frontend anti-spam.
At this time the systems are getting OS upgrades. Those are dearly
needed. And obviously a first step in the improvement of the system.
And there has been discussion about what needs to be done to improve
the frontend anti-spam. This is starting to happen. But it is still
going to take a while from now to be improved. As with many things
life and time is what keeps everything from happening all at once.

However given the flow of mail and spam there needs to be a way to
train the learning engines. As we just mentioned in the previous
emails in our thread. Right now Mailman provides a reasonably
convenient hook location to provide that training. One that is not as
easy to do without the mailing list manager. Improving the feedback
location in the flow of email is something to look at doing. But
there is a lot of associated work that needs to happen first before
working on that aspect of the problem.

> I use mlmmj for legacy mailing list subscribers, that just runs
> off cron with no synchronous relationship with the MTA at all.
> I have replay script which makes it incrementally read mail from
> public-inbox (git).

If we are going to start listing out mailing list management software
that is better than Mailman then we had better get comfortable. It's
a long list! I am not a fan of Mailman. Mailman presents a pretty
low threshold. I would start with Smartlist which is very capable and
scales well. Also I have long been a fan of the way ezmlm works, if
only it didn't require qmail.
And at one time I would have said that Enemies of Carlotta had
interesting features for a mailing list. For that matter I actually
like the venerable old Majordomo. One of the very active mailing
lists I interact with still to this day uses Majordomo for it!

But Mailman is an official GNU Project. There is a benefit to "eating
your own dogfood" as the saying goes. That and due to other reasons
the lists.gnu.org machine is likely to continue to run GNU Mailman
instead of other mailing list manager programs for a while to come.

> 100% agreed. I've been using an inotify + Maildir-based
> training system since 2008 or so spamc, even pre-public-inbox:
>
> https://public-inbox.org/dc-dlvr-spam-flow.html

I looked at the mail flow through the diagram and without having spent
a huge amount of time understanding it the flow looks similar to the
way other sites do this. As users read mail and determine that a
message is spam or non-spam they divert mail to different places and
based upon those places the learning engines are trained-on-error.
That's great! I do that too on the end user mailboxes of my non-GNU
systems.

But that isn't really applicable to the way a mailing list works.
Because a mailing list delivers (forwards) mail to other people. The
delivery of spam to other people's mailboxes is very bad. And it is
difficult to implement distributed training feedback from the
community. We can't undeliver a message that is spam after already
having delivered it.

> Spam gets trained upon removal from archives.

Your preferred system (AFAICT) is one of a centralized storage without
delivery. Because there is no delivery it does not deliver spam and
that spam can be removed "quietly behind the scenes" as it were. That
is what Google does with Gmail too. And others. However that is not
a mailing list. It's something different. It is more similar to a
web forum. Even if it is also different in many ways from a web
forum. It feels more similar than it is different.

If I am a subscriber to a mailing list and it passes along spam then I
will receive that spam. (Where I can filter it out on my end but that
is already too late to prevent the delivery of it.)

Many people would object to the centralized storage based system
because it is centralized and creates an environment where a cabal
could, 1984 style, remove historical messages and rewrite history.
Don't like what someone said? Simply remove that message from the
storage. Or without malice there is the possibility of technical
failure. A storage failure without backup would lose the entire
mailing list history. These problems are not possible in a
traditional mailing list as those historical messages already were
sent and became part of the historical record. And they were
distributed among all participants. Everyone has a copy.

> > > > The resulting process means that as a general statement project
> > > > mailing lists need no explicit maintenance. If you as a project
> > > > maintainer and also a maintainer of the mailing list do nothing then
> > > > everything happens as needed anyway. You are however free to be as
> > > > involved in the mailing lists as you want.
> > >
> > > So if I'm away and unable to administer dtas-...@nongnu.org, and
> > > generic_nonmember_action is "Hold"; does the "human team" at GNU
> > > will eventually accept postings in my absence?
> >
> > Yes. Eventually usually means a few hours.
>
> <snip> yikes, that seems like a lot of human labor :<

No. It's only a few minutes a day. While typing this message I
switched over to the other window and ran through the mail queues. It
took less than two minutes before I was done and flipped back to this
message. Everything was mostly caught up. There were only a dozen
messages needing review at this moment. Other listhelpers had been at
work. We interlace randomly. There was no heavy spam wave hitting
the system needing a custom rule written. Just the normal routine
activity. A couple of minutes.

Note that I am NOT clicking around in the Mailman web interface. I am
either in 'mutt' looking at mail from the moderation emails, or
running scripts which are doing things. There is no mouse activity
involved at all. That would definitely be tedious.

> > It is your mailing list and this is up to you. But people tend to be
> > very intolerant of spam on mailing lists.
>
> It depends on the quantity, I suppose. vger.kernel.org lets a
> few through and nobody seems to mind. (I'm just a subscriber
> on vger, not an admin)

And lists.gnu.org has infrequent spam slip through too. No system is
perfect. And there are human mistakes at times. Humans have a
non-zero bit-error-rate after all. Worse than the automation
actually.

> > For example if people receive their mail at Gmail or Yahoo or
> > wherever, and then spam to the mailing list is received at their
> > mailbox, and they push the Spam button, this teaches Google and Yahoo
> > and so forth that lists.gnu.org is a source of spam and may create
> > problems for normal mailing list delivery. This has been more of a
> > problem with Yahoo than most other places. Some spam is of course
> > inevitable but we try to keep it to a minimum. If it becomes a
> > problem then if not us volunteers then FSF admin will need to become
> > involved. Getting blacklisted due to spam is a pain to deal with.
>
> Yes, that is a problem. It's part of the reason public-inbox is
> slowly moving mailing lists into a "pull" subscriber model over
> NNTP/Atom/HTML (and maybe even POP3).

That's great! For public-inbox users. Which is not a mailing list.
Most of what has been said about non-delivery and central storage also
applies to web forums. And people who like web forums often say they
like them for all of the same reasons. However I personally really
hate using web forums. For some of the same reasons!

> > The only thing we really must insist upon is to discard spam and not
> > reject spam. Most spam uses forged from addresses.
> > Therefore rejecting spam ala Mailman Reject usually sends a
> > rejection message to an innocent 3rd party who then gets
> > "backscatter" spam. They validly report lists.gnu.org as a spam
> > source in that case and it gets us in trouble with the DNSBLs.
> > Therefore please do not Reject random spam messages.
>
> Right. One of my concerns with increased reliance on whitelisting
> is that spammers will start using whitelisted addresses themselves.
> SPF might discourage that, though.

It's somewhat of a scary potential avenue for abuse. One that has
only been infrequently targeted. But SPF, DKIM, and so forth help
with preventing the forgeries. Many sites do not use those however
and so forgeries of addresses at those sites can still be delivered.
I have been thinking of ways to defend this particular potential abuse
avenue on the mailing lists, because it prickles at me. Hopefully in
the arms race between user and abuser the user will win.

> Fwiw, vger.kernel.org just drops HTML, which seems to cut a lot
> of spam, too. They also do greylisting from what I can discern.

For you and me if the mail is HTML then I can drop it without any real
loss of signal. (How do you like that opinionated comment!) However
a LOT of other people believe just as strongly that they want to send
HTML mail. Just recently on the 'mutt' users mailing list there was a
netizen of long standing who started a discussion asking how they
could use mutt to send HTML email. I found the question rather
shocking! Who would be a mutt user but also be embracing HTML mail?
But so it is. And many freemail sites make it impossible to avoid
sending HTML mail.

Simply dropping HTML mail is not a practical solution, regardless of
how much I would wish the world would do so. For most of the mailing
lists we have Mailman convert the HTML to plain text and that seems to
be the acceptable compromise.

Bob