Hi,

> > For the perl version, spamd+spamc solution (i would call it a messy
> > hack) is a workaround for perl's 'booting/startup' overload.
> It's really not so messy of a hack, and it's designed for a couple
> purposes:
>
> 1. work around perl's 'booting/startup' overload (though this would be
> more easily done with dump on platforms supporting that)

i've heard something about a perl->elf compiler, but i don't know how it works.
do you mean that, or something else?

> 2. Reduce slow I/O on the machine by reading global config files just
> once from disk, then forking. This relies on the OS doing copy-on-write
> stuff for the memory pages in the forked process, but most OSes these
> days do that. Otherwise you probably lose the I/O advantage when you
> copy the process' memory space on fork.

is it worth it, for the price of sending mails through the local network (I/O)?
(for parsing/compiling regexps and config files i agree, but i eliminated
that by preprocessing and compiling them into the C binary)

> 3. Allow for far greater loads than will fit on a single mail processing
> machine (regardless of how many CPUs you cram in your starfire box) by
> enabling the processing load to be spread around a network. The network

ok, you're right here, i must agree. i didn't think of multiple machines...

> I/O overhead is not all that significant, and if you're running
> spamc/spamd on the same machine, communicating over the local loopback
> TCP interface, your OS is responsible for making sure that's done
> efficiently. If it's much slower than using a shell pipe, you need a
> new OS. :)
> > I don't even plan to implement 100% compatible alternative, i'll probably
> > leave some checks out. At least now i won't implement all that eval's in C
> > and will leave all network tests (they can be done by the MTA if needed).
>
> Well, they can't really be done by the MTA in most cases, unless you
> have a really fancy MTA. The network checks are not done in most cases
> against the envelope contents (which is what MTAs normally check), but

could you explain this in more detail?
(or point me to the RTFM, i couldn't find the doc/ dir mentioned in README)

> against the email header information. Also, they do things like razor
> checking, etc. which MTAs don't normally do. I'm not saying any of this

but procmail can... (assuming there is a razor client somewhere)

> is critical, and I run spamd -L anyway.

me too.

anyway, i have another idea, "developed" some time ago for virus checking.
the main point is checking mails against the databases (razor/virus scanner)
at the time of _downloading_ the mail (pop3), instead of when it arrives.
when a mail arrives (this is especially true for viruses) it's mostly unknown
to the databases, but a few hours/days later, when the user downloads it,
it's already listed in the database and can be caught. it has many
limitations on the pop3 side though, and needs a special pop3 daemon (one
that, the first time it sees new mails, checks them and moves them to a new
folder which the user can download from).

> > Anyway I have some idea to speed up this thingie even more.
> > For example, doing the regexp checks at 2-3 passes. First the cheap (fast)
> > ones, and the big (high score) ones. Then if score <=0 i'll stop and return
> > NO_SPAM. If there was any positive score at first pass, i'll continue with
> > expensive (slower) checks. It will probably lower check quality a little bit
> > (more possible false report) but increase performance a lot (n times).
>
> This is probably not a bad idea. It would be useful to share work here
> probably and backport this to the perl versions.

agree. anyway it will take a long time to fine-tune it to get a good enough
balance between performance and quality/efficiency.

> > For this i have to do some statistics analyzing of negative/positive hits on
> > a big enough spam and nospam collection. It will also slow the less used
> > checks, they may be left out, or at least moved to 2nd/3rd pass...
>
> You should have the corpus access instructions now :)

yes, i'm just downloading it, thanks.
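
A minimal sketch of the multi-pass idea quoted above, written in Python rather than C for brevity. Everything specific in it is an assumption made up for illustration: the rule names, the patterns, the scores, the split into first-pass and later rules, and the 5.0 threshold. None of it is taken from SpamAssassin's real rule set or from the C implementation.

```python
import re
from typing import NamedTuple


class Rule(NamedTuple):
    name: str            # hypothetical rule name
    pattern: re.Pattern  # compiled once, up front
    score: float         # positive = spam sign, negative = non-spam sign
    first_pass: bool     # True for cheap and/or high-score rules


# Hypothetical rules, for illustration only.
RULES = [
    Rule("FREE_MONEY", re.compile(r"free money", re.I), 3.0, True),
    Rule("IN_REPLY_TO", re.compile(r"^In-Reply-To:", re.I | re.M), -1.0, True),
    Rule("OBFUSCATED_DRUG",
         re.compile(r"v[\W_]*[i1!][\W_]*a[\W_]*g[\W_]*r[\W_]*a", re.I), 4.0, False),
    Rule("SHOUTING", re.compile(r"!{3,}"), 1.0, False),
]

SPAM_THRESHOLD = 5.0  # assumed threshold


def score_rules(message, rules):
    """Sum the scores of all rules whose pattern matches the message."""
    return sum(r.score for r in rules if r.pattern.search(message))


def classify(message, rules=RULES, threshold=SPAM_THRESHOLD):
    """Return (verdict, score, passes_run) using a two-pass check."""
    first = [r for r in rules if r.first_pass]
    rest = [r for r in rules if not r.first_pass]

    # Pass 1: cheap and high-score rules only.
    score = float(score_rules(message, first))
    if score <= 0:
        # Nothing suspicious so far: stop early and skip the slow rules.
        return ("NO_SPAM", score, 1)

    # Pass 2: the remaining, more expensive rules.
    score += score_rules(message, rest)
    verdict = "SPAM" if score >= threshold else "NO_SPAM"
    return (verdict, score, 2)
```

With these made-up rules, a message whose body is just "Cheap v1agra!!!" gets a first-pass score of 0 and is returned as NO_SPAM after one pass, although running all rules would give it 5.0. That is the quality-for-speed trade-off mentioned in the quoted text, and it is why the assignment of rules to passes would have to be tuned against a real corpus.
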
i don't need no-spam data, i have access to enough :)
anyway i'll make my spam collection public for you, maybe it's useful. (but i
have to clean it first, there are still a few non-spam mails (false hits) in
there.)

btw i've got the postmaster's mailbox, and after some filtering got ~3200 new
spam mails (they were bounced to postmaster due to removed/nonexistent user
accounts).

> > Developer of MPlayer, the Movie Player for Linux - http://www.MPlayerHQ.hu
> I love mplayer :)

thx :)

A'rpi / Astral & ESP-team

--
Developer of MPlayer, the Movie Player for Linux - http://www.MPlayerHQ.hu

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk