Hi Dave,

Thanks for the explanation; that clears up a lot. Even after I disabled all the tests in 50_scores.cf, I was still seeing high CPU usage (this is a very heavily loaded server). This pretty much explains everything.
You guys have been a great help. One last thing: is there a way, or a hack, to disable rules under /var/lib/spamassassin/3.003002/updates_spamassassin_org such that the change survives sa-update runs? Is there a patch in the wild that allows me to do that?

On Tue, Sep 17, 2013 at 12:57 AM, Dave Funk <dbf...@engineering.uiowa.edu> wrote:
> That's because SA no longer ships with rules in the source kit.
> First thing you do after a new install is run sa-update to download
> a set of rules and those go into a separate directory.
>
> Do this:
>
> spamassassin --lint -D 2>&1 | grep dir
>
> And you should see things like:
>
> Sep 16 14:21:39.457 [27354] dbg: config: using "/var/lib/spamassassin/3.003001" for default rules dir
> Sep 16 14:21:39.460 [27354] dbg: config: using "/etc/mail/spamassassin" for site rules dir
>
> That will tell you what directory trees contain the rules files
> that -your- SA kit is using.
>
> Now if you do:
>
> spamassassin --lint -D 2>&1 | grep 'config:'
>
> it will tell you all the rules files that it's processing
> (and a bunch of other stuff too).
>
> On Mon, 16 Sep 2013, Abhijeet Rastogi wrote:
>
>> Hi John,
>>
>> Did a
>>
>> $ grep -inr __HAS_SENDER ./
>>
>> in the source. No hits whatsoever.
>>
>> On Mon, Sep 16, 2013 at 11:37 PM, John Hardin <jhar...@impsec.org> wrote:
>>>
>>> On Mon, 16 Sep 2013, Abhijeet Rastogi wrote:
>>>
>>>> Hi John,
>>>>
>>>> I'm sure you're pretty clear on explaining it but as a newbie I'm
>>>> facing issues. My concern still exists.
>>>> I could see log lines like:
>>>>
>>>> Sep 16 14:41:51.607 [3999] dbg: rules: ran header rule __HAS_SENDER ======> got hit: "<YES>"
>>>> Sep 16 14:41:51.606 [3999] dbg: rules: ran header rule __HAS_TO ======> got hit: "<YES>"
>>>> Sep 16 14:41:51.605 [3999] dbg: rules: ran header rule __HAS_ERRORS_TO ======> got hit: "<YES>"
>>>> Sep 16 14:41:51.604 [3999] dbg: rules: ran header rule __HAS_XMAIL ======> got hit: "<YES>"
>>>>
>>>> But I don't see these defined anywhere (grepped for them against the
>>>> 3.3 version from svn). Also, in the install (CentOS5), I couldn't
>>>> find them either (checked both /usr/share/spamassassin and
>>>> /etc/mail/spamassassin). So, what's the deal with these, where are
>>>> these defined?
>>>
>>> Did you search subdirectories as well? sa-update updates rules into
>>> subdirectories.
>>>
>>>> I would really appreciate a reply here. Thanks
>>>>
>>>> On Mon, Sep 16, 2013 at 10:46 PM, John Hardin <jhar...@impsec.org>
>>>> wrote:
>>>>>
>>>>> On Mon, 16 Sep 2013, Abhijeet Rastogi wrote:
>>>>>
>>>>>> Hi John,
>>>>>>
>>>>>> Thanks for the reply. I found the above-mentioned rule as a "meta"
>>>>>> one. Thanks for that.
>>>>>
>>>>> Apologies if that came across as condescending, I was just trying to
>>>>> be thorough.
>>>>>
>>>>>> One more thing I was hoping you could help me with.
>>>>>>
>>>>>> Can you explain the difference between rules under the "./rules"
>>>>>> and "./rulesrc/sandbox/" directories?
>>>>>
>>>>> The "rules" directory holds rules that are published no matter what.
>>>>> That's the "static" ruleset, the base rules that everything can
>>>>> depend on to be present, and rules that have always performed well.
>>>>>
>>>>> The stuff under the sandbox directories is more dynamic. The rules
>>>>> there are run through the nightly masscheck process, and if they
>>>>> perform well enough they get published.
>>>>> They're more "dynamic", in that older rules that stop performing
>>>>> well against current spam (as represented by the masscheck corpora)
>>>>> may stop being published, and may automatically start being
>>>>> published again if the corpora start containing that type of spam
>>>>> again.
>>>>>
>>>>>> The reason I want to know this is that I have a requirement where I
>>>>>> want to disable everything (meaning *all* rules) except a locally
>>>>>> hosted URIBL. I was hoping that I could do this by adding the
>>>>>> output of the command below (run in the source tree).
>>>>>>
>>>>>> cat rules/*.cf | grep -E '^(header|body)' | awk '{print $2}' | sed 's/^/score /' | sed 's/$/ 0/'
>>>>>>
>>>>>> But, to my surprise, it didn't help. Various checks were still
>>>>>> getting applied, like __HAS_TO, __HAS_ERRORS_TO, etc. Any idea as
>>>>>> to what can be done about that?
>>>>>
>>>>> As __subrules don't have a score, their execution cannot be disabled
>>>>> by setting their score to zero.
>>>>>
>>>>> If you want to override the default behavior of SA to that degree,
>>>>> it's easier to change the config directory that SpamAssassin and
>>>>> spamd use with the -c option, so that none of the base rule files
>>>>> are read in the first place. You'll need to provide some minimal set
>>>>> of config files in whatever custom config directory you specify in
>>>>> order to get SA to run, but that will avoid all the extra default
>>>>> stuff you don't seem to want.
>>>>>
>>>>> I've never customized SA to that degree so there may be some
>>>>> pitfalls in this that I'm not aware of - somebody else will probably
>>>>> say something if there's other stuff you need to be aware of.
>>>>>
>>>>>> On Mon, Sep 16, 2013 at 10:07 PM, John Hardin <jhar...@impsec.org>
>>>>>> wrote:
>>>>>>>
>>>>>>> On Mon, 16 Sep 2013, Abhijeet Rastogi wrote:
>>>>>>>
>>>>>>>> Problem is, how do I know that a certain rule like
>>>>>>>> __RCVD_IN_NJABL is a base rule for others?
>>>>>>>
>>>>>>> The two leading underscores in the rule name indicate that the
>>>>>>> rule, by itself, is not assigned a score, thus, by itself, does
>>>>>>> not affect the overall score of the message at all. It must appear
>>>>>>> in a "meta" rule, possibly with other rules, before it can be
>>>>>>> assigned a score and affect the overall message score. So, at the
>>>>>>> most basic level, any rule having a name that starts with two
>>>>>>> underscores is _inherently_ a base for other rules.
>>>>>>>
>>>>>>> In order to determine *which* rules it's a base for, you have to
>>>>>>> look for that rule name in the config files. This isn't too easy
>>>>>>> to do online, you pretty much have to grep the rules files in a
>>>>>>> local install.
>>>
>>> --
>>> John Hardin KA7OHZ http://www.impsec.org/~jhardin/
>>> jhar...@impsec.org FALaholic #11174 pgpk -a jhar...@impsec.org
>>> key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
>>> -----------------------------------------------------------------------
>>> WSJ on the Financial Stimulus package: "...today there are 700,000
>>> fewer jobs than [the administration] predicted we would have if we
>>> had done nothing at all."
>>> -----------------------------------------------------------------------
>>> Tomorrow: the 226th anniversary of the signing of the U.S. Constitution
>
> --
> Dave Funk University of Iowa
> <dbfunk (at) engineering.uiowa.edu> College of Engineering
> 319/335-5751 FAX: 319/384-0549 1256 Seamans Center
> Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527
> #include <std_disclaimer.h>
> Better is not better, 'standard' is better. B{

--
Regards,
Abhijeet Rastogi (shadyabhi)
http://blog.abhijeetr.com
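
For reference, here is a minimal Python sketch of the score-zeroing pipeline quoted earlier in the thread, extended to set the __subrules aside, since (as John explains) giving them a zero score does not stop them from running. It only mirrors the quoted grep filter for header and body lines; the function names and the sample rule lines are made up for illustration and are not part of SpamAssassin.

```python
import re

# Mirrors the grep -E '^(header|body)' filter from the quoted pipeline:
# a rule type keyword, whitespace, then the rule name.
RULE_LINE = re.compile(r"^(header|body)\s+(\S+)")


def split_rules(lines):
    """Return (scored, subrules): rule names found in .cf-style lines.

    Names starting with two underscores are treated as subrules, which
    carry no score of their own.
    """
    scored, subrules = [], []
    for line in lines:
        match = RULE_LINE.match(line)
        if not match:
            continue  # comments, describe/score lines, blank lines, etc.
        name = match.group(2)
        target = subrules if name.startswith("__") else scored
        if name not in target:  # keep first occurrence, preserve order
            target.append(name)
    return scored, subrules


def zero_score_lines(names):
    """Build 'score NAME 0' config lines for the given rule names."""
    return [f"score {name} 0" for name in names]


if __name__ == "__main__":
    # Hypothetical sample lines, not copied from any shipped rules file.
    sample = [
        "# a comment",
        "header __LOCAL_HAS_TO exists:To",
        "body LOCAL_EXAMPLE_BODY /example phrase/",
        "header LOCAL_EXAMPLE_SUBJ Subject =~ /example/",
    ]
    scored, subrules = split_rules(sample)
    print("\n".join(zero_score_lines(scored)))
    print("subrules left untouched:", ", ".join(subrules))
```

To use this against real files, read the lines from the rules directories reported by spamassassin --lint -D and pass them to split_rules. The subrule list then shows which checks would keep running regardless of the generated score lines.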