Larry Nedry writes:
> Hi All,
> 
> I'm trying to use mass-check to test the accuracy of a plugin that I'm
> developing.  If I run mass-check without the -j option (single process) it
> takes a few hours for it to finish a corpus of about 60,000 emails.  If I
> use the --net option it could take a day or two to complete.  Of course if I run
> it with the -j option it is much faster but almost always mass-check will
> hang at a seemingly random place.  I've seen it hang at less than 5%
> complete and a few times it got as far as 98% complete.  And it doesn't
> matter if -j=2 or -j=48, it still hangs.
> 
> Once it hangs I can let it sit for hours without seeing any network, disk
> or CPU activity.  I still have plenty of free memory so swapping is not the
> issue.
> 
> Are others running into this problem?  Is this a bug in mass-check?  Is
> there a newer (fixed) version that will work with SA 3.1.18?  Or am I
> missing something important?
> 
> My setup:
>     Mac Pro Quad Xeon 3.0 GHz
>     Fedora Core 4 or Mac OS X 10.4.8 (same results)
>     5 GB RAM
>     SpamAssassin 3.1.18
> 
> Directory layout:
>     SA3.1.18/rules/
>     SA3.1.18/masses/
>     SA3.1.18/masses/ham/      (corpora)
>     SA3.1.18/masses/spam/     (corpora)
> 
> My Command line:
> # ./mass-check --progress --noisy -c=../rules spam:mbox:./spam ham:mbox:./ham

Could you try adding --restart=1000?
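
Something like this, say (an untested sketch: it's just your command line
from above with the two options added, and -j=4 is only an example value):

    ./mass-check --progress --noisy -j=4 --restart=1000 -c=../rules spam:mbox:./spam ham:mbox:./ham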

Also, could you try with the rc1 of SpamAssassin 3.2.0, or SVN trunk?
I think Theo fixed bugs in this code.

> I've seen the same problem running under both Fedora Core 4 and Mac OS X
> 10.4.8.
> 
> I'm currently using just the default rules that are in the ../rules folder.

It might be worth trimming this down to see if it can be reproduced with
a smaller ruleset -- it'd run faster at least ;)

> What is the purpose of the mass_prefs file?
> Am I supposed to edit the mass-check.cf file?

The mass_prefs file is the equivalent of the "user_prefs" file when
you're running SpamAssassin normally -- so you can do stuff like add
"use_bayes 0" or "trusted_networks 1.2.3/24" for local configuration,
or to turn off stuff you don't need.
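
For example, a minimal mass_prefs with just those two settings might look
like this (only an illustration; the network is a placeholder, so substitute
whatever matches your own setup):

    # skip the Bayes classifier during the mass-check run
    use_bayes 0
    # placeholder network; replace with your own trusted relays
    trusted_networks 1.2.3/24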

--j.
