Larry Nedry writes:
> Hi All,
>
> I'm trying to use mass-check to test the accuracy of a plugin that I'm
> developing. If I run mass-check without the -j option (single process) it
> takes a few hours for it to finish a corpus of about 60,000 emails. If I
> use the --net option it could take a day or two to complete. Of course if I run
> it with the -j option it is much faster but almost always mass-check will
> hang at a seemingly random place. I've seen it hang at less than 5%
> complete and a few times it got as far as 98% complete. And it doesn't
> matter if -j=2 or -j=48, it still hangs.
>
> Once it hangs I can let it sit for hours without seeing any network, disk
> or CPU activity. I still have plenty of free memory so swapping is not the
> issue.
>
> Are others running into this problem? Is this a bug in mass-check? Is
> there a newer (fixed) version that will work with SA 3.1.18? Or am I
> missing something important?
>
> My setup:
> Mac Pro Quad Xeon 3.0 GHz
> Fedora Core 4 or Mac OS X 10.4.8 (same results)
> 5 GB RAM
> SpamAssassin 3.1.18
>
> Directory layout:
> SA3.1.18/rules/
> SA3.1.18/masses/
> SA3.1.18/masses/ham/ (corpora)
> SA3.1.18/masses/spam/ (corpora)
>
> My Command line:
> # ./mass-check --progress --noisy -c=../rules spam:mbox:./spam ham:mbox:./ham

Could you try adding --restart=1000? Also, could you try with rc1 of
SpamAssassin 3.2.0, or SVN trunk? I think Theo fixed bugs in this code.

> I've seen the same problem running under both Fedora Core 4 and Mac OS X
> 10.4.8.
>
> I'm currently using just the default rules that are in the ../rules folder.

It might be worth trimming this down to see if it can be reproduced with
a smaller ruleset -- it'd run faster at least ;)

> What is the purpose of the mass_prefs file?
> Am I supposed to edit the mass-check.cf file?

This is equivalent to the "user_prefs" file when you're running
SpamAssassin normally -- so you can do stuff like add "use_bayes 0" or
"trusted_networks 1.2.3/24" for local configuration or turning off stuff
you don't need.

--j.
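
For illustration, a mass_prefs along those lines might look roughly like the
fragment below. It is only a sketch: the two settings are the ones mentioned
above, the network value is the placeholder from the example rather than a
recommendation, and the comments describe the usual meaning of these options
in SpamAssassin configuration.

```
# mass_prefs -- read the way user_prefs would be in a normal run
# (illustrative values only; adjust for your own corpus and network)

# Turn off the Bayes classifier if you don't need it for this run
use_bayes 0

# Relays in this range are treated as trusted (placeholder network)
trusted_networks 1.2.3/24
```

Anything else you would normally put in user_prefs to switch off checks you
don't need should be able to go in the same file.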