Finally got the kinks worked out of my SA-3.1 setup last week. It filtered
out over 420 spams -- maybe 1 false positive, and that one was borderline.
sa-learn has gotten slower, but that may be unavoidable. At least
I'm finally getting spam recognition as good as or better than I had in 2.63.
I don't have any online tests enabled, as the online test databases are
going the way of "cddb"...becoming privatized. Sorta sad...maybe time to
start a "freezor" or some similar service. I mean, the spam services
collect data about what is spam from the users who use the database. Without
the users, they wouldn't be nearly as effective. Yet those users are then
encouraged to pay to access the body of data that was previously donated
for free.
I suppose one could look at the cost of "aggregation" and intelligent
processing of 1000's of user-spam inputs into a usable output format,
and while it might be manageable for a small community of users, it's
not so manageable if the database starts being used by a much larger
user-base than the original system was designed to run on.
Still -- I have yet to look at what is needed to convert my "db"s into
SQL form. Been sorta busy: my car got crashed into last week and I
was told this week that it's totalled. On top of that, I was informed
Tuesday that I need a root canal, and on Wednesday that I need a 2nd
root canal & oral surgery. *smile* Life is just so _*!%fun!*%)_.
So I am a bit behind on my ->SQL conversion. (I'm assuming I'm in an
older format; I just ran the convert tool to convert from 2.x format
to 3.x.)
Assuming it is some sort of Berkeley DB format, what is a good
cut-over size as a "rule-of-thumb"...or is there one? What should I
expect in speeds for "sa-learn" or spamc? I.e. -- is there a
rough guideline for when it becomes more effective to use SQL
vs. the Berkeley DB? Or rephrased, when is it worth the effort to
convert to SQL and ensure all the SQL software is set up and running?
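
To put actual numbers on the speed question, here is a rough sketch of
how one could time a command such as sa-learn before and after a
change. The function name time_command and the corpus path in the
comment are made up for illustration; only the "--spam" option is a real
sa-learn flag.

```python
import subprocess
import sys
import time


def time_command(argv, repeats=1):
    """Run argv `repeats` times and return the wall-clock seconds of each run."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        # check=True raises if the command exits non-zero, so a failed run
        # is never silently recorded as a timing.
        subprocess.run(argv, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example (hypothetical path):
    #   python time_cmd.py sa-learn --spam /path/to/spam-folder
    runs = time_command(sys.argv[1:])
    print("min %.2fs  max %.2fs" % (min(runs), max(runs)))
```

The default is a single run on purpose: feeding sa-learn the same
messages a second time is probably not comparable to the first pass, so
a fresh batch of mail per measurement seems the fairer test.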
Thanks...and thanks for the help/patience
BTW -- maybe this should go to the "sa-dev" list, but an RFE:
"spamassassin --lint":
1) Would be nice to mention whether the daemon is _RUNNING_ and ready
to process messages. (User error: forgetting to restart the daemon and
seeing no "--lint" message hinting that the daemon isn't running and
ready to process incoming mail--*duh*)
2) Would be nice, especially in "--lint", to check for bogus
lock files left around in the spam DB dir. I don't know when these files
are used, but their presence really slows sa-learn down, by a factor
of about 4-6.
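
Until something like item 1 exists, a stopgap is to check from outside
whether anything is accepting connections where spamd should be. A
minimal sketch, assuming spamd listens on TCP at 127.0.0.1 on its
default port 783 (it won't help if the daemon is set up on a Unix
socket); the function name spamd_is_listening is my own:

```python
import socket


def spamd_is_listening(host="127.0.0.1", port=783, timeout=2.0):
    """Return True if something accepts TCP connections on host:port.

    This only proves that a listener exists, not that it is spamd or that
    it is healthy enough to scan mail.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    if spamd_is_listening():
        print("something is listening on port 783")
    else:
        print("nothing is listening on port 783 -- daemon not started?")
```

Run right after "spamassassin --lint", this would at least have caught
my forgot-to-restart mistake.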
"sa-learn":
1) RFE: have sa-learn issue a warning about pre-existing lock files,
or, better, auto-remove bogus locks held by processes that no longer exist.
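
To illustrate what I mean by a "bogus" lock, here is a rough sketch of
the check, done outside of SA. It rests on assumptions I have not
verified against the SA source: that the lock files match "*.lock*" in
the DB dir, and that the owner is recorded as "<hostname>.<pid>" either
in the file name or in the file contents. The DB dir (~/.spamassassin)
is also just my guess at the usual location, and the helper names are
mine. It only reports; it doesn't delete anything.

```python
import os
import re
from pathlib import Path


def pid_is_alive(pid):
    """Return True if a process with this PID exists on this machine (POSIX)."""
    try:
        os.kill(pid, 0)  # signal 0 only tests for existence, nothing is sent
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def guess_pid(lock_path):
    """Guess the owning PID from a trailing '.<digits>' in the name or contents.

    ASSUMPTION: the owner is recorded as '<hostname>.<pid>'; check your files.
    """
    candidates = [lock_path.name]
    try:
        candidates.append(lock_path.read_text(errors="replace").strip())
    except OSError:
        pass  # unreadable file: fall back to the name only
    for text in candidates:
        match = re.search(r"\.(\d+)$", text)
        if match:
            return int(match.group(1))
    return None


def find_stale_locks(db_dir):
    """Return lock files in db_dir whose recorded PID no longer exists."""
    stale = []
    for path in sorted(Path(db_dir).glob("*.lock*")):
        pid = guess_pid(path)
        if pid is not None and not pid_is_alive(pid):
            stale.append(path)
    return stale


if __name__ == "__main__":
    for path in find_stale_locks(Path.home() / ".spamassassin"):
        print("possibly stale:", path)
```

The PID test only means something if the lock was created on the same
host, so on an NFS-shared DB dir the hostname part would need checking
too.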