>surely, it makes no sense to blow up the database with already 100%
>classified samples - you don't even do that unconditionally with a
>hand-trained database (at least not forever, at the beginning it makes sense
>to get additional tokens)

I think you misunderstood my question. What I meant was: as I look through messages to judge whether they should have been learned or not, is one that Bayes rates as 90-100% likely spam probably one that is being skipped because it is already recognized? I am not asking why it isn't learning from a message like that. Or maybe I misunderstood your answer.

>but train every single message which is already classified as expected
>would lead to a lot of useless load. blows up the database and makes
>bayes-poisoning and the need to purge the whole database and start from
>scratch (with thanks to autotraining no available corpus) then
>autolearning on it's down does

Agreed. And I understand that is not how it is designed.

>the question of bayes-poisoning is not "if", it's "when and how often"
>and hence after 10 years experience i stopped that nonsense and keep a
>currently 120000 messages large corpus of eml-files (HAM AND SPAM)

I'm not arguing the pros and cons of whether one should use it. I only want to make it work, or better said, verify that it IS working. Then I can decide whether I want to keep using it. Right now, I've never seen it work, hence my strong suspicion that it is not working. One thing is for sure: it hasn't found a single spam or ham to auto-learn yet, which seems unlikely if it were functioning properly.

The output of "unavailable" is too ambiguous for me to devise a way to troubleshoot. But I'm not an expert with SA, hence the plea for assistance in seeing whether it is working. If auto-learn isn't working, my expectation is that auto-anything-else isn't working either: journal maintenance, etc.
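
For what it's worth, one rough way to check whether auto-learn has ever fired is to tally the autolearn= value across a pile of already-scanned messages. The sketch below is only that, a sketch: it assumes each message carries an X-Spam-Status header with an autolearn= field, and the function names (autolearn_status, tally_autolearn) are made up for illustration, not part of SA.

```python
import re
import sys
from collections import Counter
from email import message_from_string

# Matches "autolearn=<value>" but not "autolearn_force=<value>".
AUTOLEARN_RE = re.compile(r"autolearn=([A-Za-z_]+)")


def autolearn_status(raw_message):
    """Return the autolearn value from X-Spam-Status, or None if absent.

    Assumption: the header looks like
    "X-Spam-Status: Yes, score=... autolearn=unavailable ...".
    """
    msg = message_from_string(raw_message)
    header = msg.get("X-Spam-Status")
    if header is None:
        return None
    match = AUTOLEARN_RE.search(str(header))
    return match.group(1).lower() if match else None


def tally_autolearn(raw_messages):
    """Count how often each autolearn value occurs in the given messages."""
    counts = Counter()
    for raw in raw_messages:
        counts[autolearn_status(raw) or "missing"] += 1
    return counts


if __name__ == "__main__":
    # Usage: python tally_autolearn.py msg1.eml msg2.eml ...
    texts = []
    for path in sys.argv[1:]:
        with open(path, errors="replace") as handle:
            texts.append(handle.read())
    for value, count in tally_autolearn(texts).most_common():
        print(f"{value}: {count}")
```

If a few hundred messages show nothing but "unavailable" and never "ham" or "spam", that would at least back up the suspicion with numbers instead of a handful of spot checks.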