>> Hmm... that sounds like an idea which was brought on some
>> time ago (John was still the dev for ASSP at the time); that
>> is, set up some kind of TTL parameter for corpus files so
>> that the spamdb rebuild should check the file date/time and
>> if over the TTL (say "n" days) it should then delete the file.

> My thought is that the "TTL" would only be in effect for the purpose
> of keeping BlockReporting working (for however many days or
> weeks you wish the emails to be guaranteed resendable).
> After that time, the TTL is null and the files are game for
> replacement. I thought it a simple idea for working around
> the BlockReporting problem Thomas mentioned.

I see, but there's no need to store anything alongside the files; the
regular filesystem timestamp of each file will work just fine. Just
remove every file for which "(today - filetime) > TTL".

> On a low-to-medium traffic box, though, this would not be a
> problem. We already deal with bunches of identical
> messages from time-to-time (nothing new).

There may be a solution for that too. Assuming the spam and notspam
folders get cleaned up using the TTL, the files could be saved using
(e.g.) an MD5 hash (or the like) as the name, so that identical
messages won't be stored more than once. That may have some side
effects and may need some more thinking, though...

>> Bottom line; the bayes filter should work by /learning/ this
>> means that it should NOT discard the previous data, but
>> rather REFINE them from further data coming in; so maybe the
>> whole bayes approach used inside ASSP should be revised NOT
>> to deal just with the latest data but to learn/improve during time

> Just an idea, but how do you "NOT" discard data while keeping
> rebuild times low and maintaining free hard drive space
> (realistically)?

By using some kind of "digest" of the previous bases, stored in a more
compact format.
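
To make the TTL cleanup and the hash-as-filename ideas a bit more concrete, here is a rough sketch in Python (not ASSP's Perl, and not taken from the ASSP code). The function names, the ".eml" extension and the way the folder is passed in are all assumptions made up for illustration; the only things it relies on are the file's own modification time and an MD5 of the message bytes.

```python
import hashlib
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def purge_expired(folder, ttl_days, now=None):
    """Delete regular files in `folder` whose mtime is older than ttl_days.

    Hypothetical helper: uses only the filesystem timestamp, so nothing
    extra has to be stored along with the corpus files.
    """
    now = time.time() if now is None else now
    cutoff = now - ttl_days * SECONDS_PER_DAY
    removed = []
    for entry in Path(folder).iterdir():
        # (now - mtime) > TTL is the same test as mtime < cutoff
        if entry.is_file() and entry.stat().st_mtime < cutoff:
            entry.unlink()
            removed.append(entry.name)
    return sorted(removed)


def store_message(folder, raw_bytes):
    """Save a message under the MD5 of its content (assumed naming scheme).

    Byte-identical messages map to the same file name, so a second copy
    is not written again.
    """
    name = hashlib.md5(raw_bytes).hexdigest() + ".eml"
    path = Path(folder) / name
    if not path.exists():
        path.write_bytes(raw_bytes)
    return path
```

Two of the possible side effects show up right away in this sketch. A duplicate arriving later does not refresh the timestamp, so the TTL counts from the first copy. And hashing the raw bytes only collapses copies that are identical down to the headers; messages that differ in Received or Message-ID lines would still be stored separately unless only the body is hashed.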
_______________________________________________
Assp-test mailing list
Assp-test@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/assp-test