Hey Justin, Fuzzy Fox suggested a similar route. The Bayes token is a great possibility. The tokens in this case would be time rather than words.
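To make the "time as a token" idea concrete, here is a minimal sketch of how an arrival time could be reduced to a coarse token. The function name `time_token` and the `TIME:<day type>:<hour>` format are made up for illustration; this is not how the SpamAssassin Bayes tokenizer works today. It also assumes the timestamp comes from a trusted source (for example, the local server's own receipt time) and not from the sender-controlled Date header.

```python
from datetime import datetime


def time_token(received: datetime) -> str:
    """Reduce an arrival time to a coarse token (hypothetical format).

    Weekdays and weekends get separate tokens, since their mail flow
    may differ; the hour of day (00-23) is the time bucket.
    """
    day_type = "weekend" if received.weekday() >= 5 else "weekday"
    return f"TIME:{day_type}:{received.hour:02d}"
```

A classifier would then see such a token alongside the ordinary word tokens of the message and learn its spam/ham weighting like any other token.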
One way to accomplish this is simply to add local.cf rules that score during a specific time interval. The administrator could then adjust both the time interval and the score. The only thing the administrator needs to know is their spam/ham flow over time.

However, if one tries to automate this, the thresholds need to be derived automatically. So even if Bayes learns the time, a filtering engine still needs to be able to analyze the spam/ham flow over time. A minimum observation period needs to be met before thresholds can be derived; I would venture a guess that the minimum needs to be a week. One might also need to learn separate distributions, such as hours for Mon-Fri and hours for Sat-Sun.

It is easy for a human administrator to look at the graph (http://www.gryzor.com/tools/spamstats-pics.html), make assumptions, and just enable this test, set the time interval, and override the default score. However, to make this automatic one needs a means of describing the spam bandwidth (variation in message numbers over a 24-hour period), the ham bandwidth, the average high spam count to create an upper threshold, the average low spam count to create a lower threshold, etc. What I am saying is that one really needs to describe the ham/spam flow.

This quickly becomes a signal analysis problem. Why? Because if we don't describe the signals (ham/spam flow over time), we can quickly run into false positives. This is especially true for organizations whose ham flow closely resembles spam flow. A global organization that receives ham continuously throughout the day would have mail flow that looks very different from the provided graph.

Does anyone agree with this or am I out to lunch? Please don't get me wrong. I think that if this can be done it would really be great! I would really love to see this done automatically, and Bayes tokens might just be the way to do this.
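As a rough sketch of what "describing the flow" could look like, the following builds an average per-hour profile from message timestamps, separately for weekdays and weekends, and derives lower/upper thresholds and a bandwidth from it. Everything here is an assumption for illustration: the names `hourly_profile` and `thresholds`, the one-week minimum (my guess above), and the choice to average the six quietest and six busiest hours. Days with no mail at all are not counted, which a real implementation would need to handle.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean

MIN_SPAN = timedelta(days=7)  # assumed minimum observation period


def day_type(ts):
    return "weekend" if ts.weekday() >= 5 else "weekday"


def hourly_profile(timestamps):
    """Average messages per hour of day, split by weekday/weekend.

    Returns {(day_type, hour): mean count per observed day}, or None
    when the data covers less than MIN_SPAN.
    """
    if not timestamps or max(timestamps) - min(timestamps) < MIN_SPAN:
        return None
    counts = defaultdict(int)  # (day_type, date, hour) -> message count
    days = defaultdict(set)    # day_type -> dates that had any mail
    for ts in timestamps:
        counts[(day_type(ts), ts.date(), ts.hour)] += 1
        days[day_type(ts)].add(ts.date())
    profile = {}
    for kind, dates in days.items():
        for hour in range(24):
            total = sum(counts.get((kind, d, hour), 0) for d in dates)
            profile[(kind, hour)] = total / len(dates)
    return profile


def thresholds(profile, kind):
    """Return (lower, upper, bandwidth) for one day type.

    lower/upper: mean rate of the 6 quietest / 6 busiest hours.
    bandwidth: spread between the busiest and quietest single hour.
    """
    rates = sorted(profile[(kind, h)] for h in range(24))
    return mean(rates[:6]), mean(rates[-6:]), rates[-1] - rates[0]
```

One would run this once over the spam timestamps and once over the ham timestamps; if the two profiles look alike, as they might for a global organization, a time-based test is likely to produce false positives and should probably stay disabled.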
I think the graph provided is probably more true for organizations (and maybe most individuals) than not. However, there is no guarantee, as there is not enough data to support any conclusion. I do believe that time would be a great indicator and that the proper implementation is crucial for success. If you decide to pursue this I would love to help! I remember some of my signal analysis classes, but my associated math knowledge has waned over time. Still, great problem and potentially a great test.

--Larry

> -----Original Message-----
> From: [EMAIL PROTECTED]
>
> Larry Gilson writes:
> >I believe there is a big problem awaiting those who
> >generalize one graph. This graph will most definitely change from
> >organization to organization. The pattern, however may be
> >similar. This poses a huge problem when trying to develop a
> >general rule for the SA community.
>
> It may get good results if made into a token for the Bayes scanner.

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk