On Thu, 2002-10-24 at 14:36, Rich Wellner wrote:
> I was writing this note and put in that a histogram might be useful and that a
> couple people had already asked, so I decided to put one together instead of
> continuing to talk about it.
> 
> You can see a sample from my mail: http://wellner.org/histplot.png

After an email failure (MS Exchange variety) I told my email client to
stop pulling my inbox.  After getting no spamassassin (or debian-user)
emails in several days, I remembered that I didn't turn it back on. 
That's my excuse for not replying sooner. :-)

I'm working on a sitewide spamassassin gateway server, and I'm now at
the stage of gathering data for fine-tuning.  A histogram is one of
those things.  I've got a PHP imaging script which updates the
histogram, and a web-frontend to change parameters.  Check out
http://gateway1.oc.edu/spam

Here's my raw PHP script.  You can get the frontend
page by viewing the page source from the URL above.


---- BEGIN inline_image.php -----
<?php
#############################################################################################
function syslog_to_unix_time ($syslog_string) {
        $months = array("Jan" => 1, 
                        "Feb" => 2, 
                        "Mar" => 3, 
                        "Apr" => 4,
                        "May" => 5,
                        "Jun" => 6,
                        "Jul" => 7,
                        "Aug" => 8,
                        "Sep" => 9,
                        "Oct" => 10,
                        "Nov" => 11,
                        "Dec" => 12);
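        // Syslog pads single-digit days with an extra space ("Oct  4"),
        // so split on runs of whitespace rather than on a single space.
        $syslog_array = preg_split('/\s+/', trim($syslog_string));
        $month = $months[$syslog_array[0]];
        $day   = (int)$syslog_array[1];
        $syslog_array = explode(":", $syslog_array[2]);
        // Syslog timestamps carry no year, so the current year is assumed.
        return mktime((int)$syslog_array[0], (int)$syslog_array[1], (int)$syslog_array[2],
                      $month, $day, (int)date("Y"));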
        
}
############################################################################################

// isset() avoids undefined-index notices, which would corrupt the image output.
$width  = isset($_GET['width'])  ? $_GET['width']  : 700;
$height = isset($_GET['height']) ? $_GET['height'] : 450;
$high   = isset($_GET['high'])   ? $_GET['high']   : 15.0;
$low    = isset($_GET['low'])    ? $_GET['low']    :  5.0;
$step   = isset($_GET['step'])   ? $_GET['step']   :   .5;
$time   = isset($_GET['time'])   ? $_GET['time']   : "week";

$today = time();
switch ($time) {
        case 'day' : $evaltime = strtotime("-1 day"); break;
        case 'week' : $evaltime = strtotime("-7 days"); break;
        case 'month' : $evaltime = strtotime("-1 month"); break;
        default : $evaltime = strtotime("30 August 2002"); break;
}
include("/usr/lib/phplot/phplot.php");
$graph = new PHPlot($width,$height);

$log_dir = opendir("/var/log");
system ("rm -rf /tmp/mailtest");

$greps = "| grep 'spamd' | grep '(' | grep -v 'running' "
       . "| grep -v 'unknown' >> /tmp/mailtest";
while ($filename = readdir($log_dir)) {
        if (substr($filename,0,8) == "mail.log") {
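                // Quote the path in case a log filename contains shell metacharacters.
                $path = escapeshellarg("/var/log/$filename");
                // substr() with a negative offset takes the last three characters.
                if (substr($filename, -3) == ".gz") {
                        system("zcat $path $greps");
                }
                else
                {
                        system("cat $path $greps");
                }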
        }
}
closedir($log_dir);

$curtime = array();
$valuesarr = array();
// file() reads the whole file into an array of lines; no fopen()/fclose() needed.
$timescores = file("/tmp/mailtest");
if ($timescores === false) {
        $timescores = array();
}

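foreach ($timescores as $line) {
        // Split on parentheses: $parts[0] starts with the syslog timestamp,
        // $parts[1] is the text inside the first "(...)".
        $parts = preg_split("/(\(|\))/",$line);
        if ($parts && isset($parts[1])) {
                $curtime = syslog_to_unix_time(substr($parts[0],0,15));
                if ($curtime >= $evaltime) {
                        // The cast keeps only the leading numeric part of the text.
                        array_push($valuesarr,(float)$parts[1]);
                }
        }
}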

sort($valuesarr,SORT_NUMERIC);

$example_data = array();

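// Guard against a zero, negative, or non-numeric bin width from the query string.
$step = (float)$step;
if ($step <= 0) { $step = 0.5; }

foreach ($valuesarr as $num) {
        // Round $num down to the lower edge of its bin.  The small epsilon
        // keeps values that sit exactly on an edge from slipping down a bin
        // because of floating-point error.
        $num = floor((float)$num / $step + 1e-9) * $step;

        // Is this bin already in $example_data?  If so, bump its count.
        $found = false;
        if ($num < $high && $num > $low) {
                for ($i = 0; $i < count($example_data); $i++) {
                        if ($num == $example_data[$i][0]) {
                                $example_data[$i][1]++;
                                $found = true;
                        }
                }

                if (!$found) {
                        array_push($example_data,array($num,1));
                }
        }
}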

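// $which_title and $file_format are never set in this script (they presumably
// arrived as request variables via register_globals); read them explicitly.
// The "png" fallback is an assumption.
$which_title = isset($_GET['which_title']) ? $_GET['which_title'] : "";
$file_format = isset($_GET['file_format']) ? $_GET['file_format'] : "png";
$graph->SetTitle($which_title);
$graph->SetDataValues($example_data);
$graph->SetIsInline("1");
$graph->SetFileFormat($file_format);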

$graph->SetBackgroundColor(array(200,200,200));
$graph->SetLabelColor(array(204,102,0));
$graph->data_color = array('maroon');
$graph->SetPlotType("bars");
$graph->SetXDataLabelAngle(90);
$graph->SetPrecisionY(0);


$graph->SetYLabel("Num of Emails");
$graph->SetXLabel("\nDegree of Spam");

$graph->DrawGraph();
?>
---- END inline_image.php -------
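
The binning step in the script could also be done with an associative array keyed by bin edge, which avoids the inner search loop. Here is a minimal sketch of that idea, not a drop-in replacement: the function name `bin_scores` and the sample scores are made up for illustration, and it assumes the scores are already plain numbers.

```php
<?php
// Hypothetical helper: count scores per bin of width $step, keeping only
// bins whose lower edge lies strictly between $low and $high.
function bin_scores(array $scores, float $step, float $low, float $high): array {
    $counts = array();
    foreach ($scores as $score) {
        // Index of the bin this score falls into; the epsilon guards against
        // floating-point error for scores sitting exactly on a bin edge.
        $index = (int) floor($score / $step + 1e-9);
        $edge  = $index * $step;
        if ($edge > $low && $edge < $high) {
            // Key by a formatted string, since float keys would be truncated to ints.
            $key = number_format($edge, 2, '.', '');
            $counts[$key] = isset($counts[$key]) ? $counts[$key] + 1 : 1;
        }
    }
    // Order the bins by their numeric edge value.
    ksort($counts, SORT_NUMERIC);
    // Convert to array(label, count) rows, the shape the script builds for the plot.
    $rows = array();
    foreach ($counts as $edge => $count) {
        $rows[] = array($edge, $count);
    }
    return $rows;
}

// Made-up sample scores, binned in steps of 0.5 between 5 and 15.
$rows = bin_scores(array(5.6, 5.9, 7.2, 7.4, 14.9, 20.1, 3.0), 0.5, 5.0, 15.0);
print_r($rows);
```

The trade-off is that the labels become fixed two-decimal strings such as "5.50", so a step finer than 0.01 would need a different key format.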

Jeremy

-- 
**************************************************************
Jeremy Turner, Help Desk Supervisor        Phone: 405.425.5555
Email: [EMAIL PROTECTED]                Phone: 405.425.1820
Information Technology Services, Oklahoma Christian University
**************************************************************
Linux jturnermac 2.4.19 #10 Mon Oct 14 13:14:58 CDT 2002 ppc GNU/Linux
 15:10:01 up 1 day,  1:40,  1 user,  load average: 0.38, 0.15, 0.04



_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
