Hi,

I have a requirement to scan Apache logs and discover ``exceptions''.  
Exceptions can be of two types:

1. A single IP generating a large amount of traffic within a given time frame 
(for definable values of ``large'' and ``time frame'').

2. A single IP hitting a wide set of URLs on the server (indicates a crawler), 
again for definable values of ``wide''.
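
To make the two definitions concrete, here is a minimal sketch of both checks. It is written in Python rather than R and is only illustrative. The record layout (timestamp in seconds, IP, URL), the function names `heavy_hitters` and `wide_crawlers`, and all threshold values are assumptions made for this example, not part of any existing tool.

```python
from collections import defaultdict, deque


def heavy_hitters(records, window_seconds, max_requests):
    """Type 1: IPs with more than max_requests requests inside any
    sliding window of window_seconds.

    records is an iterable of (timestamp_seconds, ip, url) tuples
    (an assumed layout; real Apache logs need parsing first).
    """
    recent = defaultdict(deque)  # ip -> timestamps still inside the window
    flagged = set()
    for ts, ip, _url in sorted(records, key=lambda r: r[0]):
        window = recent[ip]
        window.append(ts)
        # Drop timestamps that have fallen out of the window.
        while ts - window[0] >= window_seconds:
            window.popleft()
        if len(window) > max_requests:
            flagged.add(ip)
    return flagged


def wide_crawlers(records, min_distinct_urls):
    """Type 2: IPs that requested at least min_distinct_urls different URLs."""
    urls_by_ip = defaultdict(set)
    for _ts, ip, url in records:
        urls_by_ip[ip].add(url)
    return {ip for ip, urls in urls_by_ip.items()
            if len(urls) >= min_distinct_urls}


# Made-up sample data, for illustration only.
sample = [
    (0, "10.0.0.1", "/a"), (1, "10.0.0.1", "/a"), (2, "10.0.0.1", "/a"),
    (0, "10.0.0.2", "/a"), (100, "10.0.0.2", "/b"), (200, "10.0.0.2", "/c"),
]
busy = heavy_hitters(sample, window_seconds=10, max_requests=2)
wide = wide_crawlers(sample, min_distinct_urls=3)
```

Here ``large'', ``time frame'' and ``wide'' map onto `max_requests`, `window_seconds` and `min_distinct_urls`, which is where the definable values would go.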

I'm a complete newbie to R (and to statistics), so the questions are:

- Can R help me generate graphs which would help me identify these activities?

- Has someone already done something like this?  If so, where could I find it?

- If not, can someone help me with the stats (and R) part to help me achieve 
these objectives?  Any software that gets created as a result would be 
released under a FOSS license.

Data massaging, tuning, etc. are not an issue.  We'd be dealing with a few 
hundred thousand or a million records a day.

Regards,

-- Raju
-- 
Raj Mathur                [EMAIL PROTECTED]      http://kandalaya.org/
 Freedom in Technology & Software || February 2008 || http://freed.in/
       GPG: 78D4 FC67 367F 40E2 0DD5  0FEF C968 D0EF CC68 D17F
PsyTrance & Chill: http://schizoid.in/   ||   It is the mind that moves

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
