Hi,
I have a requirement to scan Apache logs and discover "exceptions". Exceptions can be of two types:

1. A single IP generating a large amount of traffic within a given time frame (for definable values of "large" and "time frame").
2. A single IP hitting a wide set of URLs on the server (indicates a crawler), again for definable values of "wide".

I'm a complete newbie to R (and to statistics), so the questions are:

- Can R help me generate graphs which would help me identify these activities?
- Has someone already done something like this? If so, where could I find it?
- If not, can someone help me with the stats (and R) part to help me achieve these objectives?

Any software that gets created as a result would be released under a FOSS license. Data massaging, tuning, etc. are not an issue. We'd be dealing with a few hundred thousand or a million records a day.

Regards,

-- Raju
--
Raj Mathur [EMAIL PROTECTED] http://kandalaya.org/
Freedom in Technology & Software || February 2008 || http://freed.in/
GPG: 78D4 FC67 367F 40E2 0DD5 0FEF C968 D0EF CC68 D17F
PsyTrance & Chill: http://schizoid.in/ || It is the mind that moves
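
As a rough illustration of the two exception types described above, a minimal base-R sketch might look like the following. The data frame `log`, its column names (`ip`, `time`, `url`), the sample rows, and the threshold values are all assumptions made up for the example; a real Apache log would first have to be parsed into this shape. Fixed, non-overlapping time windows are used here as one simple reading of "time frame"; a sliding window would be another.

```r
# Hypothetical sample data: one row per request (not taken from a real log).
log <- data.frame(
  ip   = c("10.0.0.1", "10.0.0.1", "10.0.0.1", "10.0.0.2",
           "10.0.0.3", "10.0.0.3", "10.0.0.3"),
  time = as.POSIXct(c("2008-01-15 10:00:05", "2008-01-15 10:00:20",
                      "2008-01-15 10:00:40", "2008-01-15 10:00:30",
                      "2008-01-15 10:01:10", "2008-01-15 10:03:00",
                      "2008-01-15 10:07:00"), tz = "UTC"),
  url  = c("/", "/", "/", "/about", "/a", "/b", "/c"),
  stringsAsFactors = FALSE
)

# Assumed, tunable thresholds.
window_secs <- 60  # length of the "time frame" in seconds
max_hits    <- 3   # "large": requests per IP per window
max_urls    <- 3   # "wide": distinct URLs per IP

# Type 1: requests per IP within each fixed window.
log$window <- floor(as.numeric(log$time) / window_secs)
hits <- aggregate(url ~ ip + window, data = log, FUN = length)
names(hits)[3] <- "n"
heavy <- hits[hits$n >= max_hits, ]

# Type 2: number of distinct URLs requested by each IP.
breadth <- aggregate(url ~ ip, data = log,
                     FUN = function(u) length(unique(u)))
names(breadth)[2] <- "n_urls"
crawlers <- breadth[breadth$n_urls >= max_urls, ]

print(heavy)
print(crawlers)

# A quick visual check could be a bar chart of requests per IP, e.g.:
# barplot(sort(table(log$ip), decreasing = TRUE), las = 2)
```

On this toy data, `heavy` flags the IP that sent three requests inside one minute and `crawlers` flags the IP that requested three different URLs. With a million records a day the same `aggregate()` calls should still be workable, though that is untested here.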