Hi Scott

I used a robots list in the past in combination with mod_rewrite on Apache
(a RewriteMap that looks up entries in a list I got from
http://fantomaster.com/fa_SE-LIST.html and other sources).
This is very useful for cloaking :-)
Anyway, there are different approaches to it.
Here are a few ideas:
a) You could also use Valves or a ServletFilter [or an existing free
implementation of mod_rewrite for Servlet containers or IIS] to log the
robots separately.

b) Also, you will find that most log analysers provide some reports on
spiders/robots, so you can run them against your existing logs and see
the robots' activity.

c) Instead of maintaining a list of robots, you can maintain a list of real
web clients [i.e., moz, ns, ..] and diff the logs against it with a Perl
script that separates the entries into different logs. Then run your log
analyser against each of them.
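
Idea c) can be sketched roughly as follows. This is a minimal illustration
in Java rather than Perl, not a finished tool. It assumes the access log
carries the User-Agent as the last double-quoted field of each line (as in
the "combined" log format). The class name LogSplitter, the marker list and
the output file arguments are all hypothetical; adjust them to your own
traffic.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class LogSplitter {

    // Hypothetical allow-list of substrings that identify real browsers.
    static final List<String> BROWSER_MARKERS =
            Arrays.asList("firefox", "msie", "opera", "safari", "netscape");

    // Returns the last double-quoted field of the line, or "" if there is none.
    // Assumption: that field is the User-Agent (combined log format).
    static String userAgentOf(String line) {
        int end = line.lastIndexOf('"');
        if (end <= 0) {
            return "";
        }
        int start = line.lastIndexOf('"', end - 1);
        return start < 0 ? "" : line.substring(start + 1, end);
    }

    // True if the User-Agent contains one of the allow-listed markers.
    static boolean isKnownBrowser(String userAgent) {
        String ua = userAgent.toLowerCase(Locale.ROOT);
        for (String marker : BROWSER_MARKERS) {
            if (ua.contains(marker)) {
                return true;
            }
        }
        return false;
    }

    // Copies each input line to either the browsers file or the others file.
    // ISO-8859-1 is used so that arbitrary bytes in the log never fail to decode.
    static void split(Path in, Path browsers, Path others) throws IOException {
        try (BufferedReader r = Files.newBufferedReader(in, StandardCharsets.ISO_8859_1);
             PrintWriter b = new PrintWriter(
                     Files.newBufferedWriter(browsers, StandardCharsets.ISO_8859_1));
             PrintWriter o = new PrintWriter(
                     Files.newBufferedWriter(others, StandardCharsets.ISO_8859_1))) {
            String line;
            while ((line = r.readLine()) != null) {
                (isKnownBrowser(userAgentOf(line)) ? b : o).println(line);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.err.println(
                    "usage: java LogSplitter <access-log> <browsers-out> <others-out>");
            return;
        }
        split(Paths.get(args[0]), Paths.get(args[1]), Paths.get(args[2]));
    }
}
```

Everything that does not match the allow-list ends up in the second file,
which you can then feed to your log analyser. Keep in mind that robots can
send a browser-like User-Agent, so this split is only a first approximation.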

There are web sites and newsletters dedicated to this subject, as well as
books such as "Google Hacks".

Hope this helps.

With Best Regards
Bruno Georges

Glencore International AG
Tel. +41 41 709 3204
Fax +41 41 709 3000


"Scott Purcell" <[EMAIL PROTECTED]r.net> wrote on 23.03.06 01:59:
(Please respond to "Tomcat Users List")

To:      "Tomcat Users List" <users@tomcat.apache.org>
Subject: Would like to track googlebots, or spiders from site




Not necessarily a Tomcat problem, but I have created a webapp on Tomcat
5.x. I would like to be able to track any robot activity in a log file, but
I am not really sure where to begin looking for this functionality. If this
is possible, is there a debug level that could be used so I could find out
what pages it hits, etc.? Or is this something that needs to be written and
tied into the container? This is Tomcat standalone, no Apache front end.


If anyone has this knowledge, links, etc., I would certainly appreciate it.

Regards
Scott





