Hi, I have a simple system that runs fine except when Google's web crawlers try to fetch some old, long URLs which used to be on the site. I have just migrated an old, basically static website to web2py, to provide a base for some more interesting features in the future. However, Google knows the old URLs and tries to crawl them, at which point the system dies by going into a tight loop.
It is quite repeatable on my development machine, now that I know which URLs trigger it. An example is:

    http://localhost:63123/aaaaaaaaaa/Abbbbbbbb%20Lccc%20-%20Pddddddd%20GA%20Deeeeee%20(ffff%20ffff%20A).pdf

If I remove the two brackets in the final part of the URL (the PDF file name), so that the URL becomes

    http://localhost:63123/aaaaaaaaaa/Abbbbbbbb%20Lccc%20-%20Pddddddd%20GA%20Deeeeee%20ffff%20ffff%20A.pdf

then I get "invalid function (default/aaaaaaaaaa)", as I would expect.

As far as I know, the brackets are invalid characters and should not be in the URI (or should be encoded), but either way the system should be robust against invalid characters being sent to the server.

I am running web2py 2.2.1, and I am wondering how to debug this further. If I turn on DEBUG logging for root, rewrite, web2py and rocket, I get this output:

    ....
    2012-11-13 13:05:57,934 - Rocket.Errors.ThreadPool - DEBUG - Examining ThreadPool. 10 threads and 0 Q'd conxions
    2012-11-13 13:05:58,936 - Rocket.Errors.ThreadPool - DEBUG - Examining ThreadPool. 10 threads and 0 Q'd conxions
    2012-11-13 13:05:59,600 - Rocket.Errors.Thread-3 - DEBUG - Received a connection.
    2012-11-13 13:05:59,600 - Rocket.Errors.Thread-3 - DEBUG - Serving a request
    2012-11-13 13:05:59,601 - Rocket.Errors.Thread-3 - DEBUG - Getting sock_file
    select application=tgaa
    route: controller=default
    route: function.ext=aaaaaaaaaa.html

At this point the system is in its tight loop and produces no more logging output.

routes.py contains just:

    logging = 'print'
    routers = dict(
        BASE = dict(
            default_application = 'tgaa',
            map_hyphen = 'True',
        ),
    )

Thanks for any assistance.
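
P.S. On the question of how to debug this further: one idea I am considering is to dump the stack of every thread a few seconds after sending the bad request, to see where Thread-3 is spinning. The following is only a rough sketch using the standard library. The names dump_all_thread_stacks and dump_after are ones I made up (they are not part of web2py or Rocket), and I am assuming the helper can be started from somewhere that runs inside the server process before the request arrives.

```python
import sys
import threading
import traceback


def dump_all_thread_stacks():
    """Return a text report with the current stack of every thread."""
    # Map thread ids to readable names such as "Thread-3".
    names = dict((t.ident, t.name) for t in threading.enumerate())
    lines = []
    for thread_id, frame in sys._current_frames().items():
        lines.append("Thread %s (%s)" % (names.get(thread_id, "unknown"), thread_id))
        lines.extend(entry.rstrip("\n") for entry in traceback.format_stack(frame))
    return "\n".join(lines)


def dump_after(seconds, stream=sys.stderr):
    """Hypothetical helper: write the report to `stream` once, after a delay."""
    def _report():
        stream.write(dump_all_thread_stacks() + "\n")

    timer = threading.Timer(seconds, _report)
    timer.daemon = True  # do not keep the server alive just for the dump
    timer.start()
    return timer
```

For example, calling dump_after(5) and then requesting the URL with the brackets should, if the timer gets a chance to run, print the stack of the looping thread to stderr. One caveat: if the loop is inside C code that never releases the interpreter lock, the timer thread may never fire, in which case something like the faulthandler module (standard library only from Python 3.3 onwards) would be needed instead.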