I noticed that buglist.cgi was taking quite a bit of CPU time. I looked at some of the long running instances, and they were coming from searchbots. I can't think of a good reason for this, so I have committed this patch to the gcc.gnu.org robots.txt file to not let searchbots search through lists of bugs. I plan to make a similar change on the sourceware.org and cygwin.com sides. Please let me know if this seems like a mistake.
Does anybody have any experience with http://code.google.com/p/bugzilla-sitemap/ ? That might be a slightly better approach.

Ian
Index: robots.txt
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/robots.txt,v
retrieving revision 1.9
diff -u -r1.9 robots.txt
--- robots.txt	22 Sep 2009 19:19:30 -0000	1.9
+++ robots.txt	13 May 2011 17:08:33 -0000
@@ -5,4 +5,5 @@
 User-Agent: *
 Disallow: /viewcvs/
 Disallow: /cgi-bin/
+Disallow: /bugzilla/buglist.cgi
 Crawl-Delay: 60
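As a quick sanity check (not part of the patch), the updated rules can be tested locally with Python's stdlib urllib.robotparser; the rules string below just mirrors the file after the patch, rather than fetching the live robots.txt:

```python
from urllib import robotparser

# The robots.txt contents after the patch (copied from the diff above).
rules = """\
User-Agent: *
Disallow: /viewcvs/
Disallow: /cgi-bin/
Disallow: /bugzilla/buglist.cgi
Crawl-Delay: 60
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# Bug lists are now off-limits to crawlers, including any query string...
print(rp.can_fetch("*", "/bugzilla/buglist.cgi?query_format=advanced"))  # False
# ...while individual bug pages stay crawlable.
print(rp.can_fetch("*", "/bugzilla/show_bug.cgi?id=12345"))  # True
```

Since Disallow does a prefix match on the path, the single buglist.cgi line covers every query-string variant without blocking the rest of Bugzilla.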