On 05/16/2011 02:42 PM, Richard Guenther wrote:
> On Mon, May 16, 2011 at 3:34 PM, Andrew Haley <a...@redhat.com> wrote:
>> On 05/16/2011 02:32 PM, Michael Matz wrote:
>>>
>>> On Mon, 16 May 2011, Andrew Haley wrote:
>>>
>>>>> It routinely is.  bugzilla performance is terrible most of the time
>>>>> for me (up to the point of five timeouts in sequence), svn speed is
>>>>> mediocre at best, and people with access to gcc.gnu.org often observe
>>>>> loads > 25, mostly due to I/O.
>>>>
>>>> And how have you concluded that is due to web crawlers?
>>>
>>> httpd being in the top-10 always, fiddling with bugzilla URLs?
>>> (Note, I don't have access to gcc.gnu.org; I'm relaying info from multiple
>>> instances of discussion on #gcc and richi poking at it.  That said, it
>>> still might not be web crawlers, that's right, but I'll happily accept
>>> _any_ load improvement on gcc.gnu.org, however unfounded it might seem.)
>>
>> Well, we have to be sensible.  If blocking crawlers only results in a
>> small load reduction, that isn't, IMHO, a good deal for our users.
>
> I for example also see
>
> 66.249.71.59 - - [16/May/2011:13:37:58 +0000] "GET
> /viewcvs?view=revision&revision=169814 HTTP/1.1" 200 1334 "-"
> "Mozilla/5.0 (compatible; Googlebot/2.1;
> +http://www.google.com/bot.html)" (35%) 2060117us
>
> and viewvc is certainly even worse (from an I/O perspective).  I thought
> we blocked all bot traffic from the viewvc stuff ...
It makes sense to block viewcvs, but I don't think it makes as much sense
to block the bugs themselves.  That's the part that is useful to our users.

Andrew.
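
P.S. A minimal robots.txt sketch of that split, assuming the viewvc
scripts live under /viewcvs as in the log line above; the actual paths
and policy on gcc.gnu.org would need checking before deploying anything
like this:

  # Hypothetical robots.txt -- paths are assumptions, not the live config.
  User-agent: *
  # Keep crawlers out of the expensive repository-browsing CGI.
  Disallow: /viewcvs
  Disallow: /viewvc
  # No Disallow for /bugzilla/, so bug pages stay in search indexes.

A Disallow prefix matches any URL starting with that path, so the
query-string variants (/viewcvs?view=revision&...) are covered too.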