On Mon, May 16, 2011 at 3:04 PM, Andrew Haley <a...@redhat.com> wrote:
> On 05/16/2011 01:09 PM, Michael Matz wrote:
>> Hi,
>>
>> On Mon, 16 May 2011, Andrew Haley wrote:
>>
>>> On 16/05/11 10:45, Richard Guenther wrote:
>>>> On Fri, May 13, 2011 at 7:14 PM, Ian Lance Taylor <i...@google.com> wrote:
>>>>> I noticed that buglist.cgi was taking quite a bit of CPU time.  I looked
>>>>> at some of the long-running instances, and they were coming from
>>>>> searchbots.  I can't think of a good reason for this, so I have
>>>>> committed this patch to the gcc.gnu.org robots.txt file to not let
>>>>> searchbots search through lists of bugs.  I plan to make a similar
>>>>> change on the sourceware.org and cygwin.com sides.  Please let me know
>>>>> if this seems like a mistake.
>>>>>
>>>>> Does anybody have any experience with
>>>>> http://code.google.com/p/bugzilla-sitemap/ ?  That might be a slightly
>>>>> better approach.
>>>>
>>>> Shouldn't we keep searchbots away from bugzilla completely?  Searchbots
>>>> can crawl the gcc-bugs mailing list archives.
>>>
>>> I don't understand this.  Surely it is super-useful for Google etc. to
>>> be able to search gcc's Bugzilla.
>>
>> gcc-bugs provides exactly the same information, and doesn't have to
>> regenerate the full web page for each access to a bug report.
>
> It's not quite the same information, surely.  Wouldn't searchers be directed
> to an email rather than the bug itself?
Yes, though there is a link in all mails.

Richard.
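[For context, the robots.txt change Ian describes would look roughly like the sketch below. This is an illustration only, not the committed patch; the exact path prefix is an assumption.]

    # Illustrative sketch, not the actual gcc.gnu.org patch.
    # Keep all crawlers off the CPU-heavy Bugzilla list/search pages
    # (buglist.cgi regenerates a full query result on every hit),
    # while individual bug pages remain crawlable.
    User-agent: *
    Disallow: /bugzilla/buglist.cgi

The bugzilla-sitemap approach mentioned above would instead publish a sitemap of individual bug URLs, steering crawlers to the cheap per-bug pages rather than blocking them outright.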