If you are using the Wikipedia URL scheme (/wiki/Page_title and /w/index.php?title=Page_title&foo=bar), you can just ban bots from /w/ in robots.txt.
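For example, assuming your articles are served from /wiki/ and everything that goes through index.php (including the &redlink=1 "create page" links Google is finding) lives under /w/, a robots.txt at the web root along these lines should keep crawlers away from those URLs; adjust the paths if your wiki uses a different layout:

User-agent: *
Disallow: /w/

Note that robots.txt only stops compliant crawlers from fetching the URLs; already-indexed 404s may take a while to drop out of Webmaster Tools.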
On Fri, Mar 13, 2015 at 7:26 AM, Al <[email protected]> wrote:
> Hi,
> In my Google Webmaster Tools account, there are a lot of crawl 404 errors
> for non-existent pages. It appears that Google will request a page that does
> not exist (with &redlink=1), get a 200 status, of course, and the "create"
> page, and then somehow it derives a link to the same URL but without the
> &redlink or &edit parameters (probably from the menu links on the page),
> which it then tries to crawl and receives a 404.
>
> Does anyone know how to deal with this so that the Google crawler doesn't
> do this?
> It looks like Google first discovers the redlinks mostly from a previous
> spam page which was subsequently deleted. But once Google sees it the
> first time, it remembers it for quite some time.
>
> Thanks, Al

--
Best regards,
Max Semenik ([[User:MaxSem]])
