If you are using the Wikipedia URL scheme (/wiki/Page_title and /w/index.php?title=Page_title&foo=bar), you can just ban bots from /w/ in robots.txt.
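For example, assuming your articles are served from /wiki/ and everything that goes through index.php (including the &redlink=1 "create page" links Google is finding) lives under /w/, a robots.txt at the web root along these lines should keep crawlers away from those URLs; adjust the paths if your wiki uses a different layout:

User-agent: *
Disallow: /w/

Note that robots.txt only stops compliant crawlers from fetching the URLs; already-indexed 404s may take a while to drop out of Webmaster Tools.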
On Fri, Mar 13, 2015 at 7:26 AM, Al <[email protected]> wrote:
> Hi,
> In my Google Webmaster Tools account, there are a lot of crawl 404 errors
> for non-existent pages. It appears that Google will request a page that does
> not exist (with &redlink=1), get a 200 status, of course, and the "create"
> page, and then somehow it derives a link to the same URL but without the
> &redlink or &edit parameters (probably from the menu links on the page),
> which it then tries to crawl and receives a 404.
>
> Does anyone know how to deal with this so that the Google crawler doesn't
> do this?
> It looks like Google first discovers the redlinks mostly from a previous
> spam page which was subsequently deleted. But once Google sees it the
> first time, it remembers it for quite some time.
>
> Thanks, Al

--
Best regards,
Max Semenik ([[User:MaxSem]])
