Re: [Koha-devel] Web crawlers hammered our Koha multi server

2010-01-13 Thread Rick Welykochy
Owen Leonard wrote:
>> Addendum: also install a robots.txt file at the following location
>> in the Koha source tree:
>>
>> opac/htdocs/robots.txt
>
> Isn't this already a part of a standard Koha installation, and even if
> not, isn't this all that is required to ward off search engine
> spide

Re: [Koha-devel] Web crawlers hammered our Koha multi server

2010-01-13 Thread Owen Leonard
> Addendum: also install a robots.txt file at the following location
> in the Koha source tree:
>
>    opac/htdocs/robots.txt

Isn't this already a part of a standard Koha installation, and even if not, isn't this all that is required to ward off search engine spiders? Killing by default the abili
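To make the robots.txt question concrete, here is a minimal sketch of checking what a given rule set would permit, using Python's standard urllib.robotparser. The rules, the host name opac.example.org and the /cgi-bin/koha/ URL layout are assumptions for illustration only; they are not the contents of the file shipped with Koha. Note also that robots.txt is advisory, so it only deters crawlers that choose to honour it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules (an assumption, not Koha's shipped robots.txt):
# keep all crawlers away from the OPAC search script only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/koha/opac-search.pl
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a crawler that obeys robots_txt may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    base = "http://opac.example.org/cgi-bin/koha/"  # hypothetical host
    for page in ("opac-search.pl?idx=kw&q=a", "opac-main.pl"):
        print(page, is_allowed(ROBOTS_TXT, "Twiceler", base + page))
```

Blocking only the search script leaves the rest of the OPAC open to indexing, which is one possible trade-off between server load and discoverability.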

Re: [Koha-devel] Web crawlers hammered our Koha multi server

2010-01-13 Thread Rick Welykochy
Mason James wrote:
> i'm curious... what was the bot's ID-string in your access.log?
> (i want to check my logs for those bots too)

public knowledge: the USER_AGENT is

"Mozilla/5.0 (Twiceler-0.9 http://www.cuil.com/twiceler/robot.html)"

cheers
rick
-- _ Rick
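For anyone wanting to check their own logs for this bot, the following is a rough sketch that tallies requests per User-Agent string. It assumes Apache's "combined" log format, where the User-Agent is the last quoted field; the sample lines, IP addresses and the log path in the usage note are made up for illustration.

```python
import re
from collections import Counter

# Assumes the Apache "combined" format:
#   ... "request" status bytes "referer" "user-agent"
UA_RE = re.compile(r'"[^"]*" \d{3} \S+ "[^"]*" "([^"]*)"\s*$')

TWICELER = "Mozilla/5.0 (Twiceler-0.9 http://www.cuil.com/twiceler/robot.html)"


def count_user_agents(lines):
    """Count requests per User-Agent; lines that do not match are skipped."""
    counts = Counter()
    for line in lines:
        match = UA_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Made-up sample entries, for illustration only.
SAMPLE = [
    '192.0.2.1 - - [12/Jan/2010:23:01:02 +1100] '
    '"GET /cgi-bin/koha/opac-search.pl?idx=kw HTTP/1.1" 200 5120 "-" "' + TWICELER + '"',
    '192.0.2.1 - - [12/Jan/2010:23:01:03 +1100] '
    '"GET /cgi-bin/koha/opac-search.pl?idx=ti HTTP/1.1" 200 4890 "-" "' + TWICELER + '"',
    '198.51.100.7 - - [12/Jan/2010:23:01:04 +1100] '
    '"GET /cgi-bin/koha/opac-main.pl HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (X11; Linux)"',
]

if __name__ == "__main__":
    for agent, hits in count_user_agents(SAMPLE).most_common():
        print(hits, agent)
```

To run it against a real log, pass an open file instead of SAMPLE, for example count_user_agents(open("/var/log/apache2/access.log")), adjusting the path to wherever your Apache writes its access log.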

Re: [Koha-devel] Web crawlers hammered our Koha multi server

2010-01-13 Thread Mason James
> The real problem lies in the nature of bots of any species that find
> a form to fill and hit your website with all possible values being
> selected, one by one.
>
> This is the behaviour I saw in our Apache logs. EVERY possibility
> for the advanced search was being requested and presu

Re: [Koha-devel] Web crawlers hammered our Koha multi server

2010-01-12 Thread Rick Welykochy
Chris Cormack wrote:
> This of course should be an option. There are many libraries who would
> like their catalogue indexed by search engines and if the server has
> the capacity to do it, it should be allowed.
> So whatever changes made to opac-search.pl should be under the control
> of a system

Re: [Koha-devel] Web crawlers hammered our Koha multi server

2010-01-12 Thread Chris Cormack
Hi Rick

This of course should be an option. There are many libraries who would like their catalogue indexed by search engines and if the server has the capacity to do it, it should be allowed. So whatever changes made to opac-search.pl should be under the control of a systempreference. Also this

[Koha-devel] Web crawlers hammered our Koha multi server

2010-01-12 Thread Rick Welykochy
Hi all We are running several instances of Koha on the one box using Linux Vserver. The other night the server was brought to its knees and mysql ran out of free connections. Further investigation found over 80 instances of perl + Apache running OPAC search queries. There were many attendant inst