Hi Nigel et al,

I recently noticed the load on our Koha server was getting ridiculously high, and investigation showed that most of it was bot requests for opac-search.pl (averaging about one a second!). I have managed to stop the ones that were hurting us with robots.txt, and I am fairly confident that Amazonbot does respect this. We haven't had any trouble from Facebook (so far).
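For anyone wanting to sanity-check their own rules before deploying them, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt contents, the /cgi-bin/koha/ path and the opac.example.org host are assumptions for illustration, not our actual configuration. It only shows what a well-behaved crawler would conclude from the rules; it does nothing about a crawler that ignores robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: keep crawlers out of the OPAC search script only.
# The /cgi-bin/koha/ prefix is an assumption; adjust it to your own URLs.
ROBOTS_TXT = """\
User-agent: Amazonbot
Disallow: /cgi-bin/koha/opac-search.pl

User-agent: *
Disallow: /cgi-bin/koha/opac-search.pl
"""

# Placeholder URLs on a made-up host.
SEARCH_URL = "https://opac.example.org/cgi-bin/koha/opac-search.pl?q=history"
DETAIL_URL = "https://opac.example.org/cgi-bin/koha/opac-detail.pl?biblionumber=1"


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if a compliant crawler with this agent token may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # Pass the short agent token (e.g. "Amazonbot"), not the full header value.
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("Amazonbot", "facebookexternalhit"):
        for url in (SEARCH_URL, DETAIL_URL):
            print(agent, url, is_allowed(agent, url))
```

With these rules, search URLs are refused for every agent while record detail pages stay crawlable, which keeps the catalogue public without exposing the expensive search script.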
Chris Brown

On Thu, Jul 25, 2024 at 3:58 PM Coehoorn, Joel <jcoeho...@york.edu> wrote:

> We've had a couple recent crashes I haven't yet had time to dig into. This
> would explain it :/
> And as I look now, I also see a bunch of AmazonBot, but I haven't yet
> checked whether this would at least respect robots.txt
>
> The really annoying thing about this is the catalog is there to be public.
> It's why it exists. To that end we have the oai-pmh service available,
> which would give them all the data they could reasonably expect in a much
> more efficient way.
>
> *Joel Coehoorn*
> Director of Information Technology
> *York University*
> Office: 402-363-5603 | jcoeho...@york.edu | york.edu
>
> On Thu, Jul 25, 2024 at 6:27 AM Nigel Titley <ni...@titley.com> wrote:
>
> > Is anyone else getting problems with the facebook web crawler hammering
> > their OPAC search function?
> >
> > This has been happening on and off for a couple of months but set in
> > with a vengeance a couple of days ago. The crawler is hitting us with
> > many OPAC search queries, beyond the capacity of our system to respond.
> >
> > robots.txt is being ignored
> >
> > I started by blocking facebook's entire IPv6 range as the queries were
> > all coming in over IPv6. They responded by switching to IPv4 and because
> > they have a number of blocks it wasn't practical to block each and every
> > one of them.
> >
> > I've temporarily switched off OPAC entirely and the system has returned
> > to normal and I can at least perform intranet functions but this is
> > obviously non-ideal.
> >
> > Does anyone have any thoughts on this?
> >
> > I'm running 22.05.13.000 on Ubuntu.
> >
> > Thanks
> >
> > Nigel
> > _______________________________________________
> >
> > Koha mailing list http://koha-community.org
> > Koha@lists.katipo.co.nz
> > Unsubscribe: https://lists.katipo.co.nz/mailman/listinfo/koha