On Sunday 03 November 2019 12:11:16 john doe wrote:

> On 11/3/2019 5:32 PM, Gene Heskett wrote:
> > On Sunday 03 November 2019 10:34:09 john doe wrote:
> >> On 11/3/2019 4:04 PM, Gene Heskett wrote:
> >>> Greetings all
> >>>
> >>> I am developing a list of broken webcrawlers who are repeatedly
> >>> downloading my entire web site including the hidden stuff.
> >>>
> >>> These crawlers/bots are ignoring my robots.txt files and aren't
> >>> just indexing the site, but are downloading every single bit of
> >>> every file there.
> >>>
> >>> This is burning up my upload bandwidth and constitutes a DDOS when
> >>> 4 or 5 bots all go into this pull it all mode at the same time.
> >>>
> >>> How do I best deal with these poorly written bots? I can target
> >>> the individual address of course, but have chosen to block the
> >>> /24, but that seems not to bother them for more than 30 minutes.
> >>> It's also too broad a brush, blocking legit addresses' access.
> >>> Restarting apache2 also works, for half an hour or so, but I may be
> >>> interrupting a legit request for a realtime kernel whose built
> >>> tree is around 2.7GB in tgz format.
> >>>
> >>> How do I get their attention to stop the DDOS? Or is this a war
> >>> you cannot win?
> >>
> >> 'fail2ban' for the bots that do not respect robots.txt.
> >
> > Wasn't installed by this stretch version. Is now, reading man
> > pages. Frankly this looks dangerous when run by beginning users.
> > There ought to be a startup tutorial based on setting up the
> > logging, then specifying who you want blocked from reading the
> > logs. Is there a formal tut of setting this up someplace?
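On the logging step: before writing any fail2ban filter it helps to know which clients are actually pulling the most data. A minimal sketch in Python, assuming the server writes Apache's "combined" log format. The function names and the threshold are made up for illustration and are not part of fail2ban or apache2:

```python
import re
from collections import Counter

# Start of an Apache "combined" log line:
# client IP, two ident fields, [timestamp], "request", status, bytes sent.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def bytes_per_client(lines):
    """Sum response bytes per client IP; lines that do not match are skipped."""
    totals = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue
        size = match.group("size")
        # Apache logs "-" when no body was sent (e.g. a 304 response).
        totals[match.group("ip")] += 0 if size == "-" else int(size)
    return totals


def heavy_clients(lines, threshold_bytes):
    """Return (ip, bytes) pairs at or above the threshold, largest first."""
    totals = bytes_per_client(lines)
    return [(ip, n) for ip, n in totals.most_common() if n >= threshold_bytes]
```

Fed the lines of the access log (the Debian apache2 package normally writes /var/log/apache2/access.log, but a custom vhost may log elsewhere), heavy_clients(log, 500 * 1024 * 1024) would list every address that pulled 500 MiB or more. That list is only a starting point for deciding whom to ban, since a legit download of the 2.7GB kernel tree would show up in it too.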
>
> Those are more hints than a howto:
>
> https://askubuntu.com/questions/1116001/block-badbot-with-fail2ban-via-user-agents-in-access-log
> https://www.booleanworld.com/blocking-bad-bots-fail2ban/
>
> Or with iptables:
> https://blog.nintechnet.com/how-to-block-w00tw00t-at-isc-sans-dfind-and-other-web-vulnerability-scanners/
> https://javapipe.com/blog/iptables-ddos-protection/
>
>
> I guess I would implement both approaches.
>
>
> Does your website really need to be available to the world?
> Can't you consider a VPS with anti-DDoS capability?
>
I wouldn't have the foggiest idea how to set that up, or how to set up
a login/password. What I have there is of very little interest to folks
not running an rpi3b or rpi4b, or a trs-80 Color Computer.
What advantage would the VPS offer? And I likely won't have time to set
it up, as I'm scheduled for a new aortic valve to be installed Tuesday.
Mine is 85 years old and about wore out. Pumping efficiency is about 30%
due to leakage.

>
> HTH.
>
> --
> John Doe

Cheers, Gene Heskett
--
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
If we desire respect for the law, we must first make the law respectable.
 - Louis D. Brandeis
Genes Web page <http://geneslinuxbox.net:6309/gene>