Hello Joshua, 

You have done the right thing with robots.txt. The App Engine firewall 
<https://cloud.google.com/appengine/docs/standard/python/application-security#app_engine_firewall>
supports these three use cases: allowing only traffic from within a specific 
network, allowing only traffic from a specific service, and blocking abusive IP 
addresses. Since you have already considered these in your post, there is no 
other built-in option; anything beyond that would rely on additional 
subscriptions. 
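For what it is worth, AWS does publish its address ranges as a JSON feed at 
https://ip-ranges.amazonaws.com/ip-ranges.json, so one low-cost approach is to 
script the firewall rules rather than enter them by hand. Below is a minimal 
sketch (not a tested tool) that turns that feed's `prefixes` entries into 
`gcloud app firewall-rules create` commands. Note the assumptions: the App 
Engine firewall caps the number of rules, so in practice you would deny only 
aggregated ranges or the noisiest regions, and the starting priority of 100 is 
arbitrary.

```python
import json

def deny_commands(ip_ranges_json, region=None, start_priority=100):
    """Emit one gcloud firewall deny-rule command per matching AWS CIDR block.

    ip_ranges_json: a string in the shape of AWS's ip-ranges.json feed.
    region: optionally restrict to a single AWS region (e.g. "us-east-1").
    """
    data = json.loads(ip_ranges_json)
    commands = []
    priority = start_priority
    for prefix in data["prefixes"]:
        if region and prefix["region"] != region:
            continue
        commands.append(
            f"gcloud app firewall-rules create {priority} "
            f"--action=deny --source-range={prefix['ip_prefix']}"
        )
        priority += 1
    return commands

# A tiny embedded sample in the same shape as AWS's published feed;
# in real use you would download ip-ranges.json instead.
sample = json.dumps({
    "prefixes": [
        {"ip_prefix": "3.5.140.0/22", "region": "ap-northeast-2",
         "service": "AMAZON"},
        {"ip_prefix": "13.34.37.64/27", "region": "ap-southeast-4",
         "service": "AMAZON"},
    ]
})

for cmd in deny_commands(sample):
    print(cmd)
```

Since the crawler hops between ranges, regenerating and re-applying the rules 
periodically (e.g. from a cron job) would keep the deny list current without a 
paid service.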

On Tuesday, 25 January 2022 at 10:23:01 UTC-5 Joshua Smith wrote:

> I have a small site that I run on a not-for-profit basis.
>
> Periodically I need to update robots.txt or add firewall rules to shut 
> down bad actors who beat the crap out of the site running up my instance 
> costs.
>
> Lately, I've been getting slammed by instances running on AWS. They are 
> mostly making HEAD requests, which makes me think it's some kind of 
> crawler, but it uses regular browser user agents and doesn't respect my 
> robots rules.
>
> There's no legitimate reason for AWS to browse my site, so I just add a 
> firewall rule, right? Trouble is, AWS has 6,462 different IPV4 address 
> ranges, and this crawler is constantly jumping between them.
>
> Any costs that I have are paid out of my own pocket. So I'm looking for 
> suggestions that don't require MORE subscriptions (like CloudFlare or 
> something).
>
> Any ideas?
>
> -Joshua
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Google App Engine" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/google-appengine/ff8c1ffd-1579-4059-8cc7-30b39e32022cn%40googlegroups.com.