On the particular topic of Cloud Datastore Read Operations being the cost driver: unfortunately, caching doesn't help in my case.
This site has about 60,000 individual meetings listed, and I like having *useful*, *well-behaved* crawlers find all of them, so people can use search engines to find meetings that happened in various towns. But what keeps happening is that some new useless or poorly-behaved crawler decides to read all 60,000 of those meetings as fast as it possibly can, ignoring the Crawl-delay directive. Caching in that situation would just add *more* cost, since a crawler like that never generates repeat hits.

I periodically look at my access logs, find the traffic peaks, inspect the requests, and figure out which new bot needs to be added to my robots.txt disallow list. And sometimes I have to add a firewall rule, because that bot isn't obeying robots.txt at all.

-Joshua

> On Aug 26, 2020, at 4:47 PM, 'yananc' via Google App Engine
> <[email protected]> wrote:
>
> Back to the issue of 'Cloud Datastore Read Operations' being too high, a
> possible solution is to leverage a cache mechanism to avoid excessive
> operations. You may find more information in the topic [2].
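For reference, a robots.txt entry of the kind described above might look like this (the bot name is made up; you would substitute whatever User-agent string shows up in your logs):

```
# Hypothetical misbehaving bot -- block it entirely.
User-agent: ExampleBadBot
Disallow: /

# A polite default for everyone else: crawl, but slowly.
User-agent: *
Crawl-delay: 10
```

Note that Crawl-delay is a non-standard directive that some major crawlers (Googlebot, for one) ignore, and a truly rude bot ignores robots.txt altogether, which is where the firewall rules come in.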

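The log-triage loop described above (find a traffic peak, identify the offending user-agent) can be sketched roughly like this. The log lines and bot names below are made up for illustration; a real run would read lines from your access-log file, which is assumed here to be in the common "combined" format where the user-agent is the last quoted field:

```python
from collections import Counter
import re

# Hypothetical access-log lines; paths, IPs, and agents are made up.
LOG_LINES = [
    '1.2.3.4 - - [26/Aug/2020:16:47:00 +0000] "GET /meeting/1 HTTP/1.1" 200 512 "-" "BadBot/1.0"',
    '1.2.3.4 - - [26/Aug/2020:16:47:01 +0000] "GET /meeting/2 HTTP/1.1" 200 512 "-" "BadBot/1.0"',
    '5.6.7.8 - - [26/Aug/2020:16:47:02 +0000] "GET /meeting/3 HTTP/1.1" 200 512 "-" "Googlebot/2.1"',
]

def top_user_agents(lines, n=5):
    """Count requests per user-agent (the last double-quoted field)."""
    agents = Counter()
    for line in lines:
        quoted = re.findall(r'"([^"]*)"', line)
        if quoted:
            agents[quoted[-1]] += 1
    return agents.most_common(n)

print(top_user_agents(LOG_LINES))
# [('BadBot/1.0', 2), ('Googlebot/2.1', 1)]
```

Whatever agent dominates an unexplained peak is the candidate for the next robots.txt disallow entry (or firewall rule).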