On the particular topic of Cloud Datastore Read Operations being the cost driver: unfortunately, caching doesn't help in my case.
This site has about 60,000 individual meetings listed, and I like having *useful*, *well-behaved* crawlers find all of them, so people can use search engines to find meetings that happened in various towns. But what keeps happening is that some new useless or poorly-behaved crawler decides to read all 60,000 of those meetings as fast as it possibly can, ignoring the Crawl-delay directive. Caching in that situation would just add *more* cost, since a crawler like that never generates repeat hits.

I periodically look at my access logs, find the traffic peaks, inspect the requests, and figure out which new bot needs to be added to my robots.txt disallow list. And sometimes I have to add a firewall rule, because that bot isn't obeying robots.txt at all.

-Joshua

> On Aug 26, 2020, at 4:47 PM, 'yananc' via Google App Engine
> <[email protected]> wrote:
>
> Back to the issue of 'Cloud Datastore Read Operations' being too high, a
> possible solution is to leverage a cache mechanism to avoid excessive
> operations. You may find more information in the topic [2].
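For reference, a robots.txt entry of the kind described above might look like this (the bot name is made up; you would substitute whatever User-agent string shows up in your logs):

```
# Hypothetical misbehaving bot -- block it entirely.
User-agent: ExampleBadBot
Disallow: /

# A polite default for everyone else: crawl, but slowly.
User-agent: *
Crawl-delay: 10
```

Note that Crawl-delay is a non-standard directive that some major crawlers (Googlebot, for one) ignore, and a truly rude bot ignores robots.txt altogether, which is where the firewall rules come in.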

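The log-triage loop described above (find a traffic peak, identify the offending user-agent) can be sketched roughly like this. The log lines and bot names below are made up for illustration; a real run would read lines from your access-log file, which is assumed here to be in the common "combined" format where the user-agent is the last quoted field:

```python
from collections import Counter
import re

# Hypothetical access-log lines; paths, IPs, and agents are made up.
LOG_LINES = [
    '1.2.3.4 - - [26/Aug/2020:16:47:00 +0000] "GET /meeting/1 HTTP/1.1" 200 512 "-" "BadBot/1.0"',
    '1.2.3.4 - - [26/Aug/2020:16:47:01 +0000] "GET /meeting/2 HTTP/1.1" 200 512 "-" "BadBot/1.0"',
    '5.6.7.8 - - [26/Aug/2020:16:47:02 +0000] "GET /meeting/3 HTTP/1.1" 200 512 "-" "Googlebot/2.1"',
]

def top_user_agents(lines, n=5):
    """Count requests per user-agent (the last double-quoted field)."""
    agents = Counter()
    for line in lines:
        quoted = re.findall(r'"([^"]*)"', line)
        if quoted:
            agents[quoted[-1]] += 1
    return agents.most_common(n)

print(top_user_agents(LOG_LINES))
# [('BadBot/1.0', 2), ('Googlebot/2.1', 1)]
```

Whatever agent dominates an unexplained peak is the candidate for the next robots.txt disallow entry (or firewall rule).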