True story

One day in December some years ago, customers called in complaining that our 
service wasn't responding. A quick look in the console showed that we had 
gone over our spending limit. Why? Digging deeper, we could see that we had 
1000+ instances running (standard, Java, autoscaling, and we hadn't 
specified a maximum number of instances). After some troubleshooting and 
help from Google support we found the reason. It turned out that we had hit 
the ceiling of a socket_connect quota. For each request to our service we 
send a Pub/Sub message, and the Pub/Sub library we used could not batch 
messages, so each message sent was literally a socket_connect.
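
To illustrate the batching point, here is a minimal sketch in Python of a 
buffer that groups messages so that one outbound call carries many of them. 
It is not the library we used: MessageBatcher, send_batch and max_batch are 
hypothetical names, and send_batch stands in for whatever call would publish 
a whole list of messages over a single connection.

```python
from typing import Callable, List


class MessageBatcher:
    """Buffers messages and hands them to `send_batch` in groups.

    `send_batch` is a hypothetical stand-in for a call that publishes a
    list of messages over one connection; it is not a real library API.
    """

    def __init__(self, send_batch: Callable[[List[bytes]], None],
                 max_batch: int = 100) -> None:
        if max_batch < 1:
            raise ValueError("max_batch must be at least 1")
        self._send_batch = send_batch
        self._max_batch = max_batch
        self._buffer: List[bytes] = []

    def publish(self, message: bytes) -> None:
        """Queue one message; send the batch once it is full."""
        self._buffer.append(message)
        if len(self._buffer) >= self._max_batch:
            self.flush()

    def flush(self) -> None:
        """Send whatever is buffered as a single batch, if anything."""
        if not self._buffer:
            return
        batch, self._buffer = self._buffer, []
        self._send_batch(batch)


def demo_batching() -> List[int]:
    """Publish 250 messages and return the size of each outbound batch."""
    sizes: List[int] = []
    batcher = MessageBatcher(lambda batch: sizes.append(len(batch)),
                             max_batch=100)
    for i in range(250):
        batcher.publish(f"request-{i}".encode())
    batcher.flush()  # send the final, partially filled batch
    return sizes
```

With 250 messages and max_batch=100 this makes three outbound calls instead 
of 250. A real service would also need to flush on a timer and on shutdown 
so that buffered messages are not held back or lost.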

After we hit the socket_connect quota limit, the calls to Pub/Sub were 
timing out, and the timeout value was too high (we were using the library 
default), so the autoscaler's scheduler kept starting new instances in an 
endless loop.
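
One common way to keep slow downstream calls from tying up instances is to 
fail fast once the dependency is clearly broken. The sketch below is a small 
circuit breaker in Python. It is an illustration under assumptions, not what 
we ran: CircuitBreaker, CircuitOpenError, max_failures and reset_after are 
hypothetical names, and whether it would have prevented this particular 
incident is not something I can claim.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected without contacting the dependency."""


class CircuitBreaker:
    """Stops calling a failing dependency for `reset_after` seconds.

    All names here are hypothetical; `clock` is injectable for testing.
    """

    def __init__(self, max_failures: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._max_failures = max_failures
        self._reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, func: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_after:
                # Still cooling down: reject immediately instead of waiting
                # for another timeout.
                raise CircuitOpenError("dependency marked as unavailable")
            # Cool-down elapsed: allow one trial call. A single further
            # failure reopens the circuit.
            self._opened_at = None
            self._failures = self._max_failures - 1
        try:
            result = func()
        except Exception:
            self._failures += 1
            if self._failures >= self._max_failures:
                self._opened_at = self._clock()
            raise
        self._failures = 0  # a success closes the circuit again
        return result
```

The call passed to the breaker would still need its own short, explicit 
timeout rather than the library default; the breaker only limits how many 
requests pay that timeout before the rest are rejected right away.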

When this happened everything froze, and we couldn't kill instances faster 
than the scheduler spun up new ones, kind of a catch-22 situation.

To get it fixed we had to ask Google to bump the socket_connect quota, and 
then we had to raise our spending limit to get everything working again.

In a situation like this, the only thing that kept us from going bankrupt 
was the spending limit. The irony here is that the suggestion about using 
"Budget Alerts, Pub/Sub, and Cloud Functions when your costs exceed the 
threshold" won't work if you're not allowed to call Pub/Sub because of a 
stupid socket_connect quota.

We probably stumbled on an edge case involving a combination of old 
libraries / APIs and hidden quota limits (I can't even find the 
socket_connect quota on the quota pages in the console anymore). However, my 
point here is that the spending limit is the last outpost we have for 
actually setting a cost limit, and it's sad to see it going away, just like 
the rest of the dismantling of what used to be a great service.
 


 

On Wednesday, August 26, 2020 at 22:47:44 UTC+2, yananc wrote:

> Hello Joshua,
>
> The budget alert will be triggered once your costs rise above the 
> threshold you specify. However, as Alexis has explained, there are various 
> factors that might affect ‘how quickly’ you will receive the alert.
>
> Same as the email you shared, the documentation [1] provides details on 
> how to manage App Engine costs. Specifically, besides mechanisms such as 
> setting the max number of instances, Budget Alerts, Pub/Sub, and Cloud 
> Functions could be used to automatically disable your app when your costs 
> exceed the threshold you specify. The documentation also provides steps on 
> how to implement it, with sample code.
>
> Back to the issue of ‘Cloud Datastore Read Operations’ being too high, a 
> possible solution is to use a caching mechanism to avoid excessive 
> operations. You may find more information in the topic [2].
>
> Hope it helps.
>
> [1]: https://cloud.google.com/appengine/docs/managing-costs
>
> [2]: https://stackoverflow.com/questions/12939376/google-app-engine-too-many-datastore-read-operations
>
> On Tuesday, August 25, 2020 at 6:51:49 PM UTC-4 Luca de Alfaro wrote:
>
>> Yes, at least, we can hard limit the number of active instances: see 
>> https://cloud.google.com/appengine/docs/standard/python3/config/appref
>> So if every active instance has a limited rate of use of backend services 
>> (like datastore), and there are no services accessible except via an 
>> appengine instance (e.g., no GCS direct bandwidth), in practice we can put 
>> a bound using that. 
>>
>> Luca
>>
>> On Tue, Aug 25, 2020 at 3:45 PM Luca de Alfaro <[email protected]> wrote:
>>
>>> I concur with the worry.  Is there any _technical_ reason why it is a 
>>> good idea to do away with a spending limit?  Can we get an instance limit 
>>> instead? 
>>> This is suddenly making standard non-scalable systems on AWS look better 
>>> than appengine! 
>>>
>>> On Tue, Aug 25, 2020 at 9:03 AM Joshua Smith <[email protected]> 
>>> wrote:
>>>
>>>> Once again last night, my wallet was saved when a runaway bot chewed up 
>>>> my site’s whole daily spending limit. I got an email from a user, set up a 
>>>> firewall rule, and goosed my budget to get things going again.
>>>>
>>>> I’m *very* concerned about Google’s decision to remove this feature. 
>>>> Offering a cloud service that bills by usage without having a way to limit 
>>>> the spend shifts an unreasonable amount of risk onto the subscriber.
>>>>
>>>> I’ve set up budget alerts, as suggested, but I’m concerned that:
>>>>
>>>> - What if my bill shoots up really fast? How quickly is this alert 
>>>> going to go out?
>>>>
>>>> - What if I am away from the computer (remember when we used to be able 
>>>> to leave our houses? good times… good times…)?
>>>>
>>>> I run this particular site as a not-for-profit social good. (It’s a 
>>>> site that small town governments use to post their meetings.) I make 
>>>> *no* money on it.
>>>>
>>>> I’d be perfectly happy to handle this with self-set quotas on something 
>>>> other than dollars. For example, in my case the budget-buster is always 
>>>> “Cloud Datastore Read Operations.” If I could set a cap on that one thing, 
>>>> it’d give me the protection I need.
>>>>
>>>> -Joshua
>>>>
>>>> -- 
>>>> You received this message because you are subscribed to the Google 
>>>> Groups "Google App Engine" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>> an email to [email protected].
>>>> To view this discussion on the web visit 
>>>> https://groups.google.com/d/msgid/google-appengine/FC0F0C74-0D40-48DF-8919-208202A9B1A8%40gmail.com
>>>>  
>>>> <https://groups.google.com/d/msgid/google-appengine/FC0F0C74-0D40-48DF-8919-208202A9B1A8%40gmail.com?utm_medium=email&utm_source=footer>
>>>> .
>>>>
>>>
