Note that I've had similar-looking issues <https://groups.google.com/g/google-appengine/c/eJUQ7NlNkso/m/3xav-lCEBAAJ> in the past, though that was under very different conditions and it appeared to get fixed.
On Friday, October 2, 2020 at 11:54:49 AM UTC-4 Alan deLespinasse wrote:
> Not sure what you mean by the default version. There's a default
> *service*, but generally you have a different configuration file for each
> service, so it's easy enough to have a different value of min_instances for
> each one.
>
> My best advice for now is not to use min_instances; use min_idle_instances
> instead. Unfortunately, setting min_idle_instances can result in a larger
> number of running instances than setting min_instances to the same number.
> For example, if min_idle_instances is 2, and you have a pretty low level of
> traffic that could easily be handled by one instance, then you'll have 3
> instances. Whereas if min_instances were 2, you'd have only 2 instances
> under the same conditions.
>
> The other option is probably some kind of scripting in your release
> process to delete instances of old versions, or even to delete all old
> versions (which I wouldn't want to do, because having old versions still
> around is very useful if you need to do a quick rollback). I'm not really
> sure what's possible.
>
> On Friday, October 2, 2020 at 11:35:20 AM UTC-4 Boris Brudnoy wrote:
>
>> Thanks for this conversation, it clarified some matters for me.
>>
>> Is there, then, a way to tell the App Engine Standard Environment to only
>> apply the min_instances setting to the *default* app version, especially
>> if 100% of traffic is now directed at that default version? I'd like to
>> avoid the scenario of older app versions running unused instances.
>>
>> Thanks,
>>
>> Boris
>>
>> On Monday, September 28, 2020 at 5:43:09 PM UTC-4 [email protected]
>> wrote:
>>
>>> tl;dr: *Never use min_instances!* It will just increase your bill
>>> unnecessarily.
>>>
>>> On Thursday, September 17, 2020 at 2:26:50 PM UTC-4 Olu wrote:
>>>
>>>> To start with, I can confirm that you would be billed for all Instances
>>>> in use, whether or not they are actively serving requests or traffic.
>>>>
>>>> I will attempt to respond to your inquiries as I have highlighted them
>>>> below:
>>>>
>>>> 1. Is the information shared on this link[1] accurate?
>>>>
>>>> A: It is not exactly clear which part of the information you are
>>>> looking to verify. However, I assume you are trying to confirm the
>>>> explanation about min_instances and min_idle_instances. If so, yes, the
>>>> information is accurate as those words were copied verbatim from the
>>>> Documentation[2][3]. If not, please reply to this thread.
>>>
>>> Sorry, I guess I was referring to something implied by that link, not
>>> directly stated, which is that setting min_instances to 1 or more will
>>> result in instances never getting shut down in old versions, even if they
>>> are not receiving traffic.
>>>
>>>> 2. If an auto-scaled service has min_instances set to nonzero, does
>>>> that mean that instances in old versions don't get shut down when you
>>>> deploy a new version? And those instances get billed?
>>>>
>>>> A: I believe this article[4] explains in detail how Instances are
>>>> managed, particularly on scaling down. Scaling down Instances depends
>>>> on the decrease in request volume. Typically, the App Engine Standard
>>>> environment scales down to 0[5] and, as explained here[6], if the
>>>> scheduler decides to shut down active instances due to lack of requests
>>>> being handled, another instance will not start until prompted by an
>>>> external request, even with min_instances set.
>>>>
>>>> With all that being said, as explained in this documentation[7], the
>>>> default behavior of App Engine Standard is that whenever a new application
>>>> version is deployed, unless the --no-promote flag is used in the
>>>> deployment, the newly deployed version is automatically configured to
>>>> receive 100% of traffic.
>>>> So, with no traffic to the older version, the
>>>> scheduler would shut down the instances due to lack of requests, even if
>>>> min_instances is set to nonzero.
>>>
>>> Obviously setting min_instances overrides the default behavior of
>>> scaling down to zero. And I'm now convinced that, with min_instances set to
>>> nonzero, it doesn't scale down to zero even in obsolete versions that are
>>> set to receive no traffic. This isn't documented behavior, but I've seen it
>>> implied elsewhere (like the Server Fault page above), and it was more or
>>> less confirmed by the agent who handled my billing complaint (they checked
>>> with support engineers, I believe).
>>>
>>> (As the documentation mentions, "For this feature to function properly,
>>> you must make sure that warmup requests are enabled and that your
>>> application handles warmup requests." So some users may have min_instances
>>> set to more than zero, but not see the above problem, because their app is
>>> not actually configured correctly to maintain a minimum number of
>>> instances. I made this mistake for a while.)
>>>
>>>> If you are experiencing a different behavior, I suggest you reach out
>>>> directly to the GCP Support Engineers[8] for better evaluation of the
>>>> issue.
>>>
>>> So apparently I have to pay for a support plan just to get information
>>> that should be in the documentation...
>>>
>>>> 3. What is min_idle_instances actually supposed to mean?
>>>>
>>>> A: As explained in the documentation[3], this is the number of
>>>> instances that keep running and ready to serve traffic. The idle
>>>> Instances help to avoid the effect of pending latency on your App Engine
>>>> application.
>>>
>>> I noticed that this documentation has recently been updated (maybe
>>> partly in response to my complaints?). It now says "The number of
>>> *additional* instances..."
>>> (emphasis mine), and goes on to explain that
>>> by "additional instances", it means that App Engine calculates the
>>> "necessary" number of instances to serve current load, and adds on
>>> min_idle_instances more instances. So it does *not* mean that there
>>> will always be this many "idle" instances (for some definition of "idle"),
>>> as the name would imply. The new documentation is a big improvement. (new
>>> version
>>> <https://cloud.google.com/appengine/docs/standard/python3/config/appref#min_idle_instances>
>>> / old version
>>> <https://web.archive.org/web/20200504082931if_/https://cloud.google.com/appengine/docs/standard/python3/config/appref>
>>> )
>>>
>>> (Still waiting for the min_instances documentation to be updated to warn
>>> about the danger of zombie instances.)
>>>
>>>> As I may have alluded to above, Instances are created whenever requests
>>>> are received. When instances are created, there are certain steps that
>>>> apply for the Instance to start up and be ready to attend to requests.
>>>> These are explained in these documents[9][10]. Basically, having Idle
>>>> instances helps to avoid such steps that would cause pending latency.
>>>>
>>>> 4. Do I need to set max_idle_instances?
>>>>
>>>> A: No, you do not need to set this parameter as it is Optional.
>>>> Indeed, the default value of max_idle_instances is automatic, which
>>>> implies that the max is determined by the App Engine Autoscaler depending
>>>> particularly on the number of requests being handled.
>>>
>>> Sorry, I was imprecise. I wasn't asking if it's required. I was asking
>>> if I *should* set it, i.e., if I might get surprises in my bill if it's
>>> not set, or anything. It is still not clear to me what it actually means,
>>> since there's no clear definition of "idle" provided, and anyway because of
>>> the previous confusion over min_idle_instances, I don't want to assume that
>>> it has anything to do with instances that are "idle".
>>> The current documentation implies that it has something to do with how
>>> rapidly instances will be scaled down after a traffic peak, but doesn't
>>> give me any way to quantitatively predict how rapidly it would scale down
>>> for a particular value. Anyway I'm not setting this for now.
>>>
>>> For anyone reading this who's curious, this is my new production
>>> app.yaml file:
>>>
>>> runtime: python37
>>> instance_class: F4
>>> automatic_scaling:
>>>   min_instances: 0
>>>   min_idle_instances: 1
>>>   max_instances: 10
>>> inbound_services:
>>> - warmup
>>>
>>> With this configuration, there are always at least 2 instances of the
>>> current version. We always have a minimum of 1 request per minute (from a
>>> cron job); I assume it would probably scale down to 1 instance if we went a
>>> sufficiently long time with no requests at all (I have no idea how long it
>>> would take). Old versions do scale down to zero instances, though sometimes
>>> it takes a while. For our integration and staging environments, we set
>>> min_idle_instances to zero and max_instances to 2, and there is always at
>>> least one instance (presumably it would scale down to zero if given a
>>> chance).

-- 
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/google-appengine/1589fe8b-fb20-44b3-ae54-fac9277429c8n%40googlegroups.com.
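
For anyone trying to follow the instance counts quoted in this thread, here is a rough sketch of the arithmetic as the thread describes it, not of the real scheduler. It assumes min_instances acts as a floor and min_idle_instances is added on top of whatever App Engine considers necessary for the current load. The function names and the `needed` parameter are made up for illustration, and the sketch only covers the version that is actually receiving traffic (the thread reports that old versions under min_idle_instances do eventually scale to zero, which this does not model).

```python
def count_with_min_instances(needed: int, min_instances: int) -> int:
    """Assumed model: min_instances is a floor on the running instance count."""
    return max(needed, min_instances)


def count_with_min_idle_instances(needed: int, min_idle_instances: int) -> int:
    """Assumed model: min_idle_instances is added on top of the needed count."""
    return needed + min_idle_instances


if __name__ == "__main__":
    # Low traffic that one instance could handle, as in the example in the thread.
    needed = 1
    print("min_instances=2      ->", count_with_min_instances(needed, 2))
    print("min_idle_instances=2 ->", count_with_min_idle_instances(needed, 2))
    # The production app.yaml in the thread uses min_idle_instances: 1.
    print("min_idle_instances=1 ->", count_with_min_idle_instances(needed, 1))
```

Under these assumptions the numbers line up with what is reported above: 2 instances with min_instances: 2, 3 with min_idle_instances: 2, and 2 with the min_idle_instances: 1 production config.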

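The thread quotes the documentation's requirement that the application handle warmup requests, and the app.yaml above enables them with inbound_services: warmup. A minimal sketch of the handling side might look like the following, assuming a plain WSGI callable named `app` in main.py and assuming App Engine delivers warmup requests as GET /_ah/warmup. The `warm_caches` helper is a hypothetical placeholder; nothing in the thread shows the real handler.

```python
WARMUP_PATH = "/_ah/warmup"  # path App Engine is assumed to request on warmup


def warm_caches() -> None:
    """Hypothetical placeholder for app-specific initialization work."""
    return None


def app(environ, start_response):
    """Minimal WSGI app: answer warmup requests with 200, serve a stub otherwise."""
    path = environ.get("PATH_INFO", "/")
    if path == WARMUP_PATH:
        warm_caches()
        body = b""
    else:
        body = b"Hello"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

In a real app the same route would normally live in whatever framework the service already uses; the stdlib-only version here is just to keep the sketch self-contained.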