Thanks for this conversation; it clarified some matters for me. Is there, then, a way to tell the App Engine Standard Environment to apply the min_instances setting only to the *default* app version, especially if 100% of traffic is now directed at that default version? I'd like to avoid older app versions running unused instances.
Thanks,
Boris

On Monday, September 28, 2020 at 5:43:09 PM UTC-4 [email protected] wrote:

> tl;dr: *Never use min_instances!* It will just increase your bill
> unnecessarily.
>
> On Thursday, September 17, 2020 at 2:26:50 PM UTC-4 Olu wrote:
>
>> To start with, I can confirm that you would be billed for all instances
>> in use, whether or not they are actively serving requests.
>>
>> I will attempt to respond to your inquiries as I have highlighted them
>> below:
>>
>> 1. Is the information shared on this link[1] accurate?
>>
>> A: It is not exactly clear which part of the information you are looking
>> to verify. However, I assume you are trying to confirm the explanation
>> about min_instances and min_idle_instances. If so, yes, the information is
>> accurate, as those words were copied verbatim from the documentation[2][3].
>> If not, please reply to this thread.
>
> Sorry, I guess I was referring to something implied by that link, not
> directly stated, which is that setting min_instances to 1 or more will
> result in instances never getting shut down in old versions, even if they
> are not receiving traffic.
>
>> 2. If an auto-scaled service has min_instances set to nonzero, does that
>> mean that instances in old versions don't get shut down when you deploy a
>> new version? And those instances get billed?
>>
>> A: I believe this article[4] explains in detail how instances are
>> managed, particularly on scaling down. Scaling down instances depends on
>> the decrease in request volume. Typically, the App Engine Standard
>> environment scales down to 0[5] and, as explained here[6], if the scheduler
>> decides to shut down active instances due to lack of requests being
>> handled, another instance will not start until prompted by an external
>> request, even with min_instances set.
>>
>> With all that being said, as explained in this documentation[7], the
>> default behavior of App Engine Standard is that whenever a new application
>> version is deployed, unless the --no-promote flag is used in the
>> deployment, the newly deployed version is automatically configured to
>> receive 100% of traffic. So, with no traffic to the older version, the
>> scheduler would shut down the instances due to lack of requests, even if
>> min_instances is set to nonzero.
>
> Obviously setting min_instances overrides the default behavior of scaling
> down to zero. And I'm now convinced that, with min_instances set to
> nonzero, it doesn't scale down to zero even in obsolete versions that are
> set to receive no traffic. This isn't documented behavior, but I've seen it
> implied elsewhere (like the Server Fault page above), and it was more or
> less confirmed by the agent who handled my billing complaint (they checked
> with support engineers, I believe).
>
> (As the documentation mentions, "For this feature to function properly,
> you must make sure that warmup requests are enabled and that your
> application handles warmup requests." So some users may have min_instances
> set to more than zero, but not see the above problem, because their app is
> not actually configured correctly to maintain a minimum number of
> instances. I made this mistake for a while.)
>
>> If you are experiencing a different behavior, I suggest you reach out
>> directly to the GCP Support Engineers[8] for better evaluation of the
>> issue.
>
> So apparently I have to pay for a support plan just to get information
> that should be in the documentation...
>
>> 3. What is min_idle_instances actually supposed to mean?
>>
>> A: As explained in the documentation[3], this is the number of instances
>> that keep running and are ready to serve traffic. Idle instances help to
>> avoid the effect of pending latency on your App Engine application.
>
> I noticed that this documentation has recently been updated (maybe partly
> in response to my complaints?). It now says "The number of *additional*
> instances..." (emphasis mine), and goes on to explain that by "additional
> instances", it means that App Engine calculates the "necessary" number of
> instances to serve current load, and adds min_idle_instances more
> instances on top of that. So it does *not* mean that there will always be
> this many "idle" instances (for some definition of "idle"), as the name
> would imply. The new documentation is a big improvement. (new version
> <https://cloud.google.com/appengine/docs/standard/python3/config/appref#min_idle_instances>
> / old version
> <https://web.archive.org/web/20200504082931if_/https://cloud.google.com/appengine/docs/standard/python3/config/appref>)
>
> (Still waiting for the min_instances documentation to be updated to warn
> about the danger of zombie instances.)
>
>> As I may have alluded to above, instances are created whenever requests
>> are received. When instances are created, there are certain steps that
>> apply for the instance to start up and be ready to attend to requests.
>> These are explained in this documentation[9][10]. Basically, having idle
>> instances helps to avoid such steps that would cause pending latency.
>>
>> 4. Do I need to set max_idle_instances?
>>
>> A: No, you do not need to set this parameter, as it is optional.
>> Indeed, the default value of max_idle_instances is automatic, which
>> implies that the maximum is determined by the App Engine autoscaler,
>> depending particularly on the number of requests being handled.
>
> Sorry, I was imprecise. I wasn't asking if it's required. I was asking if
> I *should* set it, i.e., if I might get surprises in my bill if it's not
> set, or anything.
> It is still not clear to me what it actually means, since
> there's no clear definition of "idle" provided, and anyway because of the
> previous confusion over min_idle_instances, I don't want to assume that it
> has anything to do with instances that are "idle". The current
> documentation implies that it has something to do with how rapidly
> instances will be scaled down after a traffic peak, but doesn't give me any
> way to quantitatively predict how rapidly it would scale down for a
> particular value. Anyway, I'm not setting this for now.
>
> For anyone reading this who's curious, this is my new production app.yaml
> file:
>
>     runtime: python37
>     instance_class: F4
>     automatic_scaling:
>       min_instances: 0
>       min_idle_instances: 1
>       max_instances: 10
>     inbound_services:
>     - warmup
>
> With this configuration, there are always at least 2 instances of the
> current version. We always have a minimum of 1 request per minute (from a
> cron job); I assume it would probably scale down to 1 instance if we went a
> sufficiently long time with no requests at all (I have no idea how long it
> would take). Old versions do scale down to zero instances, though sometimes
> it takes a while. For our integration and staging environments, we set
> min_idle_instances to zero and max_instances to 2, and there is always at
> least one instance (presumably it would scale down to zero if given a
> chance).
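
For readers trying to follow the arithmetic in the quoted message, here is a rough sketch of the "necessary instances plus min_idle_instances" reading of the documentation. It is a toy model only, not the real App Engine scheduler: the function name expected_instances and the parameter needed_for_load are hypothetical, and treating min_instances as a floor and max_instances as a cap is an assumption made for illustration. It covers only a version that is currently receiving traffic.

```python
def expected_instances(needed_for_load, min_instances=0,
                       min_idle_instances=0, max_instances=None):
    """Toy estimate of the running instance count for a serving version.

    Assumption (not confirmed scheduler behavior): the autoscaler adds
    min_idle_instances on top of what current load needs, never goes
    below min_instances, and never goes above max_instances.
    """
    total = max(min_instances, needed_for_load + min_idle_instances)
    if max_instances is not None:
        total = min(total, max_instances)
    return total


# Production settings from the message: light cron traffic needs 1 instance.
print(expected_instances(1, min_instances=0, min_idle_instances=1,
                         max_instances=10))  # prints 2

# Staging settings from the message: no extra idle instance, cap of 2.
print(expected_instances(1, min_instances=0, min_idle_instances=0,
                         max_instances=2))  # prints 1
```

Under these assumptions the numbers match what the message reports (2 instances in production, 1 in staging), but the model says nothing about how quickly instances are scaled down or about versions that receive no traffic.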

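
The quoted documentation says the application must handle warmup requests, and the app.yaml in the message enables them with inbound_services. A minimal sketch of what such a handler might look like follows, assuming a plain WSGI application using only the standard library. The callable name app and the response bodies are hypothetical; /_ah/warmup is the path App Engine uses for warmup requests. The poster's actual application code is not shown in the thread.

```python
def app(environ, start_response):
    """Minimal WSGI app that answers App Engine warmup requests.

    Hypothetical example: a real app would load caches, open
    connections, or import heavy modules in the warmup branch.
    """
    path = environ.get("PATH_INFO", "")
    headers = [("Content-Type", "text/plain; charset=utf-8")]

    if path == "/_ah/warmup":
        # Do any expensive initialization here, then report success.
        start_response("200 OK", headers)
        return [b"warmed up"]

    # Placeholder for the application's normal request handling.
    start_response("200 OK", headers)
    return [b"hello"]
```

Returning a successful response from the warmup path is the part that matters here; what work is worth doing during warmup depends on the application.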