To start with, I can confirm that you are billed for all instances that are
running, whether or not they are actively serving requests.

I will attempt to respond to your questions, which I have listed below:

1. Is the information shared on this link[1] accurate?

A: It is not exactly clear which part of the information you are looking to
verify. However, I assume you are trying to confirm the explanation of
min_instances and min_idle_instances. If so, yes, the information is
accurate, as that wording was copied verbatim from the documentation[2][3].
If not, please reply to this thread.

2. If an auto-scaled service has min_instances set to nonzero, does that
mean that instances in old versions don't get shut down when you deploy a 
new version? And those instances get billed?

A: I believe this article[4] explains in detail how instances are managed,
particularly when scaling down. Scaling down of instances depends on a
decrease in request volume. Typically, the App Engine standard environment
scales down to 0[5] and, as explained here[6], if the scheduler decides to
shut down active instances because no requests are being handled, another
instance will not start until prompted by an external request, even with
min_instances set.

With all that being said, as explained in this documentation[7], the
default behavior of App Engine Standard is that whenever a new application
version is deployed, unless the --no-promote flag is used in the
deployment, the newly deployed version is automatically configured to
receive 100% of traffic. So, with no traffic going to the older version,
the scheduler would shut down its instances due to lack of requests, even
if min_instances is set to nonzero.

If you are experiencing different behavior, I suggest you reach out
directly to the GCP Support Engineers[8] so that the issue can be
evaluated more closely.

3. What is min_idle_instances actually supposed to mean?

A: As explained in the documentation[3], this is the number of instances
that are kept running and ready to serve traffic. The idle instances help
to avoid the effect of pending latency on your App Engine application.

As I alluded to above, instances are created whenever requests are
received. When an instance is created, it has to go through certain steps
to start up before it is ready to handle requests. These are explained in
this documentation[9][10]. Basically, having idle instances helps to avoid
the startup steps that would otherwise cause pending latency.
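
As a rough sketch of what this could look like in app.yaml, assuming the
python37 runtime, F4 instance class and warmup service from your own
configuration (the value 1 is only an illustrative assumption, not a
recommendation):

```yaml
runtime: python37
instance_class: F4
automatic_scaling:
  # Illustrative value: keep one instance running and ready to serve
  # traffic, so that a new request does not have to wait for a startup.
  min_idle_instances: 1
  max_instances: 10
inbound_services:
# Allows App Engine to send warmup requests to newly started instances.
- warmup
```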

4. Do I need to set max_idle_instances?

A: No, you do not need to set this parameter, as it is optional. Indeed,
the default value of max_idle_instances is automatic, which implies that
the maximum is determined by the App Engine autoscaler, depending in
particular on the number of requests being handled.
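
For illustration only, this is roughly how an explicit cap would look in
app.yaml if you did choose to set one. The values 1 and 2 are assumptions
made up for this example:

```yaml
automatic_scaling:
  min_idle_instances: 1
  # Optional. Omit this line, or set it to "automatic", to keep the
  # default behavior described above.
  max_idle_instances: 2
```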

I do not want to overwhelm you with too much information, so I hope this
response covers the details you need. If not, please reply with any
further questions.


[1]https://serverfault.com/questions/999892/app-engine-standard-auto-scaling-how-to-stop-previous-version-on-deployment/999900#999900
[2]https://cloud.google.com/appengine/docs/standard/python3/config/appref#automatic_scaling_min_instances
[3]https://cloud.google.com/appengine/docs/standard/python3/config/appref#min_idle_instances
[4]https://cloud.google.com/appengine/docs/standard/python/how-instances-are-managed#scaling_down
[5]https://stackoverflow.com/questions/51272392/how-to-scale-down-to-0-instances-in-gae-standard-go#answer-51291372
[6]https://issuetracker.google.com/162502284#comment2
[7]https://cloud.google.com/appengine/docs/standard/python/tools/uploadinganapp#deploying_an_app
[8]https://cloud.google.com/support-hub
[9]https://cloud.google.com/appengine/docs/standard/python/how-instances-are-managed#startup
[10]https://cloud.google.com/appengine/docs/standard/python/how-instances-are-managed#loading_requests
On Thursday, September 10, 2020 at 11:34:10 PM UTC-4 [email protected] 
wrote:

> First question: Is this accurate <https://serverfault.com/a/999900>? That 
> is, if an auto-scaled service has min_instances set to nonzero, does that 
> mean that instances in old versions don't get shut down when you deploy a 
> new version? And those instances get billed?
>
> I've been running a service with the following configuration (standard 
> environment, Python 3.7):
>
> runtime: python37
> instance_class: F4
> automatic_scaling:
>   min_instances: 1
>   max_instances: 10
> inbound_services:
> - warmup
>
> New versions are deployed frequently because it's our integration 
> environment. Apparently we've ended up with more and more instances 
> running, because when we deploy a new version, the old version continues to 
> exist and have 1 running instance. And apparently we're getting billed for 
> these. I don't think I should have expected this, based on a close reading 
> of the documentation 
> <https://cloud.google.com/appengine/docs/standard/python3/config/appref#scaling_elements>.
>  
> (I have opened a billing support request, since I think it's Google's error 
> in documentation, if not actually a bug.)
>
> So now I'm trying to fix the configuration to avoid this. Based on the 
> Server Fault article I linked above, I tried setting min_instances to 0 and 
> min_idle_instances to 1. This seems to result in always at least 2 
> instances running. I think maybe because one instance is getting requests 
> (we have a once-a-minute cron job, among other things), so it's not "idle", 
> so there has to be one more instance to have a minimum of one idle instance.
>
> So I tried setting both min_instances and min_idle_instances to 0, but I 
> *still* seem to always have at least 2 instances.
>
> It's really hard to tell though, because the GCP console sometimes takes a 
> bit to update, and maybe sometimes there are actually more instances than 
> the configuration requires (I think maybe they're not always billed?).
>
> So, second question: What is min_idle_instances actually supposed to mean? 
> Is it the minimum number of instances, or is it the minimum number of 
> "idle" instances, for some definition of "idle"? If an instance is serving 
> 1 simple query per minute, does that mean it's not idle, so there will be a 
> second, idle instance? But then there are 2 instances, and if there are a 
> few simple queries happening per minute, it seems like some might be 
> randomly routed to each instance, so neither instance would be idle, and a 
> third instance would get started.
>
> Another complication: in our production environment, I increased 
> min_instances to 2 because of this issue 
> <https://groups.google.com/g/google-appengine/c/SeVMpquMdnQ/m/tEr9prqLAQAJ>. 
> It's pretty important that we always (>99.9% anyway) have at least 1 
> running instance, since apparently there's no way to get instance startup 
> time to less than 20-30 seconds, and to guarantee that, apparently we need 
> 2 instances running most of the time, because they can get preempted at any 
> time without warming up a replacement first. So now I'm not sure whether to 
> set min_idle_instances to 1 or 2 or what in production.
>
> Do I need to set max_idle_instances? The documentation says its default 
> value is "automatic", but doesn't say what that actually means.
>
> I'm having a hard time figuring out these issues based on the 
> documentation.
>
> All I want is
>
>    - Instances get shut down when I deploy a new version (I would have 
>    thought this was always the case no matter what!)
>    - Each of my environments normally just has 1 instance running, 
>    assuming light traffic (1-10 queries per minute). Having 2 instances 
>    always running in production is ok, if that's the only way to achieve 
>    the next point:
>    - Never (or almost never) have zero instances running (aka almost 
>    never have a query take more than 2 seconds because of warmup time)
>    - Autoscale up to a reasonable maximum if traffic gets heavier
>
> I didn't think this would be so difficult. I've been using App Engine for 
> a long time, and thought I knew what I was doing, but I guess I've never 
> used these options in the current environment (Python 3, standard).
>

-- 
You received this message because you are subscribed to the Google Groups 
"Google App Engine" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/google-appengine/bab3f439-8106-496b-ae7d-f482cc15e009n%40googlegroups.com.
