This new 15-minute description makes it harder to understand, but it seems to be the same system as before. An instance dies if there is no traffic for 15 minutes, and you are charged for those idle 15 minutes. It's that simple.
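To make the cost concrete, here is how I understand the billing for a single bursty app: the 15-minute idle limit comes from the FAQ, but the single-instance assumption and the per-minute granularity are my own simplifications, not official numbers.

```python
# Rough model of the 15-minute idle shutdown: a single instance stays up
# through any gap shorter than 15 minutes (and you pay for that idle time);
# after 15 minutes with no traffic it is shut down. Per-minute granularity
# and the one-instance assumption are simplifications for illustration.

IDLE_LIMIT = 15  # minutes of idleness before the instance is reclaimed

def billed_minutes(request_times):
    """request_times: sorted minute offsets at which requests arrive."""
    if not request_times:
        return 0
    total = 0
    session_start = prev = request_times[0]
    for t in request_times[1:]:
        if t - prev >= IDLE_LIMIT:
            # Gap long enough that the instance was shut down: pay for the
            # session plus the 15 idle minutes that preceded the shutdown.
            total += (prev - session_start) + IDLE_LIMIT
            session_start = t
        prev = t
    total += (prev - session_start) + IDLE_LIMIT
    return total

# A burst at minute 0, a 4-minute gap, one more request, then silence:
# the 4-minute gap is billed as idle time (cf. the FAQ example), plus the
# trailing 15 idle minutes before shutdown.
print(billed_minutes([0, 1, 5]))  # 5 active/idle minutes + 15 trailing = 20
```

So for bursty traffic the trailing idle time dominates: three requests in five minutes still cost twenty instance-minutes under this reading.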
On Jun 24, 5:12 pm, Barry Hunter <[email protected]> wrote:
> On Fri, Jun 24, 2011 at 8:47 AM, Vinuth Madinur
> <[email protected]> wrote:
> > Disappointed that developers have to start tinkering with the
> > underlying scheduler.
> > And the 15-minute "startup" fee is worse than the earlier 15-minute
> > "minimum" granularity. If your traffic is bursty (an instance is
> > required for less than 15 minutes), you always pay more than twice the
> > amount of resources you used.
>
> Has that actually changed? It's the same thing, just stated in a
> slightly different way.
>
> You are not always charged the 15-minute 'fee'. In the example in the
> FAQ, you are only charged 4 minutes for the intermediate gap, not a
> full 15 minutes.
>
> > "Usage Based Pricing"?
> > And no intermediate plan without an SLA.
> > Disappointed.
>
> > Just to clarify: does "half-instance" for Python mean 48 half-instance
> > hours under the free quota?
>
> > ~Vinuth.
>
> > On Fri, Jun 24, 2011 at 12:19 PM, Gregory D'alesandre
> > <[email protected]> wrote:
>
> >> Hello All! Well, it took longer than expected, but here is the
> >> updated FAQ! I highlighted the new sections; it covers how Always On
> >> will work, a full explanation of datastore API prices, a description
> >> of the new scheduler knobs, and a description of what is needed to
> >> prepare for Python 2.7 and concurrent requests. I hope this helps
> >> clarify some of the bigger questions people had, and as always please
> >> let me know if you have additional questions. Thanks!
> >> Greg D'Alesandre
> >> Senior Product Manager, Google App Engine
> >> ------
>
> >> Post-Preview Pricing FAQ
>
> >> When Google App Engine leaves Preview in the second half of 2011, the
> >> pricing will change. Details are listed here:
> >> http://www.google.com/enterprise/appengine/appengine_pricing.html
> >> This FAQ is intended to help answer some of the frequently asked
> >> questions about the new model.
> >> Definitions
>
> >> Instance: A small virtual environment to run your code with a
> >> reserved amount of CPU and Memory.
> >> Frontend Instance: An Instance running your code and scaling
> >> dynamically based on the incoming requests, but limited in how long a
> >> request can run.
> >> Backend Instance: An Instance running your code with limited scaling
> >> based on your settings, potentially starting and stopping based on
> >> your actions.
> >> Pending Request Queue: Where new requests wait when there are no
> >> Instances available to serve the request.
> >> Pending Latency: The amount of time a request has been waiting in the
> >> Pending Request Queue.
> >> Scheduler: Part of the App Engine infrastructure that determines
> >> which Instance should serve a request, including whether or not a new
> >> Instance is needed.
>
> >> Serving Infrastructure
>
> >> Q: What's an Instance?
> >> A: When App Engine starts running your code, it creates a small
> >> virtual environment to run your code with a reserved amount of CPU
> >> and Memory. For example, if you are running a Java app, we will start
> >> a new JVM for you and load your code into it.
>
> >> Q: Is an App Engine Instance similar to a VM from infrastructure
> >> providers?
> >> A: Yes and no. They both have a set amount of CPU and Memory
> >> allocated to them, but GAE Instances don't have the overhead of
> >> operating systems or other applications running, so a much larger
> >> percentage of the CPU and memory is considered "usable." They also
> >> operate against high-level APIs and not down through layers of code
> >> to virtual device drivers, so it's more efficient, and allows all the
> >> services to be fully managed.
>
> >> Q: How does GAE determine the number of Frontend Instances to run?
> >> A: For each new request, the Scheduler decides whether there is an
> >> available Instance for the request, the request should wait, or a new
> >> Instance should be created to service the request.
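As I understand it, that three-way decision (serve, queue, or spin up) amounts to something like the toy rule below. The threshold, the function name, and the way Predicted Pending Latency is estimated are my own guesses for illustration only, not App Engine's actual algorithm:

```python
# Toy sketch of the scheduler's three-way choice for an incoming request.
# All names, the latency estimate, and the 500 ms threshold are guessed
# for illustration; the real scheduler is internal to App Engine.

def route_request(idle_instances, queued_requests,
                  avg_latency_ms, num_instances,
                  max_pending_latency_ms=500):
    if idle_instances > 0:
        return "serve on an existing instance"
    # Predicted Pending Latency: roughly how long the queue should take to
    # drain given current throughput (one request per avg_latency_ms per
    # instance, for single-threaded instances).
    predicted_pending_ms = ((queued_requests + 1) * avg_latency_ms
                            / max(num_instances, 1))
    if predicted_pending_ms <= max_pending_latency_ms:
        return "wait in the pending request queue"
    return "spin up a new instance"
```

For example, with no idle instances, 20 queued requests, 100 ms average latency, and 2 instances, the predicted wait is over a second, so this sketch spins up a new instance.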
> >> It looks at the number of Instances, the throughput of the
> >> Instances, and the number of requests waiting. Based on that, it
> >> predicts how long it will take before it can serve the request (aka
> >> the Predicted Pending Latency). If the Predicted Pending Latency
> >> looks too long, a new Instance may be created. If it looks like an
> >> Instance is no longer needed, the Scheduler will take that Instance
> >> down.
>
> >> Q: Should I assume I will be charged for the number of Instances
> >> currently being shown in the Admin Console?
> >> A: No. We are working to change the Scheduler to optimize the
> >> utilization of Instances, so that number should go down somewhat. If
> >> you are using Java, you can also make your app threadsafe and take
> >> advantage of handling concurrent requests. You can treat the current
> >> number of running Instances as an upper bound on how many Instances
> >> you will be charged for.
>
> >> Q: How can I control the number of Instances running?
> >> A: The Scheduler determines how many Instances should run for your
> >> application. With the new Scheduler you'll have the ability to choose
> >> a set of parameters that will help you specify how many Instances are
> >> spun up to serve your traffic. More information about the specific
> >> parameters can be found below under "What adjustments will be
> >> available for the new scheduler?"
>
> >> Q: What can I control in terms of how many requests an Instance can
> >> handle?
> >> A: The single largest factor is your application's latency in
> >> handling requests. If you service requests quickly, a single Instance
> >> can handle a lot of requests. Also, Java apps support concurrent
> >> requests, so an Instance can handle additional requests while waiting
> >> for other requests to complete. This can significantly lower the
> >> number of Instances your app requires.
>
> >> Q: Will there be a solution for Python concurrency? Will this require
> >> any code changes?
> >> A: Python concurrency will be handled by our release of Python 2.7
> >> on App Engine. We've heard a lot of feedback from our Python users
> >> who are worried that the incentive is to move to Java because of its
> >> support for concurrent requests, so we've made a change to the new
> >> pricing to account for that. While Python 2.7 support is currently in
> >> progress, it is not yet done, so we will be providing a half-sized
> >> Instance for Python (at half the price) until Python 2.7 is released.
> >> See "What code changes will I need to make in order to use Python
> >> 2.7?" below for more information.
>
> >> Q: How many requests can an average Instance handle?
> >> A: Single-threaded Instances (Python or Java) can currently handle 1
> >> concurrent request. Therefore there is a direct relationship between
> >> the latency and the number of requests which can be handled on the
> >> Instance per second. For instance: 10 ms latency = 100
> >> requests/second/Instance, 100 ms latency = 10
> >> requests/second/Instance, etc. Multi-threaded Instances can handle
> >> many concurrent requests. Therefore there is a direct relationship
> >> between the CPU consumed and the number of requests/second. For
> >> instance, for a B4 backend Instance (approx 2.4 GHz): consuming 10
> >> Mcycles/request = 240 requests/second/Instance, 100 Mcycles/request =
> >> 24 requests/second/Instance, etc. These numbers are the ideal case,
> >> but they are pretty close to what you should be able to accomplish on
> >> an Instance. Multi-threaded Instances are currently only supported
> >> for Java; we are planning support for Python later this year.
>
> >> Q: Why is Google charging for Instances rather than CPU as in the
> >> old model? Were customers asking for this?
> >> A: CPU time only accounts for a portion of the resources used by App
> >> Engine.
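The requests-per-second figures quoted above are straightforward arithmetic; here is a quick sanity check. The 2.4 GHz B4 clock figure and the latency/Mcycle numbers come from the FAQ; the function names are mine:

```python
# Requests/second for a single-threaded instance: each request occupies the
# instance for its full latency, so throughput is 1000 ms / latency_ms.
def single_threaded_rps(latency_ms):
    return 1000.0 / latency_ms

# Requests/second for a CPU-bound multi-threaded instance: the instance has
# clock_mhz million cycles available per second, divided among requests.
def multi_threaded_rps(mcycles_per_request, clock_mhz=2400):  # B4 ~ 2.4 GHz
    return clock_mhz / mcycles_per_request

print(single_threaded_rps(10))   # -> 100.0  (10 ms latency)
print(single_threaded_rps(100))  # -> 10.0   (100 ms latency)
print(multi_threaded_rps(10))    # -> 240.0  (10 Mcycles/request on a B4)
print(multi_threaded_rps(100))   # -> 24.0   (100 Mcycles/request on a B4)
```

These match the FAQ's ideal-case numbers exactly, which is the point: for single-threaded instances only latency matters, and for multi-threaded ones only CPU per request does.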
> >> When App Engine runs your code it creates an Instance; this is a
> >> maximum amount of CPU and Memory that can be used for running a set
> >> of your code. Even if the CPU is not currently working because it is
> >> waiting for responses, the Instance is still resident and considered
> >> "in use," so, essentially, it still costs Google money. Under the
> >> current model, apps that have high latency (or in other words, stay
> >> resident for long periods of time without doing anything) are not
> >> able to scale because it would be cost-prohibitive to Google. So this
> >> change is designed to allow developers to run any sort of application
> >> they would like, but pay for all of the resources that are being
> >> used.
>
> >> Q: What does this mean for existing customers?
> >> A: Many customers have optimized for low CPU usage to keep bills low,
> >> but in turn are often using a large amount of memory (by having
> >> high-latency applications). The new model will encourage low-latency
> >> applications, even if that means using larger amounts of CPU.
>
> >> Q: How will Always On work under the new model?
> >> A: When App Engine leaves Preview, all Paid Apps and Apps in Premier
> >> Accounts will be able to set the number of idle Instances they would
> >> like to have running. Always On was designed to allow an app to
> >> always have idle Instances running, to save on Instance start-up
> >> latency. For many apps a single idle Instance should be enough
> >> (especially when using concurrent requests). This means that for many
> >> customers, setting an app to be paid will mean a $9/month minimum
> >> spend; you can then use the 24 free instance-hours (IH)/day to keep
> >> an Instance running all the time by setting Min Idle Instances to 1.
>
> >> Q: What adjustments will be available for the new scheduler?
> >> A: There will be 4 "knobs" provided in the new scheduler which will
> >> allow for adjustment of performance vs.
> >> cost:
> >> - Min Idle Instances: This determines how many idle Instances will
> >> be left running all the time, in order to ensure Instances are ready
> >> to go when there is a need based on the traffic. NOTE: This option is
> >> only available to Paid Apps and Apps for Premier Accounts.
> >> - Max Idle Instances: This determines the maximum number of idle
> >> Instances the scheduler will leave running to be ready for traffic.
> >> Lowering this value can save money, but if traffic is spiky it could
> >> mean repeated start-up times and costs.
> >> - Min Pending Latency: This is the...

--
You received this message because you are subscribed to the Google Groups
"Google App Engine" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to
[email protected].
For more options, visit this group at
http://groups.google.com/group/google-appengine?hl=en.
