Hi Salvatore, great email. These are exactly the kind of practical issues I want to make sure we have time to discuss in detail at the summit. Comments inline.
dan

On Mon, Sep 19, 2011 at 5:20 AM, Salvatore Orlando <salvatore.orla...@eu.citrix.com> wrote:

> Hi all,
>
> Before submitting a formal proposal for Quantum API v1.1, I think it might
> be worth discussing it with all the parties involved.
>
> In an earlier meeting we agreed (please correct me if I’m wrong) that we
> aimed at delivering a production-ready version of Quantum in the Essex
> release cycle.

I definitely know of people planning on pre-production + production deployments well in advance of the final Essex release, so I definitely agree. I see having large-scale production deployments as a critical validation of our platform (not to mention an important step in becoming a full-fledged OpenStack project).

> Of course, in the case of Quantum this implies that we need both a
> production-ready service interface and at least one production-ready
> plugin.
>
> In this thread I would like to focus on the improvements we need to make to
> the Quantum API to make it production ready. I also think AuthN/AuthZ
> deserves a separate discussion thread, so I’m not going to discuss them
> here.
>
> In my opinion, in order to achieve a production-ready API we need to:
>
> 1. Ensure the Quantum service interface satisfies reliability,
>    efficiency and scalability requirements
>
> Defining these requirements is probably the most important part of the job
> here. However, IMHO, the following might represent some general guidelines
> to achieve these goals:
>
> - Set up a CI system with “real” plugins in order to ensure the API and
>   the plugin interface are bug free;

Yup, I think Carl is organizing a session on this. I also believe some folks at Rackspace are planning on setting up "smoketest" environments, and we'd definitely like to be a part of that.

> - Implement HA for Quantum;

Definitely a good topic to discuss.
Are you talking about being able to run multiple simultaneous instances of a Quantum service on different hosts for redundancy, or simply making sure the Quantum service is restarted if it ever dies? So far, I've been thinking of the former as more of a plugin-specific task, since the plugin owns the data model. For the latter, my rough thinking has been that the packaging can leverage the same approach that is used for high availability with other Nova services, likely something like monit.

> - Profile Quantum API calls in order to understand which ones might be
>   made more efficient either by improving the code, adding mechanisms
>   such as caches, or re-architecting them;
>
> - Perform stress testing aimed at identifying scalability bottlenecks
>   in the Quantum API

This will definitely be valuable. I would add that the stress testing would ideally include not just Quantum, but also a Nova + Quantum setup to measure the impact of the Nova <-> Quantum communication channel on the overall system performance + scalability. I have heard concerns about the additional communication overhead caused by Quantum + Melange being separate services, so it will be useful to quantify this overhead and see where improvements are needed. Batching is one suggestion I've heard as well.

> 2. Improve the Quantum API in a way that makes it easier for client
>    applications to consume it
>
> Putting this another way, we could say that we want to bring the Quantum API
> on a par with the OpenStack API, possibly in the following way:
>
> - Provide the capability of specifying filters on API requests
> - Support paginated collections on responses
> - Provide ATOM links to resources on response
>   - Also make them permanent and version-independent
> - Add a Rate Limiting middleware layer

All very valuable.
I also want to make sure we focus on a couple of other API-related areas:

- Finalizing how we authenticate attachments at the logical layer. Until we do this, it is not possible to have a tenant drive the entire network provisioning process.
- Improving the Nova API with respect to VIF creation.
- Discussing how existing network-related functionality in Nova (DHCP, floating IPs, L3 gateway, metadata server, security groups) interacts with Quantum (this is a pretty big discussion itself).

> What are your opinions? Do you think it would make sense to focus on
> reliability, efficiency, and scalability as well as ensuring the Quantum API
> can be easily consumed by client applications?

I think these things will be key. While we've made a ton of progress in 6 months starting from scratch, we still have a lot of work to do to make Quantum a reliable, scalable, and supportable system for production environments.

> Also, the other topic I reckon we need to discuss is whether we want to
> extend the core of the Quantum API. This will take us back to the
> discussion of core vs. extensions that we had before the Diablo design
> summit.
>
> For Diablo, we agreed to start with the smallest possible core and then
> start to “expand it”. That decision made complete sense for Diablo; we
> should now decide whether we want to expand it or keep it small for the
> Essex release cycle.
>
> As usual, there are pros and cons to be weighed. Here’s my first attempt to
> put them on a scale:
>
> In Favour of extending core:
>
> - If Essex will be “production-ready” we might prefer to have a larger
>   core with some non-implemented API rather than a smaller core which
>   would be extended for F, G, and so on.
>   A small core with a high number of extensions will probably not be
>   ideal for production-ready releases, as most clients will need to
>   explicitly use plugin-specific extensions.
>
> - Some of the currently implemented extensions, such as QoS, and in some
>   form the port-profile as well, could easily be integrated into the core
>   API, as they are general enough; it is therefore possible to assume
>   they are something most plugins will implement.
>
> - The APIs for bridging Quantum networks with other Quantum networks or
>   with networks running outside of Quantum, initially proposed for
>   Diablo, were dropped. It might be worth reconsidering them for Essex,
>   given the fact that they are representative of a very important use case.
>
> Against extending core:
>
> - Before adding a feature into the core API, the feature itself and the
>   APIs for using it should be thoroughly reviewed and agreed by the
>   community. A feature should therefore first spend some time “incubating”
>   as an extension before being “approved” for core.
>
> - We don’t want the API to bloat with many new features for a release
>   which is supposed to be stable. The smaller the core API, the easier
>   stabilisation will be.

I think this is a good list of pros/cons. This is a big part of what I wanted to discuss in the "phase 2" section. I think we're generally in agreement. To perhaps flush out any discrepancies, here are some of my high-level thoughts:

Two things I REALLY want to avoid are:

- Quantum API discussions becoming a "standards body" where people spend more time haggling over standards than working on code.
- A world where proposing an API first means that this abstraction should be the default for a "standard" Quantum API. This risks encouraging people to spend all their cycles on defining APIs rather than a healthy balance of designing APIs and building a high-quality system.
I believe the merit of an API extension will become obvious based on the pervasiveness of its use in production-quality environments. Probably the most successful cloud networking API abstraction to date, "security groups", was not designed by a standards body. It was proposed by one party, viewed as generally useful by users, and then implemented by other platforms that wanted to provide similar value.

During the 6-month Essex period, I strongly believe that most (if not all) successful production deployments of Quantum will be the result of a cloud provider choosing to deploy OpenStack + Quantum with a particular plugin and a strong understanding of what is or is not possible with that plugin beyond the core L2 API. This is much the same way that, while in theory Nova lets you swap different hypervisor backends, most folks I know chose one hypervisor, understood its operation, and decided to stick with that for the time being.

I guess that's just a long way of saying that while creating a widely used set of cloud network abstractions is the long-term goal for Quantum, I don't feel we need to rush to "standardize" every higher-level API. My main short-term goal is making sure we have the mechanisms in place such that no one's offering is limited by the core L2 API. I'd strongly prefer that discussions about new "core" APIs (I'm not even sure I like that term anymore... but that's a discussion for the summit) are informed by real experience implementing and using those capabilities (and any alternative proposals) as API extensions.

Another wrinkle is that I expect several new parties to arrive on the Quantum scene during the Essex time period. Allowing them to contribute is another reason for not rushing to lock down APIs.

Wow, that was quite a book.
I'm sure we'll have no shortage of things to discuss at the summit :)

Dan

> Regards,
>
> Salvatore
>
> --
> Mailing list: https://launchpad.net/~netstack
> Post to     : netstack@lists.launchpad.net
> Unsubscribe : https://launchpad.net/~netstack
> More help   : https://help.launchpad.net/ListHelp

--
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Dan Wendlandt
Nicira Networks, Inc.
www.nicira.com | www.openvswitch.org
Sr. Product Manager
cell: 650-906-2650
~~~~~~~~~~~~~~~~~~~~~~~~~~~