I agree we are definitely on the same page.
Please see my reply to your comments below.

Salvatore

From: Dan Wendlandt [mailto:d...@nicira.com]
Sent: 05 August 2011 16:14
To: Salvatore Orlando
Cc: netstack@lists.launchpad.net
Subject: Re: [Netstack] Quantum API resource status proposal

Great Salvatore, I think we're largely on the same page, a few comments below.

dan
On Fri, Aug 5, 2011 at 2:34 AM, Salvatore Orlando 
<salvatore.orla...@eu.citrix.com> wrote:
Thanks for your comments Dan.
Replies inline.

Cheers,
Salvatore

From: Dan Wendlandt [mailto:d...@nicira.com]
Sent: 05 August 2011 01:25
To: Salvatore Orlando
Cc: netstack@lists.launchpad.net
Subject: Re: [Netstack] Quantum API resource status proposal

Great write-up Salvatore, comments inline.

dan
On Tue, Aug 2, 2011 at 4:34 PM, Salvatore Orlando 
<salvatore.orla...@eu.citrix.com> wrote:
Following the discussion on synchronous vs asynchronous behaviour in the last 
Netstack meeting, we agreed to send around a proposal for introducing the 
concept of "Resource Status".

Goal:

*         Ensure consistent behaviour of the API w.r.t. users, meaning that the 
API should always behave in the same way regardless of the plugin used.

Assumptions:

*         Plugins can implement any kind of behaviour: synchronous or 
asynchronous. It is also possible to have plugins for which some operations are 
synchronous and others asynchronous;

*         Both the API and the plugin interface should be kept as simple as 
possible; we do not want unnecessary complexity;

*         API users need a mechanism to know whether an operation is completed 
or not.

*         A resource can "exist" but not be available yet, as the plugin might 
asynchronously perform all the operations required to provide it.

Example (synchronous plugin)
"Create network" on the OVS plugin adds a VLAN binding to its own database 
and then returns the id and name of the newly created network.

Example (asynchronous plugin)
Just guessing here...
A plugin for 802.1qbh will probably have a more complex provisioning process, 
as it will involve configuring virtual switches, IV modules, and physical 
switches.
The "Create network" operation will create an entry in the plugin db, start 
the provisioning process, and immediately return. The network will not yet be 
available.

To further emphasize the point, even in the rather simple OVS plugin the 
operation of "attaching" an interface to a port is asynchronous.  The plugin 
creates a new port row in the database, and the agent on the hypervisor 
monitors the DB for changes and asynchronously puts the port on that VLAN.  I 
suspect that pretty much all plugins will have some amount of asynchronous 
behavior.
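
To make the asynchronous pattern above concrete, here is a minimal sketch. It 
is not the actual OVS plugin or agent code: the in-memory dict stands in for 
the plugin database, and plugin_attach, Agent and sync_once are hypothetical 
names chosen only for illustration.

```python
def plugin_attach(db, port_id, interface_id, vlan_id):
    """Plugin side: record the desired attachment and return immediately."""
    db[port_id] = {"interface_id": interface_id, "vlan_id": vlan_id}


class Agent:
    """Hypervisor side: applies database rows it has not applied yet."""

    def __init__(self, db):
        self.db = db
        self.applied = {}  # port_id -> row already configured locally

    def sync_once(self):
        """Run one polling pass and return the port ids configured in it."""
        changed = []
        for port_id, row in self.db.items():
            if self.applied.get(port_id) != row:
                # A real agent would configure the virtual switch here.
                self.applied[port_id] = dict(row)
                changed.append(port_id)
        return changed
```

Between the call to plugin_attach and the next sync_once pass, the port row 
exists but the connectivity does not, which is exactly the window a status 
attribute would have to describe.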



Proposal:

The proposal simply consists of adding an extra "status" attribute to all the 
resources managed by the Quantum API, namely: network, port, and attachment. The 
status attribute will be of an enumeration type describing the current state of 
the resource, similar to the attribute of the same name on the "Server" 
resource in the OpenStack API. The actual values for the enumeration could be 
different for each resource.

Possible Status enumeration values:

NETWORK: "PROVISIONING","AVAILABLE","ERROR".
PORT: "PROVISIONING","DOWN", "ACTIVE","ERROR".
ATTACHMENT: "PROVISIONING", "AVAILABLE", "ERROR"

Note: the "Port" resource already has a "state" attribute, which describes the 
administrative state of the port.  The proposed enumeration merges this 
attribute into the "status" attribute.
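
As a rough illustration, the enumerations listed above could be written down 
as follows. This is only a sketch: the class names, the STATUS_ENUMS mapping 
and the parse_status helper are assumptions made for the example, not part of 
the Quantum API.

```python
from enum import Enum


class NetworkStatus(Enum):
    PROVISIONING = "PROVISIONING"
    AVAILABLE = "AVAILABLE"
    ERROR = "ERROR"


class PortStatus(Enum):
    PROVISIONING = "PROVISIONING"
    DOWN = "DOWN"
    ACTIVE = "ACTIVE"
    ERROR = "ERROR"


class AttachmentStatus(Enum):
    PROVISIONING = "PROVISIONING"
    AVAILABLE = "AVAILABLE"
    ERROR = "ERROR"


# Hypothetical lookup table: resource type name -> its status enumeration.
STATUS_ENUMS = {
    "network": NetworkStatus,
    "port": PortStatus,
    "attachment": AttachmentStatus,
}


def parse_status(resource_type, value):
    """Convert a status string into the enumeration for that resource type.

    Raises ValueError if the value is not valid for the resource type.
    """
    return STATUS_ENUMS[resource_type](value)
```

Keeping one enumeration per resource lets a value such as "DOWN" be valid for 
a port while being rejected for a network.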

Adding the "status" attribute will enable users to be aware of the current 
provisioning state of the resource. Client applications can then query the 
resource (e.g.: by polling it with GET requests) and check its status. 
When the resource is available, they can perform operations on it.  A 
synchronous plugin should directly mark the resource as available when it 
returns.
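
The polling described above might look like the sketch below on the client 
side. It assumes a hypothetical get_resource callable that stands in for the 
GET request and returns a dict with a "status" key; the function name, the 
timeout and the interval are illustrative choices only.

```python
import time


def wait_until_available(get_resource, timeout=30.0, interval=1.0,
                         sleep=time.sleep):
    """Poll get_resource() until the resource reports status AVAILABLE.

    get_resource is a hypothetical stand-in for a GET on the resource.
    """
    deadline = time.monotonic() + timeout
    while True:
        resource = get_resource()
        status = resource["status"]
        if status == "AVAILABLE":
            return resource
        if status == "ERROR":
            raise RuntimeError("resource entered ERROR state")
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "resource still %s after %.1fs" % (status, timeout))
        sleep(interval)
```

With a synchronous plugin the first GET already reports AVAILABLE, so the loop 
returns without sleeping.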

I am in favor of a status field.  I want to make sure we're thinking about more 
than just provisioning though.  Provisioning is one case in which the "logical" 
connectivity described in the API has not yet been mapped to the real world, 
but there can be other cases as well.  For example, if your solution has an 
agent on the hypervisor and that agent is no longer running, how do you 
indicate the fact that this port is no longer "controlled" by Quantum?  I've 
spent a lot of time working with systems like this, and in my experience these 
types of situations are actually pretty common.  A similar case is how do we 
indicate if someone has specified an interface-id, but the system does not see 
any ports with that interface id?  An abstraction that I like for this is that 
of a "logical link status".  If this status is "up", the logical connectivity 
described via the API is the connectivity the VM interface sees.  If it is 
down, there is a mismatch.  The plugin could then return a "link_status_string" 
indicating why the status is down (e.g., "interface-id not found").

[SALVATORE]: I find your comment really interesting. In my previous experience 
I too always separated the concepts of "administrative" and "operational" 
states. I quite mixed them up in my proposal in order to keep API changes to a 
minimum, but I agree that there's probably a case for having this distinction, 
at least for the "Port" resource.

For a port I would see an administrative state as either 'ACTIVE' or 'DOWN' (as 
it is today), and then another attribute, a logical or operational state, which 
is somewhat similar to the "link" LED on a physical network switch.

Yup, this is exactly how I was thinking about it.

The operational state could be, IMHO, one of the following:

BUILD - the port is being provisioned
UPDATING - the port's configuration is being updated
ERROR - something is wrong with the port and connectivity is not guaranteed. This 
could be for one of the cases you listed. An error string returned by the 
plugin (e.g.: link_status_string) could also be returned by the API
READY_LINKDOWN - port correctly configured and working properly, no attachment 
plugged
READY_LINKUP - port correctly configured and working properly, attachment 
plugged, connectivity present.
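
A possible shape for a port carrying both states is sketched here. The Port 
class, its field names and the has_connectivity helper are hypothetical; only 
the state values and the link_status_string idea come from this thread.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class AdminState(Enum):
    ACTIVE = "ACTIVE"
    DOWN = "DOWN"


class OperationalState(Enum):
    BUILD = "BUILD"
    UPDATING = "UPDATING"
    ERROR = "ERROR"
    READY_LINKDOWN = "READY_LINKDOWN"
    READY_LINKUP = "READY_LINKUP"


@dataclass
class Port:
    port_id: str
    admin_state: AdminState
    op_state: OperationalState
    # Optional explanation supplied by the plugin, e.g. on ERROR.
    link_status_string: Optional[str] = None

    def has_connectivity(self):
        """True only if the port is enabled and its logical link is up."""
        return (self.admin_state is AdminState.ACTIVE
                and self.op_state is OperationalState.READY_LINKUP)
```

Keeping the two attributes separate means a port can be administratively 
ACTIVE while its operational state reports ERROR with an explanatory string.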

These make sense.


An alternative scheme would be having three states: ADMINISTRATIVE, 
OPERATIONAL, LOGICALLINK.

The API will reject operations on resources which are not available, unless 
they are GET operations.

As I mentioned in the IRC chat, I think this one requires some more discussion. 
 Would I be able to create a port, attach an interface to it, and configure a 
QoS policy right away, even if provisioning took a while?  This seems 
potentially cumbersome for the client.  If the client wanted to wait until the 
port was in a particular state before applying additional configuration, it 
would seem like the client code to do that would be fairly simple, but I 
suspect other clients would prefer just to fire off the configuration without 
waiting.

[SALVATORE]: Good point. Also in this case the choice is between enforcing the 
behaviour at the API layer, or leaving it to the plugin. I want to be 
consistent with the other choices we made in the past, so I'll say I want the 
decision to be left to the plugin, but the API has to be consistent wrt client 
applications. So I'd say that each operation could throw a "RESOURCENOTREADY" 
error. When a client receives this error, it means it has to poll the resource 
until its status becomes "READY" before submitting the request.

I agree that the API should be consistent across plugins, so in this case I was 
not actually advocating that we leave the decision up to the plugin so much as 
we leave it up to the API client.  For example, if the API client didn't want 
to configure a filtering policy until a port was done being provisioned, it 
could poll on the port link status until that status was 'AVAILABLE', then 
submit its port operation.  It seems to me that such polling is equivalent to 
what would happen if the port operations returned 'RESOURCENOTREADY', as the 
client would essentially have to keep retrying until the port operation did not 
return an error.
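
The retry behaviour described above can be sketched as a small client-side 
loop. ResourceNotReady is a hypothetical exception standing in for however a 
client library would surface the proposed RESOURCENOTREADY error, and 
retry_until_ready is an illustrative name.

```python
import time


class ResourceNotReady(Exception):
    """Hypothetical client-side mapping of the RESOURCENOTREADY error."""


def retry_until_ready(operation, attempts=10, interval=1.0, sleep=time.sleep):
    """Call operation() until it stops raising ResourceNotReady.

    Re-raises ResourceNotReady if the final attempt still fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except ResourceNotReady:
            if attempt == attempts - 1:
                raise
            sleep(interval)
```

From the client's point of view this does the same waiting as polling the 
status with GET; the only difference is which request carries the wait.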

Stepping back, maybe the issue is that I don't understand your underlying 
motivation for not allowing an API client to perform an operation on a 
temporarily unavailable port (as long as the API gives the client a means of 
detecting that the port is unavailable such that the client could perform this 
check itself if it likes).  I suspect that if we first get on the same page 
about that, the rest of the design decisions will fall out cleanly :)

By assuming the API would reject operations on resources which were not ready I 
was implicitly enforcing the behaviour at the API layer.
I was assuming that a plugin would not be able to perform an operation on a 
resource if that resource was not ready (ready = logical link status is 'UP').  
However, I realized this is probably a simplistic assumption as a plugin might 
have the capability of queuing requests for resources temporarily not available.

This is why in my second proposal I agreed not to block API requests anymore, but 
simply introduce the RESOURCENOTREADY error which CAN be raised by an API 
operation.
As you said, retrying the operation until you don't get a RESOURCENOTREADY and 
polling until the status of the resource is AVAILABLE are exactly the same 
thing. I don't think it would be harmful to have both mechanisms.

In this way, if the plugin accepts operations on resources not yet available, 
the client will never see the RESOURCENOTREADY error.
On the other hand, if the plugin is unable to accept this kind of operation 
(e.g.: plugging an interface on a port still being provisioned), then the 
client will receive a RESOURCENOTREADY error and react accordingly.
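
The two plugin behaviours just described could be sketched as follows. Both 
classes are hypothetical and keep their state in plain dicts; they are not 
taken from any existing Quantum plugin, and ResourceNotReady again stands in 
for the proposed RESOURCENOTREADY error.

```python
class ResourceNotReady(Exception):
    """Hypothetical mapping of the proposed RESOURCENOTREADY error."""


class QueueingPlugin:
    """Accepts operations on ports still being provisioned, applies them later."""

    def __init__(self):
        self.port_status = {}
        self.pending = {}      # port_id -> interface ids waiting to be plugged
        self.attachments = {}

    def create_port(self, port_id):
        self.port_status[port_id] = "PROVISIONING"
        self.pending[port_id] = []

    def plug_interface(self, port_id, interface_id):
        if self.port_status[port_id] == "PROVISIONING":
            self.pending[port_id].append(interface_id)  # queue, no error
        else:
            self.attachments[port_id] = interface_id

    def finish_provisioning(self, port_id):
        self.port_status[port_id] = "AVAILABLE"
        for interface_id in self.pending.pop(port_id, []):
            self.attachments[port_id] = interface_id


class StrictPlugin:
    """Refuses operations on ports that are not yet available."""

    def __init__(self):
        self.port_status = {}
        self.attachments = {}

    def create_port(self, port_id):
        self.port_status[port_id] = "PROVISIONING"

    def plug_interface(self, port_id, interface_id):
        if self.port_status[port_id] != "AVAILABLE":
            raise ResourceNotReady(port_id)
        self.attachments[port_id] = interface_id

    def finish_provisioning(self, port_id):
        self.port_status[port_id] = "AVAILABLE"
```

A client written against the strict behaviour (retry on the error) also works 
unchanged against the queueing plugin, since it simply never sees the error.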




Regards,
Salvatore




--
Mailing list: https://launchpad.net/~netstack
Post to     : netstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~netstack
More help   : https://help.launchpad.net/ListHelp



--
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Dan Wendlandt
Nicira Networks, Inc.
www.nicira.com | www.openvswitch.org
Sr. Product Manager
cell: 650-906-2650
~~~~~~~~~~~~~~~~~~~~~~~~~~~


