Hi Eric,
Please see my responses inline. On an unrelated note, thanks for
the pointer to the GPU spec
(https://review.openstack.org/#/c/579359/10/doc/source/specs/rocky/device-passthrough.rst).
I will review that.
On 7/31/2018 10:42 AM, Eric Fried wrote:
> Sundar-
>
>> * Cyborg drivers deal with device-specific aspects, including
>>   discovery/enumeration of devices and handling the Device Half of the
>>   attach (preparing devices/accelerators for attach to an instance,
>>   post-attach cleanup (if any) after successful attach, releasing
>>   device/accelerator resources on instance termination or failed
>>   attach, etc.)
>> * os-acc plugins deal with hypervisor/system/architecture-specific
>>   aspects, including handling the Instance Half of the attach (e.g.
>>   for libvirt with PCI, preparing the XML snippet to be included in
>>   the domain XML).
> This sounds well and good, but discovery/enumeration will also be
> hypervisor/system/architecture-specific. So...
Fair enough. We had discussed that too. The Cyborg drivers can also
invoke platform-specific interfaces, such as REST APIs for Power.
>> Thus, the drivers and plugins are expected to be complementary. For
>> example, for 2 devices of types T1 and T2, there shall be 2 separate
>> Cyborg drivers. Further, we would have separate plugins for, say,
>> x86+KVM systems and Power systems. We could then have four different
>> deployments -- T1 on x86+KVM, T2 on x86+KVM, T1 on Power, T2 on Power --
>> by suitable combinations of the drivers and plugins.
> ...the discovery/enumeration code for T1 on x86+KVM (lsdev? lspci?
> walking the /dev file system?) will be totally different from the
> discovery/enumeration code for T1 on Power
> (pypowervm.wrappers.ManagedSystem.get(adapter)).
> I don't mind saying "drivers do the device side; plugins do the instance
> side", but I don't see getting around the fact that both "sides" will
> need to have platform-specific code.
Agreed. So, we could say:
- The plugins do the instance half. They are hypervisor-specific and
platform-specific. (The term 'platform' subsumes both the architecture
(Power, x86) and the server/system type.) They are invoked by os-acc.
- The drivers do the device half, device discovery/enumeration and
anything not explicitly assigned to plugins. They contain
device-specific and platform-specific code. They are invoked by the
Cyborg agent and os-acc.
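To make the split concrete, here is a minimal sketch of how a driver (Device Half) and a plugin (Instance Half) could compose. All class and method names, and the data passed between them, are hypothetical illustrations -- not the actual Cyborg or os-acc APIs, which are still being defined:

```python
# Illustrative sketch only: class/method names and payloads here are
# hypothetical, not the actual Cyborg or os-acc interfaces.

class T1DeviceDriver:
    """Cyborg driver: the Device Half. Device- and platform-specific."""

    def discover(self):
        # On x86+KVM this might walk /dev or parse lspci output; on
        # Power it might call the PowerVM REST API via pypowervm.
        return [{"type": "T1", "address": "0000:3b:00.0"}]

    def prepare_for_attach(self, device, instance_uuid):
        # Program/reset the accelerator so it is ready for the instance.
        return {"device": device, "instance": instance_uuid, "ready": True}


class KvmX86Plugin:
    """os-acc plugin: the Instance Half. Hypervisor/platform-specific."""

    def attach(self, prepared):
        # For libvirt with PCI, produce the XML snippet to be included
        # in the domain XML (simplified here).
        addr = prepared["device"]["address"]
        return ("<hostdev mode='subsystem' type='pci'>"
                "<source><address>%s</address></source></hostdev>" % addr)


# The four deployments (T1/T2 on x86+KVM/Power) fall out of pairing one
# driver with one plugin:
driver, plugin = T1DeviceDriver(), KvmX86Plugin()
dev = driver.discover()[0]
prepared = driver.prepare_for_attach(dev, "inst-1234")
snippet = plugin.attach(prepared)
```

The point of the sketch is only the division of labor: the driver never sees hypervisor details, and the plugin never touches device programming.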
Are you ok with the workflow at the link below?
https://docs.google.com/drawings/d/1cX06edia_Pr7P5nOB08VsSMsgznyrz4Yy2u8nb596sU/edit?usp=sharing
>> One secondary detail to note is that Nova compute calls os-acc per
>> instance for all accelerators for that instance, not once for each
>> accelerator.
> You mean for getVAN()?
Yes -- BTW, I renamed it to prepareVANs() (or prepareVAN()), because it
is not just a query, as the name getVAN implies, but has side effects.
> Because AFAIK, os_vif.plug(list_of_vif_objects,
> InstanceInfo) is *not* how nova uses os-vif for plugging.
Yes, os-acc will invoke plug() once per VAN. IIUC, Nova calls
Neutron once per instance for all networks, as seen in this code
sequence in nova/nova/compute/manager.py:
_build_and_run_instance() --> _build_resources() -->
_build_networks_for_instance() --> _allocate_network()
The _allocate_network() actually takes a list of requested_networks, and
handles all networks for an instance [1].
Chasing this further down:
_allocate_network --> _allocate_network_async()
--> self.network_api.allocate_for_instance()
== nova/network/rpcapi.py::allocate_for_instance()
So, even the RPC out of Nova seems to take a list of networks [2].
[1]
https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L1529
[2]
https://github.com/openstack/nova/blob/master/nova/network/rpcapi.py#L163
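By analogy with that once-per-instance pattern, the os-acc entry point could take the full list of requested accelerators and loop plug() internally. This is only a sketch under assumptions: prepareVANs() and plug() are the names discussed above, but their exact signatures and payloads are not settled:

```python
# Sketch only: the prepareVANs()/plug() signatures below are assumed,
# modeled on Nova's once-per-instance _allocate_network() pattern,
# where a single call receives the whole list of requested_networks.

def plug(van, instance_info):
    # Hypervisor-specific attach of a single VAN (virtual accelerator).
    return {"van": van, "instance": instance_info["uuid"], "plugged": True}

def prepare_vans(instance_info, requested_accelerators):
    # Called once per instance with ALL requested accelerators;
    # plug() is then invoked once per VAN.
    return [plug(van, instance_info) for van in requested_accelerators]

results = prepare_vans({"uuid": "inst-1234"}, ["van-a", "van-b"])
```

The design choice mirrors the network path: the per-instance call lets os-acc batch or order the attaches, while plug() stays a simple per-VAN operation.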
> Thanks,
> Eric
Regards,
Sundar
__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: [email protected]?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev