Hello Eitan,

[QUOTE]Transaction based DPIF primitives are mapped into synchronous device I/O control system calls. The NL reply would be returned in the output buffer of the IOCTL parameter.[/QUOTE]

I am still confused. In the design file you spoke about "nl_sock_transact_multiple", which could be implemented via ReadFile and WriteFile, and you also said that "these DPIF commands are mapped to nl_sock_transact NL interface to nl_sock_transact_multiple." Do you mean we will no longer use nl_sock_transact_multiple in userspace for these DPIF transactions?
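Just so we are talking about the same mapping, here is a minimal sketch of what I understand a single transaction would look like on the userspace side. OVS_IOCTL_TRANSACT, the flat request/reply buffers, and the simplified signature are placeholders of mine, not something taken from your design:

#include <windows.h>
#include <winioctl.h>

/* Hypothetical control code; the real one would come from the
 * driver's public header. */
#define OVS_IOCTL_TRANSACT \
    CTL_CODE(FILE_DEVICE_NETWORK, 0x801, METHOD_BUFFERED, FILE_ANY_ACCESS)

/* One self-contained NL transaction: the NL request goes down in the
 * input buffer and the NL reply comes back in the output buffer of
 * the same synchronous DeviceIoControl() call. */
static int
nl_transact(HANDLE dev, const void *request, DWORD request_len,
            void *reply, DWORD reply_len, DWORD *reply_used)
{
    if (!DeviceIoControl(dev, OVS_IOCTL_TRANSACT,
                         (LPVOID) request, request_len,
                         reply, reply_len, reply_used, NULL)) {
        return (int) GetLastError();
    }
    return 0;
}

If that is the model, I would expect nl_sock_transact_multiple() to become a loop of such calls (or one IOCTL carrying a batch of NL messages) rather than WriteFile/ReadFile pairs. Is that right?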
[QUOTE]>You mean, whenever, say, a Flow dump request is issued, in one reply to give back all flows?
Not necessarily. I meant that the driver does not have to maintain the state of the dump command. Each dump command sent down to the driver would be self-contained.[/QUOTE]

We currently have this in our implementation. The only thing 'left' would be the fact that we provide the entire output buffer for the dump at once, and userspace can read sequentially from it. Unless there is a reason for the kernel to write sequentially to userspace and wait for userspace to read, I think the way we have this one is ok.

[QUOTE]Yes, these are OVS events that are placed in a custom queue. There is a single Operating System event associated with the global socket which collects all OVS events. It will be triggered through a completion of a pending I/O request in the driver.[/QUOTE]

I used to be a bit confused by your implementation in OvsEvent and OvsUser; perhaps this discussion will clarify things a bit more. :) Ok, so we'll hold OVERLAPPED structs in the kernel, as events. What kind of IRP requests would be returned as "pending" in the kernel? Requests coming in as "nl_sock_recv()" on the multicast groups? Will multiple multicast groups be used, or will all multicast operations queue events onto the same event queue, with all events read from the same part of the code in userspace? How exactly are events queued by the kernel associated with userspace? I mean, how do you register a "nic connected" event so that when the event happens, you know you need to update userspace data for a nic and not do something else. Would there be IDs stored in the OvsEvent structs that specify what kind of events they are? Would we also need context data associated with these events? (The first sketch below illustrates what I mean.)

[QUOTE]>However, I think we need to take into account the situation where the userspace might be providing a smaller buffer than the total to read. Also, I think the "dump" mechanism requires it.
I (want to) assume that each transaction is self-contained, which means that the driver should not maintain a state of the transaction. Since we will be using an IOCTL for that transaction, the user mode buffer length will be specified in the command itself. All Write/Read dump pairs are replaced with a single IOCTL call.[/QUOTE]

That still did not answer my question :) Do you mean to use a very large read buffer, so that you would be able to read everything in one single operation? I am more concerned here about flow dumps, because you may not know in advance whether you need a 1024-byte buffer, a 10240-byte buffer, a 102400-byte buffer, etc. So I do not see how a DeviceIoControl operation could do both the 'write' and the 'read' part of the dump. If you pass DeviceIoControl a buffer length = 8000 and the flow dump reply is 32000 bytes, you need to do additional reads AND maintain state in the kernel (e.g. an offset into the kernel read buffer). (The second sketch below shows the only alternative I can see.)

[QUOTE]As I understand, transactions and dumps (as used for DPIF) are not really socket operations per se.[/QUOTE]

They are file / device operations.
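To make my event questions concrete, this is the kind of structure I am imagining. The type names and fields below are made up by me for illustration; they are not taken from the OvsEvent code:

#include <ntddk.h>

/* Illustrative event type IDs: how userspace could tell a "nic
 * connected" notification apart from anything else. */
typedef enum _OVS_EVENT_TYPE {
    OVS_EVENT_NIC_CONNECT,
    OVS_EVENT_NIC_DISCONNECT,
    OVS_EVENT_LINK_UP,
    OVS_EVENT_LINK_DOWN,
} OVS_EVENT_TYPE;

/* One entry in the kernel's custom event queue: 'type' says what
 * happened, 'portNo' is the per-event context saying which NIC/port
 * it happened to, and the whole entry is what a pending I/O request
 * would complete with. */
typedef struct _OVS_EVENT_ENTRY {
    LIST_ENTRY link;        /* linkage in the global event queue */
    OVS_EVENT_TYPE type;    /* event ID, as asked above */
    UINT32 portNo;          /* per-event context */
} OVS_EVENT_ENTRY;

Is that roughly the shape of it, or is the association with userspace done differently?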
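And to make the buffer-size concern concrete: without kernel-side dump state, the only self-contained option I see is for userspace to retry the whole dump with a bigger buffer, along these lines. OVS_IOCTL_FLOW_DUMP and the ERROR_MORE_DATA convention are my assumptions, not the actual interface:

#include <windows.h>
#include <winioctl.h>
#include <stdlib.h>

#define OVS_IOCTL_FLOW_DUMP \
    CTL_CODE(FILE_DEVICE_NETWORK, 0x802, METHOD_BUFFERED, FILE_ANY_ACCESS)

/* Retry the whole dump with a bigger buffer until the reply fits,
 * assuming the driver fails the IOCTL with ERROR_MORE_DATA when the
 * output buffer is too small. The caller frees the returned buffer. */
static void *
dump_flows(HANDLE dev, const void *req, DWORD req_len, DWORD *reply_len)
{
    DWORD size = 8000;                  /* initial guess */
    void *buf = malloc(size);

    while (buf) {
        if (DeviceIoControl(dev, OVS_IOCTL_FLOW_DUMP,
                            (LPVOID) req, req_len,
                            buf, size, reply_len, NULL)) {
            return buf;                 /* the whole dump fit */
        }
        if (GetLastError() != ERROR_MORE_DATA) {
            break;                      /* genuine failure */
        }
        free(buf);                      /* too small: double and retry */
        size *= 2;
        buf = malloc(size);
    }
    free(buf);
    return NULL;
}

That can work, but re-running the entire dump on every resize seems wasteful for large flow tables, which is why a kernel-held offset (i.e. state) still looks necessary to me.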
[QUOTE]o) I believe we shouldn't use the netlink overhead (nlmsghdr, genlmsghdr, attributes) when not needed (say, when registering a KEVENT notification), and, if we choose not to always use the netlink protocol, we may need a way to differentiate between netlink and non-netlink requests.
Possible, as a phase for optimization.[/QUOTE]

Not necessarily: if we can make a clear separation in code between netlink and non-netlink kernel-user communication, not using netlink where we don't need it might save us some development & maintainability effort, both in kernel and in userspace. Otherwise we'd need to turn the non-netlink messages of the (Windows) userspace code into netlink messages.

Sam

________________________________________
From: Eitan Eliahu [elia...@vmware.com]
Sent: Thursday, August 07, 2014 8:57 PM
To: Alin Serdean; dev@openvswitch.org; Rajiv Krishnamurthy; Ben Pfaff; Kaushik Guha; Ben Pfaff; Justin Pettit; Nithin Raju; Ankur Sharma; Samuel Ghinet; Linda Sun; Keith Amidon
Subject: RE: Design notes for provisioning Netlink interface from the OVS Windows driver (Switch extension)

Hi Alin,
Yes, we want to exercise the interface while OVS is running. For example, we would like to dump the flow table when it is not empty. On the other issue (an NBL with multiple NBs, GitHub issue #10), I think we need to talk about how to support it. After you came across this issue, we now even know how to reproduce this case :-)
Thanks,
Eitan

-----Original Message-----
From: Alin Serdean [mailto:aserd...@cloudbasesolutions.com]
Sent: Thursday, August 07, 2014 10:50 AM
To: Eitan Eliahu; dev@openvswitch.org; Rajiv Krishnamurthy; Ben Pfaff; Kaushik Guha; Ben Pfaff; Justin Pettit; Nithin Raju; Ankur Sharma; Samuel Ghinet; Linda Sun; Keith Amidon
Subject: RE: Design notes for provisioning Netlink interface from the OVS Windows driver (Switch extension)

Hi Eitan,
Do you have any particular reason to support both devices at the start, instead of focusing on the Netlink interface? The patches are progressing a bit slower than expected; I spent a bit too much time on https://github.com/openvswitch/ovs-issues/issues/10 but I may have an idea which I would like to talk about in the next meeting. I plan to work over the weekend though, so we can be one step closer to our goal :).
Alin.

-----Original Message-----
From: Eitan Eliahu [mailto:elia...@vmware.com]
Sent: Thursday, August 7, 2014 3:19 AM
To: Alin Serdean; dev@openvswitch.org; Rajiv Krishnamurthy; Ben Pfaff; Kaushik Guha; Ben Pfaff; Justin Pettit; Nithin Raju; Ankur Sharma; Samuel Ghinet; Linda Sun; Keith Amidon
Subject: RE: Design notes for provisioning Netlink interface from the OVS Windows driver (Switch extension)

Hi Alin,
The driver which is currently checked in (the original one) supports the DPIF interface through a device object registered with the system. This driver works with a private version of user mode OVS (i.e. dpif-windows.c). The secondary device would be a second device object which supports the Netlink interface. For the initial development phase, both devices will be instantiated and registered in the system. Thus, we can bring up all transaction and dump based DPIF commands over the Netlink device while the system is up and running. For clarity, let's call the "original device" the "DPIF device" and the "secondary device" the "Netlink device".
Eitan

-----Original Message-----
From: Alin Serdean [mailto:aserd...@cloudbasesolutions.com]
Sent: Wednesday, August 06, 2014 4:28 PM
To: Eitan Eliahu; dev@openvswitch.org; Rajiv Krishnamurthy; Ben Pfaff; Kaushik Guha; Ben Pfaff; Justin Pettit; Nithin Raju; Ankur Sharma; Samuel Ghinet; Linda Sun; Keith Amidon
Subject: RE: Design notes for provisioning Netlink interface from the OVS Windows driver (Switch extension)

Hi Eitan,

> C. Implementation work flow:
> The driver creates a device object which provides a NetLink interface
> for user mode processes. During the development phase this device is created
> in addition to the existing DPIF device. (This means that the bring-up of the
> NL based user mode can be done on a live kernel with resident DPs, ports and
> flows.) All transaction and dump based DPIF functions could be developed and
> brought up while the NL device is a secondary device (ovs-dpctl show and dump
> XXX should work). After the initial phase is completed (i.e. all transaction
> and dump based DPIF primitives are implemented), the original device interface
> will be removed and the packet and event propagation path will be brought up
> (driven by vswitchd.exe).

Could you please explain a bit more what original/secondary device means? Ty!
Alin.
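As an illustration of the two-device setup Eitan describes above, registering a secondary control device ("Netlink device") from the NDIS filter driver could look roughly like this. The device names, dispatch routines, and details are illustrative, not the checked-in code:

#include <ndis.h>

/* Dispatch routines assumed to be implemented elsewhere. */
DRIVER_DISPATCH OvsNlCreate, OvsNlClose, OvsNlDeviceControl;

static PDEVICE_OBJECT netlinkDeviceObject;
static NDIS_HANDLE netlinkDeviceHandle;

/* Register a second device object alongside the existing DPIF device,
 * using the filter driver handle obtained from NdisFRegisterFilterDriver. */
NDIS_STATUS
CreateNetlinkDevice(NDIS_HANDLE filterDriverHandle)
{
    NDIS_DEVICE_OBJECT_ATTRIBUTES attrs;
    PDRIVER_DISPATCH dispatch[IRP_MJ_MAXIMUM_FUNCTION + 1] = { 0 };
    NDIS_STRING devName = RTL_CONSTANT_STRING(L"\\Device\\OvsNetlink");
    NDIS_STRING symName = RTL_CONSTANT_STRING(L"\\DosDevices\\OvsNetlink");

    dispatch[IRP_MJ_CREATE]         = OvsNlCreate;
    dispatch[IRP_MJ_CLOSE]          = OvsNlClose;
    dispatch[IRP_MJ_DEVICE_CONTROL] = OvsNlDeviceControl;

    NdisZeroMemory(&attrs, sizeof attrs);
    attrs.Header.Type = NDIS_OBJECT_TYPE_DEVICE_OBJECT_ATTRIBUTES;
    attrs.Header.Revision = NDIS_DEVICE_OBJECT_ATTRIBUTES_REVISION_1;
    attrs.Header.Size = NDIS_SIZEOF_DEVICE_OBJECT_ATTRIBUTES_REVISION_1;
    attrs.DeviceName = &devName;
    attrs.SymbolicName = &symName;
    attrs.MajorFunctions = dispatch;

    return NdisRegisterDeviceEx(filterDriverHandle, &attrs,
                                &netlinkDeviceObject,
                                &netlinkDeviceHandle);
}

With both device objects registered, the existing DPIF path keeps working while transaction and dump based commands are brought up over the new device, as the design note describes.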