Hi sorry to jump back to this mail as I know the v9 patches are in flight. I was on vacation for the last few days but my responses are inline. I'll try to look at the revised patches also but I think you have addressed most of my concerns.
> -----Original Message----- > From: Aaron Conole [mailto:acon...@redhat.com] > Sent: Thursday, February 11, 2016 6:13 PM > To: Mooney, Sean K <sean.k.moo...@intel.com> > Cc: Traynor, Kevin <kevin.tray...@intel.com>; dev@openvswitch.org; > Flavio Leitner <f...@sysclose.org> > Subject: Re: [ovs-dev] [PATCH v8 0/5] Convert DPDK configuration from > command line to DB based > > Hi Sean, > > "Mooney, Sean K" <sean.k.moo...@intel.com> writes: > > > Overall I like this change but as a user of ovs with dpdk and A > > maintainer of a plugin to deploy it I have some comments inline. > > AWESOME! Seriously, I will probably be bothering you a lot because you > are the person I want to hear from the most on this patchset. :) > > >> -----Original Message----- > >> From: Aaron Conole [mailto:acon...@bytheb.org] > >> Sent: Wednesday, February 10, 2016 1:23 PM > >> To: Traynor, Kevin > >> Cc: dev@openvswitch.org; Flavio Leitner; Mooney, Sean K > >> Subject: Re: [ovs-dev] [PATCH v8 0/5] Convert DPDK configuration from > >> command line to DB based > >> > >> "Traynor, Kevin" <kevin.tray...@intel.com> writes: > >> >> -----Original Message----- > >> >> From: Aaron Conole [mailto:acon...@redhat.com] > >> >> Sent: Friday, January 29, 2016 5:57 PM > >> >> To: dev@openvswitch.org > >> >> Cc: Flavio Leitner <f...@sysclose.org>; Panu Matilainen > >> >> <pmati...@redhat.com>; Traynor, Kevin <kevin.tray...@intel.com>; > >> >> Zoltan Kiss <zoltan.k...@linaro.org>; Christian Ehrhardt > >> >> <christian.ehrha...@canonical.com> > >> >> Subject: [PATCH v8 0/5] Convert DPDK configuration from command > >> >> line to DB based > >> >> > >> >> Currently, configuration of DPDK parameters is done via the > >> >> command line through a --dpdk **OPTIONS** -- command line > >> >> argument. This has a number of challenges, including: > >> >> * It must be the first option passed to ovs-vswitchd > >> >> * It breaks from the way most other things are configured in OVS > >> >> * It doesn't allow an easy way to populate defaults > >> >> > >> >> > >> >> This series brings the following changes to openvswitch: > >> >> * All DPDK options are taken from the ovs database rather than the > >> >> command line > >> >> * DPDK lcores are optionally auto-assigned to a single core based > >> >> on > >> the > >> >> bridge coremask. > >> >> * Updated documentation > >> >> > >> >> v2: > >> >> * Dropped the vhost-user socket configuration options. Those can > >> >> be > >> re-added > >> >> as an extension > >> >> * Incorporated feedback from Kevin Traynor. > >> >> > >> >> v3: > >> >> * Went back to a global dpdk-init > >> >> * Language cleanup and various minor fixes > >> >> > >> >> v4: > >> >> * Added a way to pass arbitrary eal arguments > >> >> > >> >> v5: > >> >> * Restore the socket-mem default, and fix up the ovs-dev.py > >> >> script, > >> along > >> >> with the manpage for ovsdb-server > >> >> > >> >> v6: > >> >> * Correct a documentation issue with INSTALL.DPDK.md > >> >> * Correct a non-dpdk enabled OVS incorrect warning variable > >> >> * Remove an excess whitespace > >> >> > >> >> v7: > >> >> * After testing by Christian with dpdk-alloc-mem > >> >> > >> >> v8: > >> >> * Confirmed ``make check`` operation with and without dpdk. > >> >> Retested on live-host > >> > > >> > Hi, > >> > > >> > I've done some testing on this patchset and I couldn't find any > >> > issues. > >> > >> Cool; does that mean I have your Tested-by? :) > >> > >> > - tested that -c and -n defaults and explicit values are catered > >> > for > >> > - tested dpdk-init=t/f leads to dpdk initialization or not > >> > - tested that use of both dpdk-socket-mem and dpdk-alloc-mem is > >> > caught > >> > - tested that a string can be passed in through extra_args > >> > - tested the code won't catch using a db entry dpdk-socket-mem and > >> also > >> > putting --socket-mem in extra_args, however dpdk will barf > >> > > >> > On command line args vs. db entries vs. a string of args in the db, > >> > if there is doubt on this then let's debate further. This will > >> > change how ovs with dpdk is used, so better debate it out and get > it right. > >> > >> I don't think there's any real doubt. I think the approach is the > >> best way to do this. I have agreement from almost everyone else, I > think? > >> Anyone still need to be convinced? > > From an orchestration point of view the explcit values are nice but > > form a Deployment point of view I favor only having > > dpdk_init:true/false and dpdk_extra:<cmd line sting> > > Okay, but I will point out an advantage of doing it the other way means > saving off DBs and replicating them in other environments. While it's > true you can do that with just the dpdk_extra flag, it means you have to > read the extras, find the exact part in the string to change, and write > back. With enumerated db entries, you can just modify those things you > care to change. I think it's just a 'getting used to it' rather than > 'omg a workflow has changed', but that's only my opinion. > > > In particular if is set a value in dpdk_extra and in the future you > > add a explicit entry in the db to Hold that value I never want it to > > brake my deployment code which still uses dpdk_extra. Where db entries > > Exist for an item it would make my life eaiser if they were always set > > automatically when the vswitchd starts > > Okay, I've rebased my series and included this as patch 5/6. Just doing > some brief testing now, and I'll repost to the list. > > > e.g. if I never spcify the vhostuser_socket_dir it would be nice if it > > was set by ovsvswitchd to the compile time default. > > Similar if I override the default of a value in dpdk_extra then it > > would be nice if when an explicit value for that item exits In the db > > that it be updated to match the value specified in the dpdk_extra > args. > > I would rather not try to do this. Instead, my approach is emit a > warning that the value in the dpdk_extras string was used, and let the > user do such management. There's only so much intelligence that I think > OVS should provide. For context I am considering the usecase where ovs is managed via openstack, either directly or via odl. In the odl case OpenStack neutron does not have any direct access to the host or the ovs instance that it is running. As a result neutron can only discover information about how ovs is configured by querying the odl topology api Which has a copy of the ovsdb tables. As such I cannot programmatically retrieve that warning and use it to determine which value to use. This is not an issue provided there is a rule that either the explicit db entry or the dpdk_extra string takes precedence I just need to know which one to prefer. The reason I would like this to always be set is, if I have ovs with dpdk support and I am running it with the defaults install from the package manager e.g. I do not Manually configure the vhostuser_socket_dir, I would still like it to be reported in the ovsdb. The issue that I would like to address is that the vhostuser_socket_dir default location depends on os as it is tied to the default rundir location. On fedora 20+ it is /var/run/openvswitch on centos 7 and Ubuntu 14.04+ it is /usr/var/run/openvswitch. When ovs is managed via the OpenStack neutron agent we added a per host config option to the agent to specify the vhostuser_socket_dir. This allows openstack neutron to tell nova what the path should be when spawning a vm. In the case of odl there is no local agent on the node and also no config file. Currently as there is no way to discover the vhostuser_socket_dir via ovsdb or openflow we are forced to assume it is /var/run/openvswitch and that either ovs was compiled with --with-rundir=/var/run/openvswitch or that it is started with -vhost_sock_dir /var/run/openvswitch If the vhostuser_socket_dir was always reported in the ovsdb then we could always Read this and support multiple distros more easily. We can work around this by falling back to /var/run/openvswitch if the value is not found in the ovsdb but where these values have compilation Time defaults it would be nice if they were reflected in the ovsdb also. Maybe the compile Time default could be set as part of ovsdb-tool create? That way the ovsvswitchd can be kept simpler As it would not have to set them. > > > If this is not the case I cannot true the explicit values in the ovsdb > > as they may not be valid. > > Not sure what you mean here. Are you saying that you won't know where to > look to get the real values? I agree - that's why I emit the warning > saying where the value comes from and hope that such log will be > watched. Does that sound unreasonable? I did not phrase that well. Basically if a key is set both in the dpdk Extra string and as an explicit value I need to know which one reflects the Actual runtime configuration of ovs. E.g. always prefer the explicit value Or always prefer the dpdk_extra value. > > >> > >> > There's one or two of the db entries that may be able to reused > >> > later for other things e.g. vhostuser socket location, so that > >> > would be a + > >> for them. > >> > Backwards compatibility would be a + for command line args. Daniele > >> > has mentioned scripting also. I'm sure there's other +/-'s. > >> > >> I don't know - scripting vswitchd? I think that sounds a little > >> strange; > > It depend on your usecase. I am a developer on neutron integrating ovs > > with dpkd with Openstack. I also maintain a devstack plugin to deploy > > ovs with dpdk from source. > > > > In keeping with other service in devstack we run the ovsvswitchd in a > > screen session. > > https://github.com/openstack/networking-ovs-dpdk/blob/master/devstack/ > > ovs-dpdk/ovs-dpdk-init#L528 we heavily script the compilation and > > deployment of ovs and treat it like any other ephemeral service. > > Our ovs-dpdk service file is 700 lines long and we have separate logic > > to compile and install it. > > > > In production sure it won't be started and stopped frequently but when > > used in a development environment It can have a very sort lifetime. > > Wow. That sounds kindof horrible, actually. There's a script included > with OVS which compiles and starts OVS (see utilities/ovs-dev.py). I > think that should be preferred, and will let you specify the arguments > you wish, I think. Actually our code to build and install ovs is only about 30 lines so the utilities/ovs-dev.py script would not dramatically reduces our maintenance burden in that respect but I will look into it. the original deployment code we have predates dpdk support in ovs and utilities/ovs-dev.py by several months as we used it internally to deploy ovs-dpdk when dpdk support was being added. A large portion of the complexity in those 700 lines exist because of how dpdk physical port have to be added to ovs. Adding dpdk phy ports is my number 1 useablity issue with ovs-dpdk. The current condensed workflow is as follows Map netdev names to pci address for each interface Bind interface to dpdk driver Generate ovs port name for each netdev as dpdkX where X is the index into the array Of dpdk ports sorted by pciaddress Start ovsdb Create bridges,set datapath=netdev and add port with --no-wait as ovsvswitchd is not running. And finally start the ovsvswitchd with the dpdk interface in the eal pci whitelist. Note: the whitelist is need to ensure that the array of dpdk pmds create in the eal init is the Same as the one we used earlier to calculate the dpdkX names If we could simply add an dpdk phy port while the ovsvswitchd is running By simply using the following workflow it would improve the usability greatly. Start ovsdb and ovsvswitchd normally Bind interface to dpdk driver then run ovs-vsctl add-port <bridge> <arbitrary port name> -- set interface <arbitrary port name> type=dpdk other-config="pci-address=xyz" or if I was even bolder just Start ovsdb and ovsvswitchd normally ovs-vsctl add-port <bridge> <arbitrary port name> -- set interface <arbitrary port name> type=dpdk other-config="pci-address=xyz,driver=igb_uio,olddriver=ixgbe" though I understand why ovs does not want to manage the bining/unbining of the interface to the dpdk driver but that is the hardest part to script correctly. Sorry for the mini rant but I like the direction that this patch chain is takeing ovs configuration It will improve usability with dpdk a lot. The physical port issue is just one I have been dealing with Since before ovs had support for dpdk in tree. > > Regardless, sure that script would have to change a bit, but I think > you'll want to change it anyway (there's a new user feature in OVS, plus > I'm adding support for vhost-user socket group ids, so the 'sg' will not > be the advised way of starting ovs-vswitchd, for instance). There have That really good to hear as that was an annoying pain point to work around. Will this by default allow read write access to the same group? If not I would still need to change the umask to 002. It would be nice to be able To specify the vhost-user socket permissions also in the event that the group Will not have read write access by default. > been some new feature developments with OVS that you could leverage to > get the same effect with less commands. I would suggest just changing > L511 to do the db sets there. It has the added effect of shrinking L520 > and L528. Yes this is what I had in mind. We would simply set these values when we Create the ovsdb. The actually changes required on my side are minimal. > > Not to mention -vhost_sock_dir is not actually an EAL argument (which is > why I'm changing that whole infrastructure). If that got passed to dpdk > directly, it would not know what to do with it. > > Anyway, I realize that there could be some breakage, and would happily > work with you to resolve it :) > Thanks I think the changes should be easy enough. The only tricky bit is ensuring I can deploy ovs commits before and after this Change with the same init script. I am thinking of just greping the vswitch.ovsschema File for the new dpdk_init value and then fall back to the old logic if it was not found and use the new logic if it was. > >> it isn't some kind of ephemeral service that comes and goes. And it's > >> not like this patch prevents the same kinds of arbitrary commands to > >> be passed to the EAL (since 4/5 does precisely that). The only change > >> required is doing ovs-vsctl before ovs-vswitch in the 'starting > >> vswitchd' case. Is that really a huge deal? > >> > >> There's always pros and cons. I haven't heard any explict NAK, or any > >> explicit ACK. It would be nice for that to happen, since I can't > >> maintain this series out-of-tree forever, and there are other things > >> I'd really like to get to (as fun as db entries may be). > >> > >> Any suggestions on how to move forward? I was planning on posting a > >> rebased v9 - should I still do that? > >> > >> -Aaron > >> > >> > Kevin. > >> > > >> >> > >> >> Aaron Conole (5): > >> >> netdev-dpdk: Restore thread affinity after DPDK init > >> >> netdev-dpdk: Convert initialization from cmdline to db > >> >> netdev-dpdk: Autofill lcore coremask if absent > >> >> netdev-dpdk: Allow arbitrary eal arguments > >> >> NEWS: Announce the DPDK EAL configuration change > >> >> > >> >> FAQ.md | 6 +- > >> >> INSTALL.DPDK.md | 90 ++++++++++--- > >> >> NEWS | 5 + > >> >> lib/netdev-dpdk.c | 327 > >> ++++++++++++++++++++++++++++++++++++++----- > >> >> -- > >> >> lib/netdev-dpdk.h | 22 ++- > >> >> utilities/ovs-dev.py | 7 +- > >> >> vswitchd/bridge.c | 3 + > >> >> vswitchd/ovs-vswitchd.8.in | 5 +- > >> >> vswitchd/ovs-vswitchd.c | 25 +--- > >> >> vswitchd/vswitch.xml | 128 +++++++++++++++++- > >> >> 10 files changed, 513 insertions(+), 105 deletions(-) > >> >> > >> >> -- > >> >> 2.5.0 > >> > > >> > _______________________________________________ > >> > dev mailing list > >> > dev@openvswitch.org > >> > http://openvswitch.org/mailman/listinfo/dev _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev