"Mooney, Sean K" <sean.k.moo...@intel.com> writes: > Hi sorry to jump back to this mail as I know the v9 patches are in flight. > I was on vacation for the last few days but my responses are inline. > I'll try to look at the revised patches also but I think you have > addressed most of my concerns.
The only real difference is patch 5/6 (your desire for dpdk-extra fallback). Thanks so much for the feedback, Sean! I hope you get a chance to try out the new series and see if it makes things easier. > >> -----Original Message----- >> From: Aaron Conole [mailto:acon...@redhat.com] >> Sent: Thursday, February 11, 2016 6:13 PM >> To: Mooney, Sean K <sean.k.moo...@intel.com> >> Cc: Traynor, Kevin <kevin.tray...@intel.com>; dev@openvswitch.org; >> Flavio Leitner <f...@sysclose.org> >> Subject: Re: [ovs-dev] [PATCH v8 0/5] Convert DPDK configuration from >> command line to DB based >> >> Hi Sean, >> >> "Mooney, Sean K" <sean.k.moo...@intel.com> writes: >> >> > Overall I like this change but as a user of ovs with dpdk and A >> > maintainer of a plugin to deploy it I have some comments inline. >> >> AWESOME! Seriously, I will probably be bothering you a lot because you >> are the person I want to hear from the most on this patchset. :) >> >> >> -----Original Message----- >> >> From: Aaron Conole [mailto:acon...@bytheb.org] >> >> Sent: Wednesday, February 10, 2016 1:23 PM >> >> To: Traynor, Kevin >> >> Cc: dev@openvswitch.org; Flavio Leitner; Mooney, Sean K >> >> Subject: Re: [ovs-dev] [PATCH v8 0/5] Convert DPDK configuration from >> >> command line to DB based >> >> >> >> "Traynor, Kevin" <kevin.tray...@intel.com> writes: >> >> >> -----Original Message----- >> >> >> From: Aaron Conole [mailto:acon...@redhat.com] >> >> >> Sent: Friday, January 29, 2016 5:57 PM >> >> >> To: dev@openvswitch.org >> >> >> Cc: Flavio Leitner <f...@sysclose.org>; Panu Matilainen >> >> >> <pmati...@redhat.com>; Traynor, Kevin <kevin.tray...@intel.com>; >> >> >> Zoltan Kiss <zoltan.k...@linaro.org>; Christian Ehrhardt >> >> >> <christian.ehrha...@canonical.com> >> >> >> Subject: [PATCH v8 0/5] Convert DPDK configuration from command >> >> >> line to DB based >> >> >> >> >> >> Currently, configuration of DPDK parameters is done via the >> >> >> command line through a --dpdk **OPTIONS** -- command line >> >> >> argument. This has a number of challenges, including: >> >> >> * It must be the first option passed to ovs-vswitchd >> >> >> * It breaks from the way most other things are configured in OVS >> >> >> * It doesn't allow an easy way to populate defaults >> >> >> >> >> >> >> >> >> This series brings the following changes to openvswitch: >> >> >> * All DPDK options are taken from the ovs database rather than the >> >> >> command line >> >> >> * DPDK lcores are optionally auto-assigned to a single core based >> >> >> on >> >> the >> >> >> bridge coremask. >> >> >> * Updated documentation >> >> >> >> >> >> v2: >> >> >> * Dropped the vhost-user socket configuration options. Those can >> >> >> be >> >> re-added >> >> >> as an extension >> >> >> * Incorporated feedback from Kevin Traynor. >> >> >> >> >> >> v3: >> >> >> * Went back to a global dpdk-init >> >> >> * Language cleanup and various minor fixes >> >> >> >> >> >> v4: >> >> >> * Added a way to pass arbitrary eal arguments >> >> >> >> >> >> v5: >> >> >> * Restore the socket-mem default, and fix up the ovs-dev.py >> >> >> script, >> >> along >> >> >> with the manpage for ovsdb-server >> >> >> >> >> >> v6: >> >> >> * Correct a documentation issue with INSTALL.DPDK.md >> >> >> * Correct a non-dpdk enabled OVS incorrect warning variable >> >> >> * Remove an excess whitespace >> >> >> >> >> >> v7: >> >> >> * After testing by Christian with dpdk-alloc-mem >> >> >> >> >> >> v8: >> >> >> * Confirmed ``make check`` operation with and without dpdk. >> >> >> Retested on live-host >> >> > >> >> > Hi, >> >> > >> >> > I've done some testing on this patchset and I couldn't find any >> >> > issues. >> >> >> >> Cool; does that mean I have your Tested-by? :) >> >> >> >> > - tested that -c and -n defaults and explicit values are catered >> >> > for >> >> > - tested dpdk-init=t/f leads to dpdk initialization or not >> >> > - tested that use of both dpdk-socket-mem and dpdk-alloc-mem is >> >> > caught >> >> > - tested that a string can be passed in through extra_args >> >> > - tested the code won't catch using a db entry dpdk-socket-mem and >> >> also >> >> > putting --socket-mem in extra_args, however dpdk will barf >> >> > >> >> > On command line args vs. db entries vs. a string of args in the db, >> >> > if there is doubt on this then let's debate further. This will >> >> > change how ovs with dpdk is used, so better debate it out and get >> it right. >> >> >> >> I don't think there's any real doubt. I think the approach is the >> >> best way to do this. I have agreement from almost everyone else, I >> think? >> >> Anyone still need to be convinced? >> > From an orchestration point of view the explcit values are nice but >> > form a Deployment point of view I favor only having >> > dpdk_init:true/false and dpdk_extra:<cmd line sting> >> >> Okay, but I will point out an advantage of doing it the other way means >> saving off DBs and replicating them in other environments. While it's >> true you can do that with just the dpdk_extra flag, it means you have to >> read the extras, find the exact part in the string to change, and write >> back. With enumerated db entries, you can just modify those things you >> care to change. I think it's just a 'getting used to it' rather than >> 'omg a workflow has changed', but that's only my opinion. >> >> > In particular if is set a value in dpdk_extra and in the future you >> > add a explicit entry in the db to Hold that value I never want it to >> > brake my deployment code which still uses dpdk_extra. Where db entries >> > Exist for an item it would make my life eaiser if they were always set >> > automatically when the vswitchd starts >> >> Okay, I've rebased my series and included this as patch 5/6. Just doing >> some brief testing now, and I'll repost to the list. >> >> > e.g. if I never spcify the vhostuser_socket_dir it would be nice if it >> > was set by ovsvswitchd to the compile time default. >> > Similar if I override the default of a value in dpdk_extra then it >> > would be nice if when an explicit value for that item exits In the db >> > that it be updated to match the value specified in the dpdk_extra >> args. >> >> I would rather not try to do this. Instead, my approach is emit a >> warning that the value in the dpdk_extras string was used, and let the >> user do such management. There's only so much intelligence that I think >> OVS should provide. > > For context I am considering the usecase where ovs is managed via openstack, > either directly or via odl. In the odl case OpenStack neutron does not have > any > direct access to the host or the ovs instance that it is running. As a > result neutron > can only discover information about how ovs is configured by querying > the odl topology api > Which has a copy of the ovsdb tables. As such I cannot programmatically > retrieve > that warning and use it to determine which value to use. This is not > an issue provided > there is a rule that either the explicit db entry or the dpdk_extra > string takes precedence > I just need to know which one to prefer. Okay, hopefully it's clear now; that also means we need to make sure that any entries always exist in the database, I guess. I don't know if that has any effect on the way I've implemented this series, but I'll ask. > The reason I would like this to always be set is, if I have ovs with dpdk > support > and I am running it with the defaults install from the package manager > e.g. I do not > Manually configure the vhostuser_socket_dir, I would still like it to > be reported in > the ovsdb. The issue that I would like to address is that the > vhostuser_socket_dir > default location depends on os as it is tied to the default rundir location. > On fedora 20+ it is /var/run/openvswitch on centos 7 and Ubuntu 14.04+ it > is /usr/var/run/openvswitch. Understood. I'll keep this in mind in the follow-on series (which addresses vhost-user directories and permissions). > When ovs is managed via the OpenStack neutron agent we added a per > host config option > to the agent to specify the vhostuser_socket_dir. This allows > openstack neutron to tell nova > what the path should be when spawning a vm. In the case of odl there > is no local agent > on the node and also no config file. Currently as there is no way to > discover the vhostuser_socket_dir > via ovsdb or openflow we are forced to assume it is > /var/run/openvswitch and that either > ovs was compiled with --with-rundir=/var/run/openvswitch or that it is > started with > -vhost_sock_dir /var/run/openvswitch > > If the vhostuser_socket_dir was always reported in the ovsdb then we could > always > Read this and support multiple distros more easily. We can work around > this by falling back to > /var/run/openvswitch if the value is not found in the ovsdb but where > these values have compilation > Time defaults it would be nice if they were reflected in the ovsdb > also. Maybe the compile > Time default could be set as part of ovsdb-tool create? That way the > ovsvswitchd can be kept simpler > As it would not have to set them. Intersting idea. I'll think about it, because it may have issue on upgrade/downgrade, but maybe I'm overthinking it. >> >> > If this is not the case I cannot true the explicit values in the ovsdb >> > as they may not be valid. >> >> Not sure what you mean here. Are you saying that you won't know where to >> look to get the real values? I agree - that's why I emit the warning >> saying where the value comes from and hope that such log will be >> watched. Does that sound unreasonable? > > I did not phrase that well. Basically if a key is set both in the dpdk > Extra string and as an explicit value I need to know which one reflects the > Actual runtime configuration of ovs. E.g. always prefer the explicit value > Or always prefer the dpdk_extra value. Hopefully, again, this is clear now. dpdk-extra is preferred values. >> >> >> >> >> > There's one or two of the db entries that may be able to reused >> >> > later for other things e.g. vhostuser socket location, so that >> >> > would be a + >> >> for them. >> >> > Backwards compatibility would be a + for command line args. Daniele >> >> > has mentioned scripting also. I'm sure there's other +/-'s. >> >> >> >> I don't know - scripting vswitchd? I think that sounds a little >> >> strange; >> > It depend on your usecase. I am a developer on neutron integrating ovs >> > with dpkd with Openstack. I also maintain a devstack plugin to deploy >> > ovs with dpdk from source. >> > >> > In keeping with other service in devstack we run the ovsvswitchd in a >> > screen session. >> > https://github.com/openstack/networking-ovs-dpdk/blob/master/devstack/ >> > ovs-dpdk/ovs-dpdk-init#L528 we heavily script the compilation and >> > deployment of ovs and treat it like any other ephemeral service. >> > Our ovs-dpdk service file is 700 lines long and we have separate logic >> > to compile and install it. >> > >> > In production sure it won't be started and stopped frequently but when >> > used in a development environment It can have a very sort lifetime. >> >> Wow. That sounds kindof horrible, actually. There's a script included >> with OVS which compiles and starts OVS (see utilities/ovs-dev.py). I >> think that should be preferred, and will let you specify the arguments >> you wish, I think. > > Actually our code to build and install ovs is only about 30 lines so the > utilities/ovs-dev.py script would not dramatically reduces our maintenance > burden in that respect but I will look into it. the original deployment code > we have predates dpdk support in ovs and utilities/ovs-dev.py by several > months > as we used it internally to deploy ovs-dpdk when dpdk support was being added. > > A large portion of the complexity in those 700 lines exist because of how > dpdk physical port have to be added to ovs. Adding dpdk phy ports is my > number 1 useablity issue with ovs-dpdk. > > The current condensed workflow is as follows > > Map netdev names to pci address for each interface > Bind interface to dpdk driver > Generate ovs port name for each netdev as dpdkX where X is the index into the > array > Of dpdk ports sorted by pciaddress > Start ovsdb > Create bridges,set datapath=netdev and add port with --no-wait as > ovsvswitchd is not running. > And finally start the ovsvswitchd with the dpdk interface in the eal pci > whitelist. > Note: the whitelist is need to ensure that the array of dpdk pmds > create in the eal init is the > Same as the one we used earlier to calculate the dpdkX names Hrrm... maybe there's an additional convenience that can be provided here. That's a different feature altogether, though :) > If we could simply add an dpdk phy port while the ovsvswitchd is running > By simply using the following workflow it would improve the usability > greatly. Stay tuned - I think dpdk will support this in either 16.04 or 16.07 or something like that :) It's being worked on, definitely. > Start ovsdb and ovsvswitchd normally > Bind interface to dpdk driver then run > ovs-vsctl add-port <bridge> <arbitrary port name> -- set interface > <arbitrary port name> type=dpdk other-config="pci-address=xyz" > > or if I was even bolder just > Start ovsdb and ovsvswitchd normally > ovs-vsctl add-port <bridge> <arbitrary port name> -- set interface > <arbitrary port name> type=dpdk > other-config="pci-address=xyz,driver=igb_uio,olddriver=ixgbe" > > though I understand why ovs does not want to manage the > bining/unbining of the interface to the dpdk driver but that is > the hardest part to script correctly. There is driverctl being added. Not sure if it helps or not. > Sorry for the mini rant but I like the direction that this patch chain > is takeing ovs configuration > It will improve usability with dpdk a lot. The physical port issue is > just one I have been dealing with > Since before ovs had support for dpdk in tree. Okay, cool. I'll think about anything that can be done from OVS+DPDK side to help out with this (if there even is anything). >> >> Regardless, sure that script would have to change a bit, but I think >> you'll want to change it anyway (there's a new user feature in OVS, plus >> I'm adding support for vhost-user socket group ids, so the 'sg' will not >> be the advised way of starting ovs-vswitchd, for instance). There have > > > That really good to hear as that was an annoying pain point to work around. > Will this by default allow read write access to the same group? If not > I would still need to change the umask to 002. It would be nice to be able > To specify the vhost-user socket permissions also in the event that the group > Will not have read write access by default. > >> been some new feature developments with OVS that you could leverage to >> get the same effect with less commands. I would suggest just changing >> L511 to do the db sets there. It has the added effect of shrinking L520 >> and L528. > > Yes this is what I had in mind. We would simply set these values when we > Create the ovsdb. The actually changes required on my side are minimal. > >> >> Not to mention -vhost_sock_dir is not actually an EAL argument (which is >> why I'm changing that whole infrastructure). If that got passed to dpdk >> directly, it would not know what to do with it. >> >> Anyway, I realize that there could be some breakage, and would happily >> work with you to resolve it :) >> > > > Thanks I think the changes should be easy enough. > The only tricky bit is ensuring I can deploy ovs commits before and after this > Change with the same init script. I am thinking of just greping the > vswitch.ovsschema > File for the new dpdk_init value and then fall back to the old logic > if it was not found and use the new logic if it was. > > >> >> it isn't some kind of ephemeral service that comes and goes. And it's >> >> not like this patch prevents the same kinds of arbitrary commands to >> >> be passed to the EAL (since 4/5 does precisely that). The only change >> >> required is doing ovs-vsctl before ovs-vswitch in the 'starting >> >> vswitchd' case. Is that really a huge deal? >> >> >> >> There's always pros and cons. I haven't heard any explict NAK, or any >> >> explicit ACK. It would be nice for that to happen, since I can't >> >> maintain this series out-of-tree forever, and there are other things >> >> I'd really like to get to (as fun as db entries may be). >> >> >> >> Any suggestions on how to move forward? I was planning on posting a >> >> rebased v9 - should I still do that? >> >> >> >> -Aaron >> >> >> >> > Kevin. >> >> > >> >> >> >> >> >> Aaron Conole (5): >> >> >> netdev-dpdk: Restore thread affinity after DPDK init >> >> >> netdev-dpdk: Convert initialization from cmdline to db >> >> >> netdev-dpdk: Autofill lcore coremask if absent >> >> >> netdev-dpdk: Allow arbitrary eal arguments >> >> >> NEWS: Announce the DPDK EAL configuration change >> >> >> >> >> >> FAQ.md | 6 +- >> >> >> INSTALL.DPDK.md | 90 ++++++++++--- >> >> >> NEWS | 5 + >> >> >> lib/netdev-dpdk.c | 327 >> >> ++++++++++++++++++++++++++++++++++++++----- >> >> >> -- >> >> >> lib/netdev-dpdk.h | 22 ++- >> >> >> utilities/ovs-dev.py | 7 +- >> >> >> vswitchd/bridge.c | 3 + >> >> >> vswitchd/ovs-vswitchd.8.in | 5 +- >> >> >> vswitchd/ovs-vswitchd.c | 25 +--- >> >> >> vswitchd/vswitch.xml | 128 +++++++++++++++++- >> >> >> 10 files changed, 513 insertions(+), 105 deletions(-) >> >> >> >> >> >> -- >> >> >> 2.5.0 >> >> > >> >> > _______________________________________________ >> >> > dev mailing list >> >> > dev@openvswitch.org >> >> > http://openvswitch.org/mailman/listinfo/dev > _______________________________________________ > dev mailing list > dev@openvswitch.org > http://openvswitch.org/mailman/listinfo/dev _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev