On Tue, Jun 16, 2020 at 3:01 PM Burakov, Anatoly <anatoly.bura...@intel.com> wrote: > > On 15-Jun-20 5:29 PM, Jerin Jacob wrote: > > On Mon, Jun 15, 2020 at 9:15 PM Burakov, Anatoly > > <anatoly.bura...@intel.com> wrote: > >> > >> On 15-Jun-20 4:21 PM, Jerin Jacob wrote: > >>> On Mon, Jun 15, 2020 at 8:35 PM Burakov, Anatoly > >>> <anatoly.bura...@intel.com> wrote: > >>>> > >>>> On 15-Jun-20 12:43 PM, Jerin Jacob wrote: > >>>>> On Mon, Jun 15, 2020 at 5:02 PM Burakov, Anatoly > >>>>> <anatoly.bura...@intel.com> wrote: > >>>>>> > >>>>>> On 02-Jun-20 1:16 PM, Harman Kalra wrote: > >>>>>>> On Tue, Jun 02, 2020 at 03:53:07PM +0530, Harman Kalra wrote: > >>>>>>>> On Mon, Jun 01, 2020 at 01:50:26PM +0100, Burakov, Anatoly wrote: > >>>>>>>>> On 30-May-20 11:02 AM, Harman Kalra wrote: > >>>>>>>>>> On Fri, May 29, 2020 at 03:19:45PM +0100, Burakov, Anatoly wrote: > >>>>>>>>>>> External Email > >>>>>>>>>>> > >>>>>>>>>>> ---------------------------------------------------------------------- > >>>>>>>>>>> On 29-May-20 2:19 PM, Harman Kalra wrote: > >>>>>>>>>>> > >>>>>>>>>>>>> if (ret < 0) > >>>>>>>>>>>>> rte_exit(EXIT_FAILURE, "Invalid L3FWD > >>>>>>>>>>>>> parameters\n"); > >>>>>>>>>>>>> - if (app_mode != APP_MODE_TELEMETRY && > >>>>>>>>>>>>> init_power_library()) > >>>>>>>>>>>>> + if (app_mode == APP_MODE_DEFAULT) > >>>>>>>>>>>>> + app_mode = APP_MODE_LEGACY; > >>>>>>>>>>>>> + > >>>>>>>>>>>>> + /* only legacy and empty poll mode rely on power > >>>>>>>>>>>>> library */ > >>>>>>>>>>>>> + if ((app_mode == APP_MODE_LEGACY || app_mode == > >>>>>>>>>>>>> APP_MODE_EMPTY_POLL) && > >>>>>>>>>>>>> + init_power_library()) > >>>>>>>>>>>>> rte_exit(EXIT_FAILURE, "init_power_library > >>>>>>>>>>>>> failed\n"); > >>>>>>>>>>>> Hi, > >>>>>>>>>>>> > >>>>>>>>>>>> Rather than just exiting from here can we have a else condition > >>>>>>>>>>>> to > >>>>>>>>>>>> automatically enter into the "interrupt only" mode. > >>>>>>>>>>>> Please correct me if I am missing something. > >>>>>>>>>>> > >>>>>>>>>>> Hi, > >>>>>>>>>>> > >>>>>>>>>>> Thanks for your review. I don't think silently proceeding is a > >>>>>>>>>>> good idea. If > >>>>>>>>>>> the user wants interrupt-only mode, they should request it. > >>>>>>>>>>> Silently falling > >>>>>>>>>>> back to interrupt-only mode will create an illusion of successful > >>>>>>>>>>> initialization and set the wrong expectation for how the > >>>>>>>>>>> application will > >>>>>>>>>>> behave. > >>>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> Hi, > >>>>>>>>>> > >>>>>>>>>> Thanks for the explanation which even I also believe is logically > >>>>>>>>>> perfect. > >>>>>>>>>> > >>>>>>>>>> But since l3fwd-power is an old application and has many users > >>>>>>>>>> around > >>>>>>>>>> which are currently using this app in interrupt only mode without > >>>>>>>>>> giving > >>>>>>>>>> an extra argument. But suddenly they will start getting failure > >>>>>>>>>> messages with > >>>>>>>>>> the new patchset. > >>>>>>>>>> > >>>>>>>>>> My only intent with else condition was backward compatibility. > >>>>>>>>>> Or may be we can have more descriptive failure message, something > >>>>>>>>>> like > >>>>>>>>>> "init_power_library failed, check manual for other modes". > >>>>>>>>>> > >>>>>>>>>> Thanks > >>>>>>>>>> Harman > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>> > >>>>>>>>> I think we can compormise on an informative log message suggesting > >>>>>>>>> to use > >>>>>>>>> interrupt mode. I'm not keen on reverting to previous buggy > >>>>>>>>> behavior :) > >>>>>>>>> > >>>>>>>> Hi > >>>>>>>> > >>>>>>>> I am not insisting to revert to previous behavior, I am just trying > >>>>>>>> to > >>>>>>>> highlight some probable issues that many users might face as its an > >>>>>>>> old > >>>>>>>> application. > >>>>>>>> Since many arm based soc might not be supporting frequency scaling, > >>>>>>>> can > >>>>>>>> we add the following check as soon as the application starts, probe > >>>>>>>> the > >>>>>>>> platform if it supports frequency scaling, if not automatically set > >>>>>>>> the > >>>>>>>> mode to interrupt mode, something like: > >>>>>>>> if (access("/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor", > >>>>>>>> F_OK)) > >>>>>>>> app_mode = APP_MODE_INTERRUPT; > >>>>>>> > >>>>>>> Sorry, no direct check in application but we can introduce a new API > >>>>>>> in > >>>>>>> power library: > >>>>>>> bool rte_is_freq_scaling() { > >>>>>>> return > >>>>>>> access("/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor", > >>>>>>> F_OK); > >>>>>>> } > >>>>>>> > >>>>>>> and in the application we can use "rte_is_freq_scaling()" at the > >>>>>>> start. > >>>>>>> > >>>>>> > >>>>>> What you're suggesting here is effectively what you have already > >>>>>> suggested: silently fall back to interrupt-only mode if power lib init > >>>>>> failed. I already outlined why i don't think it's a good approach. > >>>>> > >>>>> Is probing "/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor" > >>>>> file presence, > >>>>> detects the power lib availability . Right? Not the failure. Right? > >>>>> IMO, it make sense to have following case: > >>>>> # first check, Is the system is capable of power lib if so use power lib > >>>>> # if the system is not capable of using power lib use interrupt mode. > >>>>> > >>>>> I think, there is difference between system capable of using power lib > >>>>> vs power lib not available or power lib failure. > >>>> > >>>> I am of the opinion that if a test sets up an unrealistic expectation of > >>>> how an application should behave, it's a problem with the test, not with > >>>> the application. > >>>> > >>>> If the system is not capable of running with power lib - the application > >>>> shouldn't be requested to run in such mode. > >>> > >>> But with this patch, default proposed mode is power lib without any > >>> explicit request. > >> > >> The default has always been "use the power library". That was always the > >> expected behavior, i believe since the very first version of this > >> application. In other words, even if the "requested" behavior is not > >> requested explicitly, it has always been that implicitly, and this patch > >> is *keeping existing behavior*. > > > > See below, > >> > >>> > >>>> > >>>> "The application behaved that way before" - yes, it did. It was a bug in > >>>> the application, that it allowed users to effectively misuse the > >>>> application and use it despite the fact that it was in a half-working > >>>> state. This problem has been addressed by 1) not allowing the > >>>> application to run in half-working state, and 2) adding a new mode where > >>>> the old "expected" behavior is *actually* expected and is "full working > >>>> state" now. > >>>> > >>>> Therefore, all users who were previously misusing the application to do > >>>> something it was not designed to do because of a bug in the > >>>> implementation, should now fix their usage and use the correct mode - > >>>> and such breakage is IMO necessary to call attention to earlier misuse > >>>> in the tools, and to correct this usage. > >>>> > >>>> What bothers me about your suggestion is that it is impossible to fail > >>>> the test if the wrong mode was requested (as in, if we request the > >>>> power-lib mode on a system that doesn't have freq scaling) - it instead > >>>> silently falls back to a mode that is almost guaranteed to work. > >>> > >>> I agree that it should fail, i.e someone explicitly request, > >>> power-lib mode or any mode > >>> and it should fail application if the platform we can not do that. > >>> > >>> My suggestion is all about, what is the default, IMO, if no argument > >>> is specified, > >>> the application should _probe_ the capability of the platfrom and > >>> choose the mode. One can override > >>> the probe to select the desired one. In such mode, fail to configure > >>> the mode should result in > >>> an error. > >> > >> This would change the default behavior that has always existed with this > >> application, and would still be subject to silent failure issue > >> *because* older tests may not account for this implied assumption of > >> "the application will run no matter what", leading to a possible false > >> positive test result. > >> > >> Now, if the default was "not to run and ask explicitly what mode should > >> the user use" - i can see how we could both agree on that. It's not > >> unprecedented - l3fwd itself won't run unless you explicitly specify > >> core config, so we *could* request additional parameters by default. I > >> would've also agreed with the "probe" thing *if it was a new application > >> without any pre-existing usages* - but it isn't, and in this case IMO > >> this is doubly important because there may be tests out there that *rely > >> on a bug* and thus should be explicitly broken and addressed (like the > >> internal test we had that uncovered the issue in the first place). > >> > >> In other words, in the perfect world, i would agree with you :) As it > >> stands though, i think that intentionally breaking tests that are > >> themselves broken and rely on wrong assumptions is more important than > >> keeping them working. > > > > OK. Let's enumerate the pros and cons of both approaches? > > Approach a: Auto probe > > Approach b: Current patch > > > > Approach a: > > + Application picks up the mode based on platform capability > > + No change in behavior wrt existing l3fwd-power where the platform > > has only interrupt support. > > (otherwise, It will fail to boot up, the CI etc we need to patch based > > on the DPDK version) > > > > I am not sure approach b has pros wrt approach a. > > The power library has other ways of freq scaling, not just through the > pstate/ACPI. Now, granted, the third way (KVM) isn't supposed to work on > l3fwd-power, and is in fact explicitly stopped from working in this > patch, but still - relying on the scaling governor alone is not good enough. > > Perhaps we should add a "check available" API that would probe each of > the supported modes and return 1 for (potentially) supported and 0 for > (definitely) unsupported?
+1 for adding "check available" API for the specific mode. > As far as i know, all three are reliant on > file access at the end of the day, so we can check if they exist before > trying to access them. That would effectively be "autodetect" without > being too specific to ACPI/pstate scaling checks. +1 > > > > > I.e On the x86 platform where freq scaling is present then SHOULD NOT > > have any difference in the approach a vs > > approach b. ie. Auto probe finds the system is capable of freq scaling > > and picks the powerlib. its is win-win case, > > I am not sure, What I am missing? > > > > > > I was worried that an x86 VM would still have this file without being > capable of frequency scaling, but i just checked and it seems that VM's > indeed don't expose the cpufreq filesystem. Looks good then!! > > -- > Thanks, > Anatoly