On Wed, Aug 30, 2017 at 06:11:47AM +0000, Matan Azrad wrote:
> Hi Gaetan
>
> > -----Original Message-----
> > From: Gaëtan Rivet [mailto:gaetan.ri...@6wind.com]
> > Sent: Tuesday, August 29, 2017 7:34 PM
> > To: Matan Azrad <ma...@mellanox.com>
> > Cc: dev@dpdk.org; Raslan Darawsheh <rasl...@mellanox.com>;
> > sta...@dpdk.org
> > Subject: Re: [PATCH] net/failsafe: fix exec parameter parsing error flow
> >
> > Hi Matan,
> >
> > On Tue, Aug 29, 2017 at 05:59:08PM +0300, Matan Azrad wrote:
> > > The faulty code returns a success value when the output stream of
> > > the executed process is empty (EOF).
> > > This causes a segmentation fault when failsafe polls this command line
> > > again, then gets success and tries to hotplug-add the sub device
> > > by dereferencing an uninitialized pointer.
> > >
> >
> > This is a bug and should be fixed, thanks.
> >
Actually, I am unable to reproduce this bug.
Do you have a fail-safe command line that would showcase this behavior?

> > > Moreover, when the output is not empty but incorrect, failsafe returns
> > > an error from its probe function while the expected behavior is to keep
> > > polling until the output is correct.
> > >
> >
> > The expected behavior is for the fail-safe to return an error if the
> > execution of the given command returns an error.
> >
> > The intention is that users writing such a script would be able to output
> > blank lines in case there is nothing to probe, but still remain aware of
> > issues during the execution of the command.
> >
> > The fail-safe ignores errors pertaining to absent devices due to its nature.
> > This does not mean that it should ignore all errors and try to keep on going
> > while everything else is on fire.
> >
> > The contract with the user is that "blank line" without other errors means
> > "absent device". Garbled output or return code != 0 means runtime error
> > and should be thrown to the user / application.
> >
>
> OK, good, I would have signed this contract :)
>
> What about the case where the output is not empty and parsing fails with
> an error during the polling process?
> I think in the current code failsafe just continues normally and tries
> again at the next polling time.
> Because of this code I thought that if an error occurs we should poll
> again...
>

It depends on whether the fail-safe has already been initialized or not.

During the initialization phase, any error other than -ENODEV means that
it must stop and force the user to look into it.

When initialization has finished, if a polling error occurs, the fail-safe
will try to minimize service disruption to the potentially existing
sub-devices. It thus discards the error and will try again later.

> Can you please add it (the contract) to the failsafe documentation for the
> exec parameter?
>
> > > The fix changes the return value to be -ENODEV for this sub device in
> > > the two cases.
> > > This way, failsafe tries to parse this sub device parameter with the
> > > exec method until the output is correct.
> > >
> >
> > The issue is that this portion of the code will be heavily modified anyway.
> > The errno handling is erroneous and must be fixed, which is in conflict
> > with your patch.
> >
> > I will send the intended fix shortly, referencing this patch and the issue
> > you highlighted, but the two patches won't be compatible.
> >
>
> Good, no problems.
>
> > > Fixes: a0194d828100 ("net/failsafe: add flexible device definition")
> > > Fixes: 35ffe4208140 ("net/failsafe: fix missing pclose after popen")
> > > Cc: sta...@dpdk.org
> > >
> > > Signed-off-by: Matan Azrad <ma...@mellanox.com>
> > > ---
> > >  drivers/net/failsafe/failsafe_args.c | 6 +++++-
> > >  1 file changed, 5 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/net/failsafe/failsafe_args.c b/drivers/net/failsafe/failsafe_args.c
> > > index 645c885..61c55df 100644
> > > --- a/drivers/net/failsafe/failsafe_args.c
> > > +++ b/drivers/net/failsafe/failsafe_args.c
> > > @@ -157,12 +157,16 @@ fs_execute_cmd(struct sub_device *sdev, char *cmdline)
> > >  	ret = fs_parse_device(sdev, output);
> > >  	if (ret) {
> > >  		ERROR("Parsing device '%s' failed", output);
> > > +		ret = -ENODEV;
>
> Remove the above line so that the probe function reports the error.
>
> > >  		goto ret_pclose;
> > >  	}
> > >  ret_pclose:
> > >  	pclose_ret = pclose(fp);
> > >  	if (pclose_ret) {
> > > -		pclose_ret = errno;
> > > +		if (errno == 0)
> > > +			errno = -(pclose_ret = ret);
> > > +		else
> > > +			pclose_ret = errno;
> > >  		ERROR("pclose: %s", strerror(errno));
> > >  		errno = old_err;
> > >  		return pclose_ret;
> > > --
> > > 2.7.4
> > >
> >
> > Best regards,
> > --
> > Gaëtan Rivet
> > 6WIND
>
> Thanks,
> Matan Azrad

--
Gaëtan Rivet
6WIND
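
For illustration only, the contract discussed in this thread ("blank line"
means absent device, garbled output or a non-zero return code means runtime
error, and only -ENODEV is tolerated during initialization) might be sketched
as below. This is not the fail-safe PMD code: the helper names, the choice of
-EIO and -EINVAL as error codes, and treating empty output (EOF) the same way
as a blank line are all assumptions made for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true when the line holds only whitespace. */
static bool
line_is_blank(const char *line)
{
	for (; *line != '\0'; line++)
		if (!isspace((unsigned char)*line))
			return false;
	return true;
}

/*
 * Hypothetical classifier for the result of an exec command.
 * line:        first line of the command output, NULL on EOF (assumption)
 * exit_status: exit code of the command, 0 on success
 * parse_ret:   result of parsing a non-blank line, 0 on success
 */
static int
classify_exec_result(const char *line, int exit_status, int parse_ret)
{
	if (exit_status != 0)
		return -EIO;	/* runtime error (error code is an assumption) */
	if (line == NULL || line_is_blank(line))
		return -ENODEV;	/* "blank line" means "absent device" */
	if (parse_ret != 0)
		return -EINVAL;	/* garbled output (error code is an assumption) */
	return 0;
}

/*
 * Hypothetical policy: during initialization only -ENODEV is tolerated;
 * once initialized, errors are discarded and the command is retried at
 * the next polling time. Returns 0 to keep going, or the error to report.
 */
static int
handle_exec_error(int err, bool initialized)
{
	assert(err <= 0);	/* errors are negative errno values */
	if (err == 0 || err == -ENODEV)
		return 0;
	if (initialized)
		return 0;	/* discard, try again later */
	return err;		/* stop and let the user look into it */
}
```

With this split, a script that prints a blank line and exits with 0 is simply
polled again, while a parsing failure during probe is reported to the
application and the same failure after initialization is retried silently.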