> On July 29, 2020, 11:27 p.m., Qian Zhang wrote: > > src/slave/csi_server.cpp > > Lines 233-235 (patched) > > <https://reviews.apache.org/r/72716/diff/2/?file=2236468#file2236468line233> > > > > I am just curious what would happen if any of the initialization logic > > fail, how will the failure be propogated back? > > Greg Mann wrote: > I updated the server so that now `start()` returns a future associated > with the initialization. > > Qian Zhang wrote: > I see. And I guess `CSIServer::start()` will be called in > `Slave::registered` and `Slave::reregistered`, right? I am just wondering how > we are going to handle the returned future there. Are we going to register an > `onAny` callback and log an error message if it is a failed future? > > Greg Mann wrote: > Yea I think we have to decide how to handle failures of CSI server > initialization. I might propose a timeout in the agent, after which we log an > error? And we could provide a task status message perhaps when task launches > fail because the CSI server failed to initialize? > > In any case, I think the interface offered by the current patch set will > be sufficient to let us handle the failed initialization case, WDYT? > > Qian Zhang wrote: > I took a look at the code of local resource provider daemon and I found > it just log an error message in its `start` method: > > https://github.com/apache/mesos/blob/1.10.0/src/slave/slave.cpp#L1740:L1742 > > https://github.com/apache/mesos/blob/1.10.0/src/resource_provider/daemon.cpp#L188:L191 > > Do you think if we can do the similar? > > Greg Mann wrote: > I was worried about maintaining ordering of calls made before startup, > but as we discussed offline, I think I was being too paranoid :) > > I'm switching the patch to the kind of approach used in the daemon. > > Greg Mann wrote: > lol sorry I replied on the wrong issue :) > > I think we can log an error message as well, but it does also seem > appropriate to me to return a future associated with the startup. We can > decide what to do with that future in the agent, but might as well return it? > > Greg Mann wrote: > I decided not to log an error in the server, I think the proper place to > log would be at the callsite where we handle the future returned by > `start()`, WDYT? > > Qian Zhang wrote: > So do you mean we should log an error in `Slave::registered` and > `Slave::reregistered`? > > Greg Mann wrote: > It depends on how exactly we initialize the CSI server in the agent; I'm > not sure that it would necessarily be in those functions. In any case, since > `start()` returns a Future I don't think we need to log anything here in the > server, unless we think there are multiple classes of failures, some of which > we would want to crash for and some of which we would want to just log?
IMHO we need to call `CSIServer::start` in `Slave::registered` and `Slave::reregistered` (i.e. right after the agent's state is changed to `RUNNING`), do we have any other options? - Qian ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/72716/#review221405 ----------------------------------------------------------- On Aug. 7, 2020, 3 p.m., Greg Mann wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/72716/ > ----------------------------------------------------------- > > (Updated Aug. 7, 2020, 3 p.m.) > > > Review request for mesos, Andrei Budnik and Qian Zhang. > > > Bugs: MESOS-10163 > https://issues.apache.org/jira/browse/MESOS-10163 > > > Repository: mesos > > > Description > ------- > > Added implementation of the CSI server. > > > Diffs > ----- > > src/CMakeLists.txt 4e15e3d99aa2cce2403fe07e762fef2fb4a27dea > src/Makefile.am 447db323875e4cad46000977f4a61600baff8f89 > src/slave/csi_server.hpp 17882e1be5a6c38ca34d7b50d4a6041530e8908c > src/slave/csi_server.cpp PRE-CREATION > > > Diff: https://reviews.apache.org/r/72716/diff/5/ > > > Testing > ------- > > Details at the end of this chain. > > > Thanks, > > Greg Mann > >
