I'm saying that if Aurora launched Hadoop containers, we can no longer use the thermos CLI as a view into what Aurora has scheduled on the host.
Happy to break the conversation down into one about features.

-=Bill

On Mon, Feb 23, 2015 at 3:39 PM, Kevin Sweeney <kevi...@apache.org> wrote:

> I agree with Brian here - the thermos CLI is equivalent to a read-write
> observer. I wonder if we can change this into a discussion about features -
> what features of the CLI are necessary and which are superfluous?
>
> On Mon, Feb 23, 2015 at 3:33 PM, Brian Wickman <wick...@apache.org> wrote:
>
> > I don't see how that follows.
> >
> > Are you saying that if Aurora can start scheduling Hadoop containers,
> > people can't count on their Hadoop tools for debugging?
> >
> > On Mon, Feb 23, 2015 at 2:02 PM, Bill Farner <wfar...@apache.org> wrote:
> >
> > > Perhaps I was unclear - I was saying that the thermos CLI cannot be
> > > counted on as a debugging tool when other executors are in play.
> > >
> > > -=Bill
> > >
> > > On Mon, Feb 23, 2015 at 1:12 PM, Joseph Smith <yasumo...@gmail.com>
> > > wrote:
> > >
> > > > I actually find that the observer fits this use case just as well, and
> > > > is a better interface since the scheduler gives users a link to it on
> > > > each host as well.
> > > >
> > > > I’m looking to decrease build complexity, and removing this would be a
> > > > great victory for that.
> > > >
> > > > > On Feb 18, 2015, at 12:56 PM, Maxim Khutornenko <ma...@apache.org>
> > > > > wrote:
> > > > >
> > > > > Running "thermos status --verbosity=3" gives full thermos task
> > > > > history including the sandbox path and process table contents. This
> > > > > really saves time when trying to get to the failed task details or
> > > > > see what else is running on a host.
> > > > >
> > > > > On Wed, Feb 18, 2015 at 12:04 PM, Bill Farner <wfar...@apache.org>
> > > > > wrote:
> > > > >> Can either of you elaborate on the type of debugging you currently
> > > > >> accomplish with this tool?
> > > > >>
> > > > >> On Wednesday, February 18, 2015, Brian Wickman <wick...@apache.org>
> > > > >> wrote:
> > > > >>
> > > > >>> I agree it is a valuable component. However, I think that until it
> > > > >>> has test coverage we should consider it an unsupported tool. Filed
> > > > >>> AURORA-1131 <https://issues.apache.org/jira/browse/AURORA-1131>.
> > > > >>> This is already on my radar as part of AURORA-1027
> > > > >>> <https://issues.apache.org/jira/browse/AURORA-1027>.
> > > > >>>
> > > > >>> On Wed, Feb 18, 2015 at 9:19 AM, Maxim Khutornenko
> > > > >>> <ma...@apache.org> wrote:
> > > > >>>
> > > > >>>>> Moving parts should either provide value or be obliterated from
> > > > >>>>> our source tree.
> > > > >>>>
> > > > >>>> I generally agree. In this particular case it's still unclear to
> > > > >>>> me - in the absence of Thermos CLI and Observer, how do we conduct
> > > > >>>> live site executor/thermos troubleshooting?
> > > > >>>>
> > > > >>>> On Tue, Feb 17, 2015 at 7:45 PM, Bill Farner <wfar...@apache.org>
> > > > >>>> wrote:
> > > > >>>>>>
> > > > >>>>>> I think we would be better served by advertising it as an
> > > > >>>>>> optional component that provides operators and users with
> > > > >>>>>> debugging ability.
> > > > >>>>>
> > > > >>>>> Slightly tangential discussion, but I think we should be very
> > > > >>>>> skeptical of fringe components. Moving parts should either
> > > > >>>>> provide value or be obliterated from our source tree.
> > > > >>>>>
> > > > >>>>> -=Bill
> > > > >>>>>
> > > > >>>>> On Tue, Feb 17, 2015 at 6:51 PM, Zameer Manji <zma...@apache.org>
> > > > >>>>> wrote:
> > > > >>>>>
> > > > >>>>>> One thing I would like to point out is the thermos CLI is not
> > > > >>>>>> required for Aurora operation. I think we would be better served
> > > > >>>>>> by advertising it as an optional component that provides
> > > > >>>>>> operators and users with debugging ability.
> > > > >>>>>>
> > > > >>>>>> On Tue, Feb 17, 2015 at 6:38 PM, Joseph Smith
> > > > >>>>>> <yasumo...@gmail.com> wrote:
> > > > >>>>>>
> > > > >>>>>>> I believe it absolutely is - ideally as we deprecate the
> > > > >>>>>>> Observer, we can then lean on the Mesos Slave for this
> > > > >>>>>>> information instead. This will further decrease the number of
> > > > >>>>>>> moving pieces, simplifying the operation of an Aurora/Mesos
> > > > >>>>>>> cluster.
> > > > >>>>>>>
> > > > >>>>>>>> On Feb 17, 2015, at 6:33 PM, Zameer Manji <zma...@apache.org>
> > > > >>>>>>>> wrote:
> > > > >>>>>>>>
> > > > >>>>>>>> Joe,
> > > > >>>>>>>>
> > > > >>>>>>>> If I understand Brian's proposal correctly
> > > > >>>>>>>> <http://mail-archives.apache.org/mod_mbox/aurora-dev/201501.mbox/%3CCAFTdr0DZvH21tR=NLK0qP-Y9-oL9SyULy6GLah=capuw0sv...@mail.gmail.com%3E>,
> > > > >>>>>>>> we are going to deprecate the Observer. This combined with your
> > > > >>>>>>>> proposal will make the executor the only component that can
> > > > >>>>>>>> read the thermos checkpoints and produce some output that is
> > > > >>>>>>>> human readable. Is that something we want to do?
> > > > >>>>>>>>
> > > > >>>>>>>> On Tue, Feb 17, 2015 at 6:26 PM, Joseph Smith
> > > > >>>>>>>> <yasumo...@gmail.com> wrote:
> > > > >>>>>>>>
> > > > >>>>>>>>> Hi everyone,
> > > > >>>>>>>>>
> > > > >>>>>>>>> After reviewing the functionality offered by the Thermos
> > > > >>>>>>>>> Commandline tool vs. what’s exported via the Thermos Observer,
> > > > >>>>>>>>> I was hoping to bring up a question I had:
> > > > >>>>>>>>>
> > > > >>>>>>>>> Can we deprecate the Thermos CLI?
> > > > >>>>>>>>>
> > > > >>>>>>>>> Removing this would decrease the number of components required
> > > > >>>>>>>>> for a functional Aurora installation (a huge victory, in my
> > > > >>>>>>>>> opinion) and also enable the Observer to fully take over the
> > > > >>>>>>>>> duty of providing visibility into what’s running on a host. In
> > > > >>>>>>>>> addition, maintenance is performed via the HostMaintenance API
> > > > >>>>>>>>> <https://github.com/apache/incubator-aurora/blob/master/src/main/python/apache/aurora/admin/host_maintenance.py#L26>
> > > > >>>>>>>>> and should not be done using thermos kill, which would cause
> > > > >>>>>>>>> LOST tasks.
> > > > >>>>>>>>>
> > > > >>>>>>>>> That said, removing this tool makes it much more difficult for
> > > > >>>>>>>>> Thermos to be used as a monit <http://mmonit.com/monit/>
> > > > >>>>>>>>> replacement, which is actually rather feasible now. In
> > > > >>>>>>>>> addition, it also forces people to remember + learn the port
> > > > >>>>>>>>> the Observer is running on in order to get information about
> > > > >>>>>>>>> tasks.
> > > > >>>>>>>>>
> > > > >>>>>>>>> Any thoughts and opinions would be much appreciated!
> > > > >>>>>>>>>
> > > > >>>>>>>>> Thanks!
> > > > >>>>>>>>> Joe
> > > > >>>>>>>>>
> > > > >>>>>>>>> --
> > > > >>>>>>>>> Zameer Manji
> > > > >>>>>>>
> > > > >>>>>>> --
> > > > >>>>>>> Zameer Manji
> > > > >>
> > > > >> --
> > > > >> -=Bill