I agree with Brian here - the thermos CLI is equivalent to a read-write observer. I wonder if we can turn this into a discussion about features: which features of the CLI are necessary, and which are superfluous?

On Mon, Feb 23, 2015 at 3:33 PM, Brian Wickman <wick...@apache.org> wrote:

> I don't see how that follows.
>
> Are you saying that if Aurora can start scheduling Hadoop containers,
> people can't count on their Hadoop tools for debugging?
>
> On Mon, Feb 23, 2015 at 2:02 PM, Bill Farner <wfar...@apache.org> wrote:
>
> > Perhaps I was unclear - I was saying that the thermos CLI cannot be
> > counted on as a debugging tool when other executors are in play.
> >
> > -=Bill
> >
> > On Mon, Feb 23, 2015 at 1:12 PM, Joseph Smith <yasumo...@gmail.com> wrote:
> >
> > > I actually find that the observer fits this use case just as well, and
> > > is a better interface since the scheduler gives users a link to it on
> > > each host as well.
> > >
> > > I’m looking to decrease build complexity, and removing this would be a
> > > great victory for that.
> > >
> > > > On Feb 18, 2015, at 12:56 PM, Maxim Khutornenko <ma...@apache.org> wrote:
> > > >
> > > > Running "thermos status --verbosity=3" gives full thermos task history
> > > > including the sandbox path and process table contents. This really
> > > > saves time when trying to get to the failed task details or see what
> > > > else is running on a host.
> > > >
> > > > On Wed, Feb 18, 2015 at 12:04 PM, Bill Farner <wfar...@apache.org> wrote:
> > > >> Can either of you elaborate on the type of debugging you currently
> > > >> accomplish with this tool?
> > > >>
> > > >> On Wednesday, February 18, 2015, Brian Wickman <wick...@apache.org> wrote:
> > > >>
> > > >>> I agree it is a valuable component. However, I think that until it
> > > >>> has test coverage we should consider it an unsupported tool. Filed
> > > >>> AURORA-1131 <https://issues.apache.org/jira/browse/AURORA-1131>.
> > > >>> This is already on my radar as part of AURORA-1027
> > > >>> <https://issues.apache.org/jira/browse/AURORA-1027>.
> > > >>>
> > > >>> On Wed, Feb 18, 2015 at 9:19 AM, Maxim Khutornenko <ma...@apache.org> wrote:
> > > >>>
> > > >>>>> Moving parts should either provide value or be obliterated from
> > > >>>>> our source tree.
> > > >>>>
> > > >>>> I generally agree. In this particular case it's still unclear to
> > > >>>> me - in the absence of Thermos CLI and Observer, how do we conduct
> > > >>>> live site executor/thermos troubleshooting?
> > > >>>>
> > > >>>> On Tue, Feb 17, 2015 at 7:45 PM, Bill Farner <wfar...@apache.org> wrote:
> > > >>>>>>
> > > >>>>>> I think we would be better served by advertising it as an
> > > >>>>>> optional component that provides operators and users with
> > > >>>>>> debugging ability.
> > > >>>>>
> > > >>>>> Slightly tangential discussion, but I think we should be very
> > > >>>>> skeptical of fringe components. Moving parts should either provide
> > > >>>>> value or be obliterated from our source tree.
> > > >>>>>
> > > >>>>> -=Bill
> > > >>>>>
> > > >>>>> On Tue, Feb 17, 2015 at 6:51 PM, Zameer Manji <zma...@apache.org> wrote:
> > > >>>>>
> > > >>>>>> One thing I would like to point out is the thermos CLI is not
> > > >>>>>> required for Aurora operation. I think we would be better served
> > > >>>>>> by advertising it as an optional component that provides
> > > >>>>>> operators and users with debugging ability.
> > > >>>>>>
> > > >>>>>> On Tue, Feb 17, 2015 at 6:38 PM, Joseph Smith <yasumo...@gmail.com> wrote:
> > > >>>>>>
> > > >>>>>>> I believe it absolutely is - ideally, as we deprecate the
> > > >>>>>>> Observer, we can then lean on the Mesos Slave for this
> > > >>>>>>> information instead. This will further decrease the number of
> > > >>>>>>> moving pieces, simplifying the operation of an Aurora/Mesos
> > > >>>>>>> cluster.
> > > >>>>>>>
> > > >>>>>>>> On Feb 17, 2015, at 6:33 PM, Zameer Manji <zma...@apache.org> wrote:
> > > >>>>>>>>
> > > >>>>>>>> Joe,
> > > >>>>>>>>
> > > >>>>>>>> If I understand Brian's proposal correctly
> > > >>>>>>>> <http://mail-archives.apache.org/mod_mbox/aurora-dev/201501.mbox/%3CCAFTdr0DZvH21tR=NLK0qP-Y9-oL9SyULy6GLah=capuw0sv...@mail.gmail.com%3E>,
> > > >>>>>>>> we are going to deprecate the Observer. This, combined with
> > > >>>>>>>> your proposal, will make the executor the only component that
> > > >>>>>>>> can read the thermos checkpoints and produce some output that
> > > >>>>>>>> is human readable. Is that something we want to do?
> > > >>>>>>>>
> > > >>>>>>>> On Tue, Feb 17, 2015 at 6:26 PM, Joseph Smith <yasumo...@gmail.com> wrote:
> > > >>>>>>>>
> > > >>>>>>>>> Hi everyone,
> > > >>>>>>>>>
> > > >>>>>>>>> After reviewing the functionality offered by the Thermos
> > > >>>>>>>>> Commandline tool vs. what’s exported via the Thermos Observer,
> > > >>>>>>>>> I was hoping to bring up a question I had:
> > > >>>>>>>>>
> > > >>>>>>>>> Can we deprecate the Thermos CLI?
> > > >>>>>>>>>
> > > >>>>>>>>> Removing this would decrease the number of components required
> > > >>>>>>>>> for a functional Aurora installation (a huge victory, in my
> > > >>>>>>>>> opinion) and also enable the Observer to fully take over the
> > > >>>>>>>>> duty of providing visibility into what’s running on a host. In
> > > >>>>>>>>> addition, maintenance is performed via the HostMaintenance API
> > > >>>>>>>>> <https://github.com/apache/incubator-aurora/blob/master/src/main/python/apache/aurora/admin/host_maintenance.py#L26>
> > > >>>>>>>>> and should not be done using thermos kill, which would cause
> > > >>>>>>>>> LOST tasks.
> > > >>>>>>>>>
> > > >>>>>>>>> That said, removing this tool makes it much more difficult for
> > > >>>>>>>>> Thermos to be used as a monit <http://mmonit.com/monit/>
> > > >>>>>>>>> replacement, which is actually rather feasible now. In
> > > >>>>>>>>> addition, it also forces people to remember + learn the port
> > > >>>>>>>>> the Observer is running on in order to get information about
> > > >>>>>>>>> tasks.
> > > >>>>>>>>>
> > > >>>>>>>>> Any thoughts and opinions would be much appreciated!
> > > >>>>>>>>>
> > > >>>>>>>>> Thanks!
> > > >>>>>>>>> Joe
> > > >>>>>>>>>
> > > >>>>>>>>> --
> > > >>>>>>>>> Zameer Manji
> > > >>>>>>>
> > > >>>>>>> --
> > > >>>>>>> Zameer Manji
> > > >>
> > > >> --
> > > >> -=Bill