I actually find that the observer fits this use case just as well, and is a better interface, since the scheduler also gives users a link to it on each host.
I’m looking to decrease build complexity, and removing this would be a great victory for that.

> On Feb 18, 2015, at 12:56 PM, Maxim Khutornenko <ma...@apache.org> wrote:
>
> Running "thermos status --verbosity=3" gives full thermos task history
> including the sandbox path and process table contents. This really
> saves time when trying to get to the failed task details or see what
> else is running on a host.
>
> On Wed, Feb 18, 2015 at 12:04 PM, Bill Farner <wfar...@apache.org> wrote:
>> Can either of you elaborate on the type of debugging you currently
>> accomplish with this tool?
>>
>> On Wednesday, February 18, 2015, Brian Wickman <wick...@apache.org> wrote:
>>
>>> I agree it is a valuable component. However, I think that until it has
>>> test coverage we should consider it an unsupported tool. Filed AURORA-1131
>>> <https://issues.apache.org/jira/browse/AURORA-1131>. This is already on my
>>> radar as part of AURORA-1027
>>> <https://issues.apache.org/jira/browse/AURORA-1027>.
>>>
>>> On Wed, Feb 18, 2015 at 9:19 AM, Maxim Khutornenko <ma...@apache.org> wrote:
>>>
>>>>> Moving parts should either provide value or be obliterated from our
>>>>> source tree.
>>>>
>>>> I generally agree. In this particular case it's still unclear to me -
>>>> in the absence of Thermos CLI and Observer, how do we conduct live
>>>> site executor/thermos troubleshooting?
>>>>
>>>> On Tue, Feb 17, 2015 at 7:45 PM, Bill Farner <wfar...@apache.org> wrote:
>>>>
>>>>>> I think we would be better served by advertising it as an
>>>>>> optional component that provides operators and users with debugging
>>>>>> ability.
>>>>>
>>>>> Slightly tangential discussion, but I think we should be very skeptical of
>>>>> fringe components. Moving parts should either provide value or be
>>>>> obliterated from our source tree.
>>>>>
>>>>> -=Bill
>>>>>
>>>>> On Tue, Feb 17, 2015 at 6:51 PM, Zameer Manji <zma...@apache.org> wrote:
>>>>>
>>>>>> One thing I would like to point out is the thermos CLI is not required for
>>>>>> Aurora operation. I think we would be better served by advertising it as an
>>>>>> optional component that provides operators and users with debugging
>>>>>> ability.
>>>>>>
>>>>>> On Tue, Feb 17, 2015 at 6:38 PM, Joseph Smith <yasumo...@gmail.com> wrote:
>>>>>>
>>>>>>> I believe it absolutely is. Ideally, as we deprecate the Observer, we can
>>>>>>> then lean on the Mesos Slave for this information instead. This will
>>>>>>> further decrease the number of moving pieces, simplifying the operation of
>>>>>>> an Aurora/Mesos cluster.
>>>>>>>
>>>>>>>> On Feb 17, 2015, at 6:33 PM, Zameer Manji <zma...@apache.org> wrote:
>>>>>>>>
>>>>>>>> Joe,
>>>>>>>>
>>>>>>>> If I understand Brian's proposal correctly
>>>>>>>> <http://mail-archives.apache.org/mod_mbox/aurora-dev/201501.mbox/%3CCAFTdr0DZvH21tR=NLK0qP-Y9-oL9SyULy6GLah=capuw0sv...@mail.gmail.com%3E>,
>>>>>>>> we are going to deprecate the Observer. This combined with your proposal
>>>>>>>> will make the executor the only component that can read the thermos
>>>>>>>> checkpoints and produce some output that is human readable. Is that
>>>>>>>> something we want to do?
>>>>>>>>
>>>>>>>> On Tue, Feb 17, 2015 at 6:26 PM, Joseph Smith <yasumo...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi everyone,
>>>>>>>>>
>>>>>>>>> After reviewing the functionality offered by the Thermos Commandline tool
>>>>>>>>> vs. what’s exported via the Thermos Observer, I was hoping to bring up a
>>>>>>>>> question I had:
>>>>>>>>>
>>>>>>>>> Can we deprecate the Thermos CLI?
>>>>>>>>>
>>>>>>>>> Removing this would decrease the number of components required for a
>>>>>>>>> functional Aurora installation (a huge victory, in my opinion) and also
>>>>>>>>> enable the Observer to fully take over the duty of providing visibility
>>>>>>>>> into what’s running on a host. In addition, maintenance is performed via
>>>>>>>>> the HostMaintenance API
>>>>>>>>> <https://github.com/apache/incubator-aurora/blob/master/src/main/python/apache/aurora/admin/host_maintenance.py#L26>
>>>>>>>>> and should not be done using thermos kill, which would cause LOST tasks.
>>>>>>>>>
>>>>>>>>> That said, removing this tool makes it much more difficult for Thermos to
>>>>>>>>> be used as a monit <http://mmonit.com/monit/> replacement, which is
>>>>>>>>> actually rather feasible now. In addition, it also forces people to
>>>>>>>>> remember + learn the port the Observer is running on in order to get
>>>>>>>>> information about tasks.
>>>>>>>>>
>>>>>>>>> Any thoughts and opinions would be much appreciated!
>>>>>>>>>
>>>>>>>>> Thanks!
>>>>>>>>> Joe
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Zameer Manji
>>>>>>>
>>>>>>> --
>>>>>>> Zameer Manji
>>
>> --
>> -=Bill