Great question, David! Aurora does indeed preserve some history, though the mechanism is non-obvious. The management of history is mostly done in HistoryPruner [1], with command line knobs defined in AsyncModule [2]. This feature might meet some, but maybe not all, of your requirements.
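To make the idea of retention-based history concrete, here is a minimal sketch of pruning terminated tasks. It is not Aurora's HistoryPruner code: the `Task` fields, the `prune` function, and the two knobs (`max_age_secs`, `max_per_job`) are hypothetical names chosen for illustration, and the real flags and behavior should be checked in [1] and [2].

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical stand-in for a stored task instance."""
    task_id: str
    job_key: str
    terminal: bool      # True once the task has finished, failed, or been killed
    finished_at: float  # seconds since epoch; ignored for active tasks


def prune(tasks, now, max_age_secs, max_per_job):
    """Return the tasks to keep under two assumed retention knobs.

    Active tasks are always kept. A terminated task is kept only if it is
    younger than max_age_secs and among the max_per_job newest for its job.
    """
    keep = [t for t in tasks if not t.terminal]
    by_job = defaultdict(list)
    for t in tasks:
        if t.terminal and now - t.finished_at <= max_age_secs:
            by_job[t.job_key].append(t)
    for job_tasks in by_job.values():
        # Newest terminated tasks first, then cap the count per job.
        job_tasks.sort(key=lambda t: t.finished_at, reverse=True)
        keep.extend(job_tasks[:max_per_job])
    return keep
```

Under a policy like this, history is bounded per job and by age, which is why it may cover recent debugging but not long-term auditing.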
The class naming sent you to the obvious place: MemJobStore. As it turns out, though, that's actually only storing cron jobs (this relates to an abstraction that never really panned out). Regular jobs are translated from JobConfiguration [3] objects into independent ScheduledTasks [4] representing the instances. These tasks, in turn, are stored in MemTaskStore [5], which is agnostic to states of tasks (aside from query matching). Note: we do have interest in making the data structure arrangement more natural in AURORA-106 [6].

That said, we have kicked around the idea of exposing state mutations to an external log/queue, but our use cases so far have required stronger consistency than we felt we could achieve with that. I wouldn't turn down a discussion about if/how we approach that.

I hope that answers your questions, feel free to ask follow-ups!

Cheers!

-=Bill

[1] https://github.com/apache/incubator-aurora/blob/master/src/main/java/org/apache/aurora/scheduler/async/HistoryPruner.java
[2] https://github.com/apache/incubator-aurora/blob/master/src/main/java/org/apache/aurora/scheduler/async/AsyncModule.java#L100
[3] https://github.com/apache/incubator-aurora/blob/master/src/main/thrift/org/apache/aurora/gen/api.thrift#L191-210
[4] https://github.com/apache/incubator-aurora/blob/master/src/main/thrift/org/apache/aurora/gen/api.thrift#L355-365
[5] https://github.com/apache/incubator-aurora/blob/master/src/main/java/org/apache/aurora/scheduler/storage/mem/MemTaskStore.java
[6] https://issues.apache.org/jira/browse/AURORA-106

On Thu, Mar 27, 2014 at 8:53 AM, David Siegel <dsie...@knewton.com> wrote:

> Hello Aurorans,
>
> Please enlighten me.
>
> I think job history is a critical feature for Aurora.
>
> A. Do you agree?
>
> B. Is this feature secretly already in Aurora?
>
> C. If not, is this on your roadmap?
>
> D. Would you be interested in a patch or patches that adds job history to
> Aurora?
>
> Below I discuss why I think this is an important feature and some thoughts
> on an implementation.
>
> Job history has a number of uses:
>
> 1. Debugging production issues after the job has been updated. I may need
> to know the exact configuration of a system at a previous point in time in
> order to debug an issue.
>
> 2. Rolling back to a previous job configuration after a bad release.
>
> How I think Aurora works:
>
> As far as I can tell from the Aurora source, job history is discarded. The
> MemJobStore replaces Job entries when a job is updated, so you lose the old
> Job configuration. The log is truncated every time a Snapshot is taken and
> the snapshots do not contain job history.
>
> This seems like a sound decision given that the job history will grow
> forever, but means there's no history we can really audit.
>
> How job history might work:
>
> Instead of building job history into the scheduler one might write an
> independent process that consumed the logs generated by the scheduler and
> built up a database of job history information. It would then provide a
> REST interface for querying the job history. This would keep the scheduler
> free from dealing with job history.
>
> Any feedback is appreciated. Thanks.
>
> -David Siegel
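
As a rough illustration of the external history process proposed in the quoted message, the sketch below keeps every job configuration it is told about and answers "what was the configuration at time T?". Everything here is an assumption for illustration: `JobHistory`, `record`, `config_at`, the job key format, and the dict-shaped configs are hypothetical, and how update events would actually be read from the scheduler is left out entirely.

```python
import bisect
from collections import defaultdict


class JobHistory:
    """Append-only, in-memory history of job configurations (hypothetical)."""

    def __init__(self):
        self._times = defaultdict(list)    # job_key -> sorted update timestamps
        self._configs = defaultdict(list)  # job_key -> configs, same order

    def record(self, job_key, timestamp, config):
        # Assumes updates for a given job arrive in timestamp order.
        self._times[job_key].append(timestamp)
        self._configs[job_key].append(config)

    def config_at(self, job_key, timestamp):
        """Return the config in effect at `timestamp`, or None if none yet."""
        times = self._times.get(job_key, [])
        i = bisect.bisect_right(times, timestamp)
        return self._configs[job_key][i - 1] if i else None


history = JobHistory()
history.record("www/prod/hello", 10, {"cpus": 1})
history.record("www/prod/hello", 20, {"cpus": 2})
previous = history.config_at("www/prod/hello", 15)  # config before the update at t=20
```

A lookup like `config_at` covers both uses listed in the message: inspecting the exact configuration at a past point in time, and retrieving an earlier configuration to roll back to. A real service would need durable storage and a query interface on top.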