Re: monitoring aurora scheduler

2014-10-01 Thread Isaac Councill
Thanks! Comment dropped on AURORA-634. As for the error I encountered, I saw "Storage is not READY" exceptions on all scheduler instances, and no leader was elected. Nothing other than that jumped out as unusual in the logs - no ZK_* warnings/errors etc. Aurora came up before zookeeper, but auror

Re: monitoring aurora scheduler

2014-10-01 Thread Bill Farner
Ok, when you have bandwidth to upgrade again feel free to let us know if you would like somebody standing by in IRC to assist. -=Bill On Wed, Oct 1, 2014 at 11:04 AM, Isaac Councill wrote: > Thanks! Comment dropped on AURORA-634. > > As for the error I encountered, I saw "Storage is not READY"

Re: monitoring aurora scheduler

2014-10-01 Thread Isaac Councill
Much appreciated. On Wed, Oct 1, 2014 at 2:11 PM, Bill Farner wrote: > Ok, when you have bandwidth to upgrade again feel free to let us know if > you would like somebody standing by in IRC to assist. > > -=Bill > > On Wed, Oct 1, 2014 at 11:04 AM, Isaac Councill wrote: > > > Thanks! Comment dro

Build failed in Jenkins: Aurora #608

2014-10-01 Thread Apache Jenkins Server
See Changes: [wfarner] Add a monitoring guide. -- [...truncated 247 lines...] twitter.common.python.http: Crawling /home/jenkins/.pex/build: 0.6ms twitter.common.python.http: Crawling https://pypi.python.

Re: Build failed in Jenkins: Aurora #608

2014-10-01 Thread Zameer Manji
Yet another build failure because pants is unable to download a package within a deadline. What can we do to make this less flaky? On Wed, Oct 1, 2014 at 12:38 PM, Apache Jenkins Server < jenk...@builds.apache.org> wrote: > See > > Changes: > >

Re: Build failed in Jenkins: Aurora #608

2014-10-01 Thread John Sirois
In particular it's a connect timeout. You use: $ ./pants src/test/python:all -vxs You could try: $ ./pants build --timeout=[timeout secs] src/test/python:all -vxs After that you have setting up & using a local PyPI mirror and contributing to pants / pex to support retries - whether that's filing
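
As a rough illustration of the "support retries" idea mentioned above, here is a minimal sketch of retrying a flaky network call with exponential backoff. It is not pants or pex code; the helper name `retry` and its parameters are hypothetical and chosen only for this example.

```python
import time


def retry(fn, attempts=3, base_delay=1.0, exceptions=(OSError,), sleep=time.sleep):
    """Call fn() up to `attempts` times, backing off exponentially between tries.

    Hypothetical helper for illustration only. Connect timeouts raised by the
    standard library (socket.timeout, urllib.error.URLError) are OSError
    subclasses, so they are retried by default.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except exceptions:
            # Give up and re-raise once the final attempt has failed.
            if attempt == attempts - 1:
                raise
            # Wait base_delay, 2 * base_delay, 4 * base_delay, ... seconds.
            sleep(base_delay * 2 ** attempt)
```

A download step could then be wrapped as `retry(lambda: download(url))`, where `download` stands for whatever call actually performs the fetch.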

Monitoring capacity headroom

2014-10-01 Thread Josh Adams
Hi there, Is there a way to ask the Scheduler for a current headroom estimate (in number of potential task "slots" available) for a given TaskConfig? I understand that I could do this manually by processing /state.json on the Mesos master and joining that with info from a getJobsResult, but I'd r

Re: Monitoring capacity headroom

2014-10-01 Thread Kevin Sweeney
There is (currently undocumented), under the scheduler's /vars or /vars.json endpoint: empty_slots_*. See https://github.com/apache/incubator-aurora/blob/master/src/main/java/org/apache/aurora/scheduler/stats/SlotSizeCounter.java The constants are defined here: https://github.com/apache/incubator-
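
A minimal sketch of reading those stats, assuming that /vars.json returns a flat JSON object mapping stat names to values (an assumption; the thread does not show the response format). The function names and the scheduler URL passed in are hypothetical.

```python
import json
from urllib.request import urlopen


def empty_slot_stats(stats):
    """Keep only the entries whose names start with 'empty_slots_'."""
    return {name: value for name, value in stats.items()
            if name.startswith("empty_slots_")}


def fetch_empty_slots(scheduler_url, timeout=10):
    """Fetch /vars.json from the scheduler and return the empty_slots_* stats.

    Assumes the endpoint returns a flat JSON object; adjust the parsing if
    the actual response is shaped differently.
    """
    url = scheduler_url.rstrip("/") + "/vars.json"
    with urlopen(url, timeout=timeout) as resp:
        return empty_slot_stats(json.load(resp))
```

Calling `fetch_empty_slots("http://scheduler.example.com:8081")` (a placeholder address) would return a dictionary that a monitoring job could graph or alert on.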

Re: Monitoring capacity headroom

2014-10-01 Thread Josh Adams
Hey Kevin, thanks for the fast reply. Currently we're using scheduling constraints for dedicated resources (which we'd like to convert to the actual "dedicated" resource feature once it's documented). These available slot numbers are useful at a high level but unfortunately not detailed enough for

Re: Monitoring capacity headroom

2014-10-01 Thread Kevin Sweeney
Hi Josh, Those numbers are only rough estimates, based on ideal T-shirt-sized instances that can be placed anywhere in the cluster. In practice we've found this to be sufficient for high-level monitoring, combined with the error messages the scheduler generates for pending tasks of individual serv

Re: Monitoring capacity headroom

2014-10-01 Thread Josh Adams
Yep, makes sense. We will probably be able to rely on these coarse metrics once we homogenize more of our task and slave configurations. We'll definitely JIRA if that doesn't work out though! Best, Josh On Wed, Oct 1, 2014 at 3:04 PM, Kevin Sweeney wrote: > Hi Josh, > > Those numbers are only r

Proposal: remove v1 client as part of 0.6.0 release

2014-10-01 Thread Bill Farner
I've just created AURORA-775 [1] to track removal of the v1 client build, as a precursor to deleting the code. Given that we are close to a release, does anyone have an opinion on whether this should happen before the 0.6.0 release? I think the impact should be minimal, this will be more about en

Re: Proposal: remove v1 client as part of 0.6.0 release

2014-10-01 Thread Kevin Sweeney
+1 to complete removal On Wed, Oct 1, 2014 at 3:26 PM, Bill Farner wrote: > I've just created AURORA-775 [1] to track removal of the v1 client build, > as a precursor to deleting the code. Given that we are close to a release, > does anyone have an opinion on whether this should happen before t

Re: Proposal: remove v1 client as part of 0.6.0 release

2014-10-01 Thread David McLaughlin
+1 to complete removal. The plan to phase out the client also sounds good to me. On Wed, Oct 1, 2014 at 3:38 PM, Kevin Sweeney wrote: > +1 to complete removal > > On Wed, Oct 1, 2014 at 3:26 PM, Bill Farner wrote: > > > I've just created AURORA-775 [1] to track removal of the v1 client build,

Re: Proposal: remove v1 client as part of 0.6.0 release

2014-10-01 Thread Zameer Manji
+1 to complete removal. The plan is acceptable as well. On Wed, Oct 1, 2014 at 3:38 PM, Kevin Sweeney wrote: > +1 to complete removal > > On Wed, Oct 1, 2014 at 3:26 PM, Bill Farner wrote: > > > I've just created AURORA-775 [1] to track removal of the v1 client build, > > as a precursor to dele

Re: Monitoring capacity headroom

2014-10-01 Thread Bill Farner
Also side-note: in case you're not following AURORA-703 [1], there's a draft [2] of dedicated machines documentation up. [1] https://issues.apache.org/jira/browse/AURORA-703 [2] https://reviews.apache.org/r/26244/ -=Bill On Wed, Oct 1, 2014 at 3:07 PM, Josh Adams wrote: > Yep, makes sense. We

Running Aurora in Debug Mode on Vagrant

2014-10-01 Thread David Pan
Hi, I was wondering if there is a way to run Aurora in debug mode locally on vagrant. Specifically, I want to put breakpoints in the health checker in Aurora executor. Thanks, David Pan

Re: Proposal: remove v1 client as part of 0.6.0 release

2014-10-01 Thread Jake Farrell
Sounds good, +1 -Jake On Wed, Oct 1, 2014 at 6:26 PM, Bill Farner wrote: > I've just created AURORA-775 [1] to track removal of the v1 client build, > as a precursor to deleting the code. Given that we are close to a release, > does anyone have an opinion on whether this should happen before t

Re: Running Aurora in Debug Mode on Vagrant

2014-10-01 Thread Kevin Sweeney
Debug logging is likely to be your best bet here. That is: liberal use of log.debug and making sure the executor is started with --log_to_stderr=google:DEBUG On Wed, Oct 1, 2014 at 5:31 PM, David Pan wrote: > Hi, > > I was wondering if there is a way to run Aurora in debug mode locally on > vagr

Re: Running Aurora in Debug Mode on Vagrant

2014-10-01 Thread Kevin Sweeney
Outside the vagrant environment you can use pdb (add a line like import pdb; pdb.set_trace() at the line you want a breakpoint). On Wed, Oct 1, 2014 at 6:07 PM, Kevin Sweeney wrote: > Debug logging is likely to be your best bet here. That is: liberal
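
A minimal sketch of the pdb approach described above. The `check_health` function is a hypothetical stand-in, not the Aurora executor's health checker, and the `DEBUG_BREAK` environment variable is an assumption added here so the breakpoint only fires when explicitly requested.

```python
import os
import pdb


def check_health(status_code):
    """Hypothetical health check used only to show where a breakpoint goes."""
    if os.environ.get("DEBUG_BREAK") == "1":
        # Execution pauses here; inspect status_code, then type 'c' to continue.
        pdb.set_trace()
    return status_code == 200
```

Running the script with `DEBUG_BREAK=1` in an interactive terminal drops into the debugger at that line; without the variable the function runs normally.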

Re: Proposal: remove v1 client as part of 0.6.0 release

2014-10-01 Thread Bill Farner
Thanks, everyone. I've made AURORA-775 a blocker for the 0.6.0 release. -=Bill On Wed, Oct 1, 2014 at 5:48 PM, Jake Farrell wrote: > Sounds good, +1 > > -Jake > > On Wed, Oct 1, 2014 at 6:26 PM, Bill Farner wrote: > > > I've just created AURORA-775 [1] to track removal of the v1 client build,

Jenkins build is back to normal : Aurora #609

2014-10-01 Thread Apache Jenkins Server
See