I've taken on the ticket and have a fix posted, hopefully to be committed today.
-=Bill

On Wed, Jul 16, 2014 at 12:21 PM, Josh Adams <j...@foursquare.com> wrote:

> +Leo Kim who is looking at the compiler error with us.
>
> On Wed, Jul 16, 2014 at 8:25 AM, Kevin Burg <kb...@foursquare.com> wrote:
>
> > The idea with the fix is to read the slave's attributes right off the
> > offer rather than going into 'AttributeStore' and keying on the slave's
> > name. The slave's resources are read off the offer in this way, so I
> > don't see why it can't be done with attributes as well.
> >
> > Someone who understands all the places where SchedulingFilter.filter is
> > used might be able to fix this better than I can.
> >
> > On Wed, Jul 16, 2014 at 6:40 AM, Josh Adams <j...@foursquare.com> wrote:
> >
> >> Hi there,
> >>
> >> Given that we would need to disrupt running jobs to add constraints in
> >> the future we are blocking on
> >> https://issues.apache.org/jira/browse/AURORA-582 before we can push any
> >> of our services on to Aurora in production.
> >>
> >> Kevin Burg attempted to resolve the related bug
> >> https://issues.apache.org/jira/browse/AURORA-328 by making some changes
> >> here:
> >> https://github.com/foursquare/incubator-aurora/commit/b1962fad3fe9ef76954fa107abed25d78b809331
> >> but we seem to be getting a type mismatch when compiling the code.
> >>
> >> Any help and/or info on the bugfix progress would be much appreciated.
> >> Aside from AURORA-582 we are ready to roll (pun intended!)
> >>
> >> Best,
> >> Josh
> >>
> >> On Mon, Jul 14, 2014 at 11:42 AM, Josh Adams <j...@foursquare.com> wrote:
> >>
> >>> Ah, makes sense. We'll try that. Thanks for clarifying this, Kevin.
> >>>
> >>> Josh
> >>>
> >>> On Mon, Jul 14, 2014 at 11:30 AM, Kevin Sweeney <kevi...@apache.org> wrote:
> >>>
> >>>> Slaves persist their metadata (including attributes) across restarts
> >>>> due to slave recovery (that's what allows you to upgrade mesos in-place
> >>>> without killing the tasks they're managing). Unfortunately, to change
> >>>> attributes you need to remove persisted slave metadata (the "meta"
> >>>> directory). This will kill all of a slave's underlying tasks but the
> >>>> newly registered slave should have the correct attributes.
> >>>>
> >>>> On Mon, Jul 14, 2014 at 11:26 AM, Kevin Burg <kb...@foursquare.com> wrote:
> >>>>
> >>>>> I've confirmed by looking at that endpoint that new attributes are
> >>>>> not being picked up and modified attributes are retaining their old
> >>>>> values. This is after restarting both the slaves and the scheduler
> >>>>> process.
> >>>>>
> >>>>> On Mon, Jul 14, 2014 at 11:09 AM, Josh Adams <j...@foursquare.com> wrote:
> >>>>>
> >>>>> > Thanks Brian. Kevin should have some followup questions shortly.
> >>>>> >
> >>>>> > Josh
> >>>>> >
> >>>>> > On Mon, Jul 14, 2014 at 10:37 AM, Brian Wickman <wick...@apache.org> wrote:
> >>>>> >
> >>>>> >> host/rack should not be treated specially.
> >>>>> >>
> >>>>> >> If you go to the "/slaves" endpoint on the scheduler UI, what does
> >>>>> >> it report as attributes being exported by your slaves? You might
> >>>>> >> want to validate there that the "staging" attribute got picked up
> >>>>> >> properly. If it's not getting picked up (e.g. the attributes are
> >>>>> >> getting cached incorrectly by the scheduler?) then you should file
> >>>>> >> an issue.
> >>>>> >>
> >>>>> >> On Fri, Jul 11, 2014 at 5:24 PM, Kevin Burg <kb...@foursquare.com> wrote:
> >>>>> >>
> >>>>> >>> Hi,
> >>>>> >>>
> >>>>> >>> I'm having trouble getting the task constraint resolver to work
> >>>>> >>> with attributes other than 'host' and 'rack.' Are arbitrary
> >>>>> >>> attribute keys in the mesos slaves supported currently?
> >>>>> >>>
> >>>>> >>> Here is the setup.
> >>>>> >>>
> >>>>> >>> The slaves are configured to run with
> >>>>> >>> `--attributes=host:<host>;rack:<rack>;staging:true`
> >>>>> >>>
> >>>>> >>> (I've also tried this with staging:1, and staging:foo)
> >>>>> >>>
> >>>>> >>> The constraint generated from the .aurora config looks like the
> >>>>> >>> following:
> >>>>> >>> Constraint(name:staging, constraint:<TaskConstraint
> >>>>> >>> value:ValueConstraint(negated:false, values:[true])>)
> >>>>> >>>
> >>>>> >>> The schedule request then gets vetoed with the following veto
> >>>>> >>> object:
> >>>>> >>> Veto{reason=Constraint not satisfied: staging, score=1000,
> >>>>> >>> valueMismatch=true}]
> >>>>> >>>
> >>>>> >>> The constraints generated for 'host' and 'rack' look identical
> >>>>> >>> except for the different name of course. I've even tried bouncing
> >>>>> >>> every mesos and aurora process on the machine to see if maybe
> >>>>> >>> stale attributes were being assigned to the slaves. All the offers
> >>>>> >>> being made to the master look correct though, which leads me to
> >>>>> >>> believe that the constraint solver just doesn't work for arbitrary
> >>>>> >>> attributes.
> >>>>> >>>
> >>>>> >>> We would appreciate any help you can offer.
> >>>>> >>>
> >>>>> >>> Thanks,
> >>>>> >>> Kevin
> >>>>> >
> >>>>> > --
> >>>>> > ===============
> >>>>> > josh adams
> >>>>> > production engineer
> >>>>> > foursquare
> >>>>> >
> >>>>> > (gv) 415-830-4106
> >>>>> > ===============
> >>>>> > foursquare.com/jobs
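
For anyone following the thread, here is a rough Python sketch of the idea Kevin Burg describes: evaluate a value constraint against the attributes carried on the offer itself rather than against a separately cached copy keyed by slave name. It is an illustration only, not Aurora's code. `ValueConstraint`, `parse_attributes`, and `veto_reason` are hypothetical names, and the matching rule (any overlap between offered and requested values, flipped when `negated` is set) is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional


@dataclass(frozen=True)
class ValueConstraint:
    """Hypothetical stand-in for the ValueConstraint quoted in the thread."""
    name: str
    values: FrozenSet[str]
    negated: bool = False


def parse_attributes(flag_value: str) -> Dict[str, FrozenSet[str]]:
    """Parse a 'key:value;key:value' string like the --attributes flag value."""
    attributes: Dict[str, set] = {}
    for pair in flag_value.split(";"):
        if not pair:
            continue
        key, _, value = pair.partition(":")
        attributes.setdefault(key.strip(), set()).add(value.strip())
    return {key: frozenset(vals) for key, vals in attributes.items()}


def veto_reason(constraint: ValueConstraint,
                offer_attributes: Dict[str, FrozenSet[str]]) -> Optional[str]:
    """Return None if the offer satisfies the constraint, else a veto reason.

    Assumption: attributes come straight from the offer being evaluated,
    so a stale cached copy cannot cause a mismatch.
    """
    offered = offer_attributes.get(constraint.name, frozenset())
    matches = bool(offered & constraint.values)
    satisfied = matches != constraint.negated
    return None if satisfied else f"Constraint not satisfied: {constraint.name}"


if __name__ == "__main__":
    offer = parse_attributes("host:h1;rack:r1;staging:true")
    wanted = ValueConstraint(name="staging", values=frozenset({"true"}))
    print(veto_reason(wanted, offer))
```

Under these assumptions, an offer that advertises `staging:true` satisfies the constraint, while an offer whose attributes lack `staging` (as a stale cached copy would) produces the same kind of veto reported above.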