The idea with the fix is to read the slave's attributes directly off the offer rather than looking them up in 'AttributeStore' keyed on the slave's name. The slave's resources are already read off the offer this way, so I don't see why the same can't be done with attributes.
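A minimal sketch of the idea above, not the actual patch. Every type and method here (`Offer`, `ValueConstraint`, `parseAttributes`, `satisfiedViaStore`, `satisfiedViaOffer`) is a simplified, hypothetical stand-in rather than Aurora's or Mesos's real API, and the matching rule is an assumption about how a value constraint is evaluated. It only shows why a check against a stored copy keyed on host name can disagree with a check against the offer, assuming the stored copy is stale while the offer is current:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified stand-ins for illustration only.
// None of these types are Aurora's or Mesos's real classes.
final class OfferAttributeSketch {

  /** Simplified offer: carries the slave's current attributes with it. */
  static final class Offer {
    final String hostname;
    final Map<String, String> attributes;

    Offer(String hostname, Map<String, String> attributes) {
      this.hostname = hostname;
      this.attributes = attributes;
    }
  }

  /** Simplified value constraint, shaped like ValueConstraint(negated, values). */
  static final class ValueConstraint {
    final String name;
    final boolean negated;
    final Set<String> values;

    ValueConstraint(String name, boolean negated, String... values) {
      this.name = name;
      this.negated = negated;
      this.values = new HashSet<>(Arrays.asList(values));
    }
  }

  /** Parses "host:h1;rack:r1;staging:true" into key/value pairs (text values only). */
  static Map<String, String> parseAttributes(String flag) {
    Map<String, String> parsed = new HashMap<>();
    for (String pair : flag.split(";")) {
      int colon = pair.indexOf(':');
      if (colon > 0) {
        parsed.put(pair.substring(0, colon), pair.substring(colon + 1));
      }
    }
    return parsed;
  }

  /** Assumed rule: satisfied when "value is listed" differs from "negated". */
  static boolean matches(Map<String, String> attributes, ValueConstraint constraint) {
    String actual = attributes.get(constraint.name);
    boolean listed = actual != null && constraint.values.contains(actual);
    return listed != constraint.negated;
  }

  /** Store-based check: looks attributes up by host name, so a stale entry wins. */
  static boolean satisfiedViaStore(
      Map<String, Map<String, String>> store, Offer offer, ValueConstraint constraint) {
    Map<String, String> stored =
        store.getOrDefault(offer.hostname, Collections.<String, String>emptyMap());
    return matches(stored, constraint);
  }

  /** Offer-based check: uses the attributes that arrived with the offer itself. */
  static boolean satisfiedViaOffer(Offer offer, ValueConstraint constraint) {
    return matches(offer.attributes, constraint);
  }

  public static void main(String[] args) {
    Offer offer = new Offer("h1", parseAttributes("host:h1;rack:r1;staging:true"));

    // Assumption: the stored copy predates the "staging" attribute.
    Map<String, Map<String, String>> staleStore = new HashMap<>();
    staleStore.put("h1", parseAttributes("host:h1;rack:r1"));

    ValueConstraint staging = new ValueConstraint("staging", false, "true");
    System.out.println("via store: " + satisfiedViaStore(staleStore, offer, staging));
    System.out.println("via offer: " + satisfiedViaOffer(offer, staging));
  }
}
```

With the stale entry in place, the store-based check rejects the `staging` constraint while the offer-based check accepts it, which is the mismatch the change is meant to remove.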
Someone who understands all the places where SchedulingFilter.filter is used might be able to fix this better than I can.

On Wed, Jul 16, 2014 at 6:40 AM, Josh Adams <j...@foursquare.com> wrote:

> Hi there,
>
> Given that we would need to disrupt running jobs to add constraints in the
> future we are blocking on https://issues.apache.org/jira/browse/AURORA-582
> before we can push any of our services on to Aurora in production.
>
> Kevin Burg attempted to resolve the related bug
> https://issues.apache.org/jira/browse/AURORA-328 by making some changes
> here:
> https://github.com/foursquare/incubator-aurora/commit/b1962fad3fe9ef76954fa107abed25d78b809331
> but we seem to be getting a type mismatch when compiling the code.
>
> Any help and/or info on the bugfix progress would be much appreciated.
> Aside from AURORA-582 we are ready to roll (pun intended!)
>
> Best,
> Josh
>
>
> On Mon, Jul 14, 2014 at 11:42 AM, Josh Adams <j...@foursquare.com> wrote:
>
>> Ah, makes sense. We'll try that. Thanks for clarifying this Kevin.
>>
>> Josh
>>
>>
>> On Mon, Jul 14, 2014 at 11:30 AM, Kevin Sweeney <kevi...@apache.org>
>> wrote:
>>
>>> Slaves persist their metadata (including attributes) across restarts
>>> due to slave recovery (that's what allows you to upgrade mesos in-place
>>> without killing the tasks they're managing). Unfortunately, to change
>>> attributes you need to remove persisted slave metadata (the "meta"
>>> directory). This will kill all of a slave's underlying tasks but the newly
>>> registered slave should have the correct attributes.
>>>
>>>
>>> On Mon, Jul 14, 2014 at 11:26 AM, Kevin Burg <kb...@foursquare.com>
>>> wrote:
>>>
>>>> I've confirmed by looking at that endpoint that new attributes are not
>>>> being picked up and modified attributes are retaining their old values.
>>>> This is after restarting both the slaves and the scheduler process.
>>>>
>>>>
>>>> On Mon, Jul 14, 2014 at 11:09 AM, Josh Adams <j...@foursquare.com>
>>>> wrote:
>>>>
>>>> > Thanks Brian. Kevin should have some followup questions shortly.
>>>> >
>>>> > Josh
>>>> >
>>>> >
>>>> > On Mon, Jul 14, 2014 at 10:37 AM, Brian Wickman <wick...@apache.org>
>>>> > wrote:
>>>> >
>>>> >> host/rack should not be treated specially.
>>>> >>
>>>> >> If you go to the "/slaves" endpoint on the scheduler UI, what does it
>>>> >> report as attributes being exported by your slaves? You might want to
>>>> >> validate there that the "staging" attribute got picked up properly. If
>>>> >> it's not getting picked up (e.g. the attributes are getting cached
>>>> >> incorrectly by the scheduler?) then you should file an issue.
>>>> >>
>>>> >>
>>>> >> On Fri, Jul 11, 2014 at 5:24 PM, Kevin Burg <kb...@foursquare.com> wrote:
>>>> >>
>>>> >>> Hi,
>>>> >>>
>>>> >>> I'm having trouble getting the task constraint resolver to work with
>>>> >>> attributes other than 'host' and 'rack.' Are arbitrary attribute keys in
>>>> >>> the mesos slaves supported currently?
>>>> >>>
>>>> >>> Here is the setup.
>>>> >>>
>>>> >>> The slaves are configured to run with
>>>> >>> `--attributes=host:<host>;rack:<rack>;staging:true`
>>>> >>>
>>>> >>> (I've also tried this with staging:1, and staging:foo)
>>>> >>>
>>>> >>> The constraint generated from the .aurora config looks like the following
>>>> >>> Constraint(name:staging, constraint:<TaskConstraint
>>>> >>> value:ValueConstraint(negated:false, values:[true])>)
>>>> >>>
>>>> >>> The schedule request then gets vetoed with the following veto object:
>>>> >>> Veto{reason=Constraint not satisfied: staging, score=1000,
>>>> >>> valueMismatch=true}]
>>>> >>>
>>>> >>> The constraints generated for 'host' and 'rack' look identical except for
>>>> >>> the different name of course. I've even tried bouncing every mesos and
>>>> >>> aurora process on the machine to see if maybe stale attributes were being
>>>> >>> assigned to the slaves. All the offers being made to the master look
>>>> >>> correct though, which leads me to believe that the constraint solver just
>>>> >>> doesn't work for arbitrary attributes.
>>>> >>>
>>>> >>> We would appreciate any help you can offer.
>>>> >>>
>>>> >>> Thanks,
>>>> >>> Kevin
>>>> >>>
>>>> >
>>>> > --
>>>> > ===============
>>>> > josh adams
>>>> > production engineer
>>>> > foursquare
>>>> >
>>>> > (gv) 415-830-4106
>>>> > ===============
>>>> > foursquare.com/jobs