[ https://issues.apache.org/jira/browse/AURORA-116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13881652#comment-13881652 ]
Bill Farner commented on AURORA-116:
------------------------------------

In a particularly loaded cluster, we observed resourceOffers handling taking an inordinate amount of time:
{noformat}
I0125 04:06:06.030261 25455 sched.cpp:528] Scheduler::resourceOffers took 457.474384ms
I0125 04:06:07.431802 25456 sched.cpp:528] Scheduler::resourceOffers took 499.004174ms
I0125 04:06:08.715579 25452 sched.cpp:528] Scheduler::resourceOffers took 426.892948ms
I0125 04:06:10.422312 25458 sched.cpp:528] Scheduler::resourceOffers took 788.509018ms
I0125 04:06:11.531437 25450 sched.cpp:528] Scheduler::resourceOffers took 556.181547ms
I0125 04:06:12.849201 25452 sched.cpp:528] Scheduler::resourceOffers took 557.399593ms
I0125 04:06:14.131196 25446 sched.cpp:528] Scheduler::resourceOffers took 506.654534ms
I0125 04:06:15.558323 25457 sched.cpp:528] Scheduler::resourceOffers took 603.352069ms
I0125 04:06:16.797667 25454 sched.cpp:528] Scheduler::resourceOffers took 507.040296ms
I0125 04:06:18.342701 25449 sched.cpp:528] Scheduler::resourceOffers took 718.241925ms
I0125 04:06:22.795732 25445 sched.cpp:528] Scheduler::resourceOffers took 3.859263212secs
I0125 04:06:23.649204 25445 sched.cpp:528] Scheduler::resourceOffers took 838.23624ms
I0125 04:06:24.176681 25445 sched.cpp:528] Scheduler::resourceOffers took 522.324683ms
I0125 04:06:24.709750 25455 sched.cpp:528] Scheduler::resourceOffers took 328.162458ms
I0125 04:06:25.272554 25455 sched.cpp:528] Scheduler::resourceOffers took 559.136627ms
I0125 04:06:26.167621 25455 sched.cpp:528] Scheduler::resourceOffers took 875.709069ms
I0125 04:06:26.493263 25443 sched.cpp:528] Scheduler::resourceOffers took 134.088104ms
I0125 04:06:28.426606 25445 sched.cpp:528] Scheduler::resourceOffers took 420.597132ms
I0125 04:06:31.088995 25446 sched.cpp:528] Scheduler::resourceOffers took 1.262336563secs
I0125 04:06:31.934573 25456 sched.cpp:528] Scheduler::resourceOffers took 832.207275ms
I0125 04:06:33.181200 25454 sched.cpp:528] Scheduler::resourceOffers took 765.264834ms
I0125 04:06:35.013409 25457 sched.cpp:528] Scheduler::resourceOffers took 1.250660605secs
I0125 04:06:35.311099 25448 sched.cpp:528] Scheduler::resourceOffers took 230.108961ms
I0125 04:06:37.104035 25451 sched.cpp:528] Scheduler::resourceOffers took 624.894808ms
I0125 04:06:38.150378 25450 sched.cpp:528] Scheduler::resourceOffers took 427.445204ms
I0125 04:06:39.383716 25457 sched.cpp:528] Scheduler::resourceOffers took 392.365989ms
{noformat}

> Improve efficiency of saving host attributes (or avoid saving host attributes)
> ------------------------------------------------------------------------------
>
>                 Key: AURORA-116
>                 URL: https://issues.apache.org/jira/browse/AURORA-116
>             Project: Aurora
>          Issue Type: Task
>          Components: Scheduler
>            Reporter: Bill Farner
>            Priority: Critical
>
> The scheduler performs multiple write operations for every resource offer, to
> save slave attributes:
> {noformat}
>   public void resourceOffers(SchedulerDriver driver, List<Offer> offers) {
>     Preconditions.checkState(registered, "Must be registered before receiving offers.");
>     for (final Offer offer : offers) {
>       log(Level.FINE, "Received offer: %s", offer);
>       resourceOffers.incrementAndGet();
>       storage.write(new MutateWork.NoResult.Quiet() {
>         @Override protected void execute(MutableStoreProvider storeProvider) {
>           storeProvider.getAttributeStore().saveHostAttributes(Conversions.getAttributes(offer));
>         }
>       });
> {noformat}
> This can unnecessarily block the single-threaded message dispatch in the
> scheduler driver. An incremental improvement would be to aggregate all slave
> info and save it in one write operation. Better yet would be to perform
> writes asynchronously (taking care not to break task scheduling, since
> attributes are expected to be present). Even better, it would be great
> to determine whether we can avoid storing host attributes.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
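
For illustration only, the "aggregate all slave info and save it in one write operation" idea from the description could look roughly like the sketch below. It does not use the real Aurora classes: `AttributeStore`, `CountingStorage`, and `saveAll` are hypothetical stand-ins invented here, and host attributes are reduced to plain strings. The only point it makes is that the loop over offers moves inside a single write transaction instead of opening one transaction per offer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

public class BatchedAttributeSave {
  /** Hypothetical stand-in for the scheduler's attribute store. */
  interface AttributeStore {
    void saveHostAttributes(String hostAttributes);
  }

  /** Hypothetical stand-in for storage; counts how many write transactions run. */
  static final class CountingStorage {
    int writes = 0;
    final List<String> saved = new ArrayList<>();

    void write(Consumer<AttributeStore> work) {
      writes++;                 // one transaction per call to write()
      work.accept(saved::add);  // the store simply records what it is given
    }
  }

  /** Saves the attributes of every offer inside a single write transaction. */
  static void saveAll(CountingStorage storage, List<String> offers) {
    storage.write(store -> {
      for (String offer : offers) {
        store.saveHostAttributes(offer);
      }
    });
  }

  public static void main(String[] args) {
    CountingStorage storage = new CountingStorage();
    saveAll(storage, Arrays.asList("host-a", "host-b", "host-c"));
    System.out.println(storage.writes + " write(s), " + storage.saved.size() + " host(s) saved");
  }
}
```

With three offers this performs one write rather than three. Whether that is enough in practice, compared with asynchronous writes or not storing attributes at all, is exactly the open question the issue raises.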