Hi,
We are trying to improve LLAP performance on our cluster, but we've
noticed that even though the LLAP daemon containers get the configured
memory, they get only 1 vcore per container.
We are running 10 LLAP daemons using Slider. There are no other containers
running on the nodes that run LLAP daemons, and there is 0 memory available
but 43 vcores sitting idle.

I can see the following lines in the Slider logs, so I suspect
SliderAppMaster doesn't request vcores from YARN:

2018-10-16 18:38:42,503 [AmExecutor-006] INFO  appmaster.SliderAppMaster -
Registered service under /users/hive/services/org-apache-slider/llap0;
absolute path /registry/users/hive/services/org-apache-slider/llap0
2018-10-16 18:38:42,510 [AmExecutor-006] INFO  state.AppState - Reviewing
RoleStatus{name='LLAP', group=LLAP, key=1, desired=10, actual=0,
requested=0, releasing=0, failed=0, startFailed=0, started=0, completed=0,
totalRequested=0, preempted=0, nodeFailed=0, failedRecently=0,
limitsExceeded=0, resourceRequirements=<memory:445440, vCores:1>,
isAntiAffinePlacement=false, failureMessage='',
providerRole=ProviderRole{name='LLAP', group=LLAP, id=1, placementPolicy=0,
nodeFailureThreshold=3, placementTimeoutSeconds=30,
labelExpression='null'}, failedContainers=[],
healthThresholdMonitorEnabled=true} :
2018-10-16 18:38:42,510 [AmExecutor-006] INFO  state.AppState - LLAP:
Asking for 10 more nodes(s) for a total of 10
2018-10-16 18:38:42,512 [AmExecutor-006] INFO  state.AppState - Container
ask is Capability[<memory:445440, vCores:1>]Priority[1073741825] and label
= null
2018-10-16 18:38:42,513 [AmExecutor-006] INFO  state.AppState - Container
ask is Capability[<memory:445440, vCores:1>]Priority[1073741825] and label
= null
2018-10-16 18:38:42,513 [AmExecutor-006] INFO  state.AppState - Container
ask is Capability[<memory:445440, vCores:1>]Priority[1073741825] and label
= null
2018-10-16 18:38:42,513 [AmExecutor-006] INFO  state.AppState - Container
ask is Capability[<memory:445440, vCores:1>]Priority[1073741825] and label
= null
2018-10-16 18:38:42,513 [AmExecutor-006] INFO  state.AppState - Container
ask is Capability[<memory:445440, vCores:1>]Priority[1073741825] and label
= null
2018-10-16 18:38:42,513 [AmExecutor-006] INFO  state.AppState - Container
ask is Capability[<memory:445440, vCores:1>]Priority[1073741825] and label
= null
2018-10-16 18:38:42,513 [AmExecutor-006] INFO  state.AppState - Container
ask is Capability[<memory:445440, vCores:1>]Priority[1073741825] and label
= null
2018-10-16 18:38:42,513 [AmExecutor-006] INFO  state.AppState - Container
ask is Capability[<memory:445440, vCores:1>]Priority[1073741825] and label
= null
2018-10-16 18:38:42,514 [AmExecutor-006] INFO  state.AppState - Container
ask is Capability[<memory:445440, vCores:1>]Priority[1073741825] and label
= null
2018-10-16 18:38:42,514 [AmExecutor-006] INFO  state.AppState - Container
ask is Capability[<memory:445440, vCores:1>]Priority[1073741825] and label
= null

And here is the configuration output from the same log file that might be
relevant:

 "credentials" : { },
  "components" : {
    "LLAP" : {
      "yarn.container.health.threshold.init.delay.secs" : "400",
      "yarn.role.priority" : "1",
      "yarn.component.instances" : "10",
      "yarn.memory" : "445440",
      "yarn.resource.normalization.enabled" : "false",
      "yarn.container.health.threshold.window.secs" : "300",
      "yarn.component.placement.policy" : "0",
      "yarn.container.health.threshold.percent" : "80"
    },
    "slider-appmaster" : {
      "yarn.vcores" : "1",
      "yarn.component.instances" : "1",
      "yarn.memory" : "1024"
    }
  }
},
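
Note that "yarn.vcores" appears under slider-appmaster but not under LLAP.
Whether that is why the ask comes out as 1 vcore is our assumption, not
something we have confirmed. A minimal sketch for spotting such gaps, where
components_without_vcores is a hypothetical helper of our own and the dict
simply mirrors the values above:

```python
def components_without_vcores(resources):
    """Return the names of components that do not set "yarn.vcores"."""
    components = resources.get("components", {})
    return sorted(
        name for name, cfg in components.items() if "yarn.vcores" not in cfg
    )


if __name__ == "__main__":
    # Trimmed copy of the configuration shown above.
    resources = {
        "components": {
            "LLAP": {
                "yarn.component.instances": "10",
                "yarn.memory": "445440",
            },
            "slider-appmaster": {
                "yarn.vcores": "1",
                "yarn.component.instances": "1",
                "yarn.memory": "1024",
            },
        }
    }
    print(components_without_vcores(resources))
```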

yarn.nodemanager.resource.cpu-vcores,
yarn.scheduler.maximum-allocation-vcores,
hive.llap.daemon.vcpus.per.instance,
hive.llap.daemon.num.executors are all set to 44.

We can confirm that 44 executors are running per instance on the LLAP daemon
web UI.

We are using HDP 2.7.3.2.6.4.0-91 with YARN 2.7.3, Hive 1.2.1000, Slider
0.92.0.

Any ideas how to utilize more CPU with LLAP daemons?

Thanks.
