[ 
https://issues.apache.org/jira/browse/FLINK-8431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16325364#comment-16325364
 ] 

Dongwon Kim commented on FLINK-8431:
------------------------------------

Eron, I saw a discussion on 
[GPU_RESOURCES|https://www.mail-archive.com/dev@mesos.apache.org/msg37571.html] 
and [MESOS-7576|https://issues.apache.org/jira/browse/MESOS-7576]. 
{{GPU_RESOURCES}} is going to be deprecated in favor of the reservation 
mechanism ([MESOS-7574|https://issues.apache.org/jira/browse/MESOS-7574]). 
Thanks to MESOS-7576, I can launch Flink sessions by starting the Mesos master 
with {{--filter_gpu_resources}} set to false. This allows Flink to receive 
resource offers from GPU nodes even though the current implementation of 
Flink's Mesos scheduler does not enable the {{GPU_RESOURCES}} framework 
capability.

Nevertheless, it seems that we need to enable the {{GPU_RESOURCES}} framework 
capability before it is completely deprecated, because many users could still 
be running Mesos versions earlier than 1.4.0, where this workaround is not 
available. 
[MESOS-7576|https://issues.apache.org/jira/browse/MESOS-7576] is a relatively 
new issue and takes effect from 
[Mesos-1.4.0|https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=1.4.0]. 
So I plan to enable the {{GPU_RESOURCES}} framework capability when 
{{mesos.resourcemanager.tasks.gpus}} is set to a value greater than 0.
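
The plan above could be sketched roughly as follows. This is only an 
illustration, not the actual patch: the class {{GpuCapabilitySketch}}, the 
method {{capabilitiesFor}}, the stand-in {{Capability}} enum, and the use of a 
plain {{Map}} as configuration are all assumptions made for the example. A real 
scheduler would instead add the capability to the Mesos framework info it 
registers with.

```java
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: opt into GPU_RESOURCES only when GPUs are requested.
public class GpuCapabilitySketch {
    // Configuration key proposed in FLINK-8431.
    static final String GPUS_KEY = "mesos.resourcemanager.tasks.gpus";

    // Stand-in for the Mesos framework capability type (assumption for this sketch).
    enum Capability { GPU_RESOURCES }

    // Returns the capabilities the framework would declare for the given configuration.
    static Set<Capability> capabilitiesFor(Map<String, String> conf) {
        // An unset key is treated as 0 GPUs per TaskManager.
        double gpus = Double.parseDouble(conf.getOrDefault(GPUS_KEY, "0"));
        Set<Capability> caps = EnumSet.noneOf(Capability.class);
        if (gpus > 0) {
            caps.add(Capability.GPU_RESOURCES);
        }
        return caps;
    }

    public static void main(String[] args) {
        System.out.println(capabilitiesFor(Map.of()));              // prints []
        System.out.println(capabilitiesFor(Map.of(GPUS_KEY, "2"))); // prints [GPU_RESOURCES]
    }
}
```

With this shape, sessions that do not ask for GPUs keep registering exactly as 
they do today, so only GPU users are affected by the change.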

> Allow to specify # GPUs for TaskManager in Mesos
> ------------------------------------------------
>
>                 Key: FLINK-8431
>                 URL: https://issues.apache.org/jira/browse/FLINK-8431
>             Project: Flink
>          Issue Type: Improvement
>          Components: Cluster Management, Mesos
>            Reporter: Dongwon Kim
>            Assignee: Dongwon Kim
>            Priority: Minor
>
> Mesos provides first-class support for Nvidia GPUs [1], but Flink does not 
> exploit it when scheduling TaskManagers. If Mesos agents are configured to 
> isolate GPUs as shown in [2], TaskManagers that do not request GPUs cannot 
> see any GPUs at all.
> We therefore need to introduce a new configuration property named 
> "mesos.resourcemanager.tasks.gpus" to allow users to specify the number of 
> GPUs for each TaskManager process in Mesos.
> [1] http://mesos.apache.org/documentation/latest/gpu-support/
> [2] http://mesos.apache.org/documentation/latest/gpu-support/#agent-flags



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)