[ 
https://issues.apache.org/jira/browse/FLINK-14594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann updated FLINK-14594:
----------------------------------
    Release Note: Serialized `JobGraphs` which set the `ResourceSpec` created 
by Flink versions < 1.10 are no longer compatible with Flink >= 1.10. If you 
want to migrate these jobs to Flink 1.10.0 you will have to stop the job with a 
savepoint and then resume it from this savepoint on the Flink 1.10.0 cluster.  
(was: Serialized `JobGraphs` which set the `ResourceSpec` create by Flink 
versions < 1.10 are no longer compatible with Flink >= 1.10. If you want to 
migrate these jobs to Flink 1.10.0 you will have to stop the job with a 
savepoint and then resume it from this savepoint on the Flink 1.10.0 cluster.)

> Fix matching logics of ResourceSpec/ResourceProfile/Resource considering 
> double values
> --------------------------------------------------------------------------------------
>
>                 Key: FLINK-14594
>                 URL: https://issues.apache.org/jira/browse/FLINK-14594
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Coordination
>    Affects Versions: 1.10.0
>            Reporter: Zhu Zhu
>            Assignee: Zhu Zhu
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.10.0
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Some resources have double-typed values, such as cpuCores in 
> ResourceSpec/ResourceProfile and all extended resources. These values can be 
> produced by a merge or subtract, so they can carry small floating-point deltas.
> Currently, resource matching compares these values without accounting for 
> the deltas, which may result in the issues below:
> 1. A shared slot cannot fulfill a slot request even if it should be able to 
> (because it is possible that {{(d1 + d2) - d1 < d2}} for double values)
> 2. If a shared slot is used up, an unexpected error may occur when 
> calculating its remaining resources in 
> SlotSharingManager#listResolvedRootSlotInfo -> ResourceProfile#subtract
> 3. An unexpected error may happen when releasing a single task slot from a 
> shared slot (in ResourceProfile#subtract)
> To solve this issue, I'd propose to:
> 1. Change {{Resource}} to use {{BigDecimal}} to manage double values. This 
> allows the values to be compared strictly and to be additively 
> merged/subtracted with no precision loss. With this change, extended 
> resources work correctly with double values.
> 2. Introduce {{CPUResource}} to represent CPU cores. It is based on 
> {{Resource}}.
> 3. Change ResourceSpec/ResourceProfile to use CPUResource for CPU cores.
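
The problem and the proposed direction above can be illustrated with a small
Java sketch. It is not Flink's actual code: the classes DecimalResource,
CpuCores and ResourceMatchingSketch and their methods are hypothetical names
chosen for this example, and the values 0.7 and 0.1 are just one pair for which
(d1 + d2) - d1 < d2 holds in double arithmetic. It only assumes that a
BigDecimal-backed value is merged, subtracted and compared exactly.

```java
import java.math.BigDecimal;

// Hypothetical stand-in for a BigDecimal-backed resource (not Flink's Resource class).
class DecimalResource {
    private final String name;
    private final BigDecimal value;

    DecimalResource(String name, double value) {
        // BigDecimal.valueOf uses the canonical decimal string of the double (0.7 -> "0.7").
        this(name, BigDecimal.valueOf(value));
    }

    DecimalResource(String name, BigDecimal value) {
        if (value.signum() < 0) {
            throw new IllegalArgumentException("Resource value must not be negative: " + value);
        }
        this.name = name;
        this.value = value;
    }

    // Exact decimal addition, no rounding.
    DecimalResource merge(DecimalResource other) {
        return new DecimalResource(name, value.add(other.value));
    }

    // Exact decimal subtraction; the constructor rejects a negative result.
    DecimalResource subtract(DecimalResource other) {
        return new DecimalResource(name, value.subtract(other.value));
    }

    // Strict comparison that ignores scale (0.10 and 0.1 compare as equal).
    boolean canFulfill(DecimalResource request) {
        return value.compareTo(request.value) >= 0;
    }

    BigDecimal getValue() {
        return value;
    }
}

// Hypothetical CPU-cores resource built on top of DecimalResource.
class CpuCores extends DecimalResource {
    CpuCores(double cores) {
        super("CPU", cores);
    }
}

class ResourceMatchingSketch {
    // Plain double arithmetic: does the remainder of (d1 + d2) after taking d1 still fit d2?
    static boolean doubleRemainderFits(double d1, double d2) {
        return (d1 + d2) - d1 >= d2;
    }

    // The same check with the BigDecimal-backed sketch.
    static boolean decimalRemainderFits(double d1, double d2) {
        DecimalResource total = new CpuCores(d1).merge(new CpuCores(d2));
        DecimalResource remaining = total.subtract(new CpuCores(d1));
        return remaining.canFulfill(new CpuCores(d2));
    }

    public static void main(String[] args) {
        System.out.println("double:     " + doubleRemainderFits(0.7, 0.1));
        System.out.println("BigDecimal: " + decimalRemainderFits(0.7, 0.1));
    }
}
```

With doubles, the remainder for 0.7 and 0.1 comes out slightly below 0.1, so
the request is rejected even though the slot should be able to fulfill it. With
the decimal-backed sketch the remainder is exactly 0.1 and the check passes.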



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
