[ 
https://issues.apache.org/jira/browse/FLINK-30471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan updated FLINK-30471:
------------------------------
    Description: 
In SsgNetworkMemoryCalculationUtils#enrichNetworkMemory, the partitionTypes 
are collected in a separate loop, which hurts performance. If we add 
inputPartitionTypes in a subsequent PR, yet another separate loop may be 
introduced, which I think is not a good choice.

Using a separate loop for each collection only makes the code look simpler, 
but it costs extra iterations. We can collect both maxSubpartitionNums and 
partitionTypes in one loop instead of multiple loops, which will be faster. 
Then, when we add inputPartitionTypes later, no new loop is needed.
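
The single-loop idea can be sketched roughly as follows. This is only an 
illustration, not the actual Flink code: the class SingleLoopSketch, the 
DataSet and Collected types, the PartitionType enum, and the method 
collectInOnePass are all hypothetical stand-ins. Only the names 
maxSubpartitionNums and partitionTypes come from the description above.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SingleLoopSketch {

    /** Hypothetical stand-in for a partition type; not Flink's real enum. */
    enum PartitionType { PIPELINED, BLOCKING }

    /** Hypothetical, simplified view of one produced data set. */
    static final class DataSet {
        final String id;
        final PartitionType type;
        final int maxSubpartitionNum;

        DataSet(String id, PartitionType type, int maxSubpartitionNum) {
            this.id = id;
            this.type = type;
            this.maxSubpartitionNum = maxSubpartitionNum;
        }
    }

    /** Holds every collection that is filled during the single pass. */
    static final class Collected {
        final Map<String, Integer> maxSubpartitionNums = new LinkedHashMap<>();
        final Map<String, PartitionType> partitionTypes = new LinkedHashMap<>();
    }

    /** Fills both maps while iterating over the data sets only once. */
    static Collected collectInOnePass(List<DataSet> producedDataSets) {
        Collected result = new Collected();
        for (DataSet dataSet : producedDataSets) {
            result.maxSubpartitionNums.put(dataSet.id, dataSet.maxSubpartitionNum);
            result.partitionTypes.put(dataSet.id, dataSet.type);
            // A later collection (e.g. inputPartitionTypes) could be filled
            // at this point without adding another loop.
        }
        return result;
    }

    public static void main(String[] args) {
        Collected collected = collectInOnePass(Arrays.asList(
                new DataSet("ds-1", PartitionType.PIPELINED, 4),
                new DataSet("ds-2", PartitionType.BLOCKING, 8)));
        System.out.println(collected.maxSubpartitionNums);
        System.out.println(collected.partitionTypes);
    }
}
```

Returning one holder object keeps the call site simple, and adding a further 
collection only means adding one field and one put() call inside the 
existing loop.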

  was:
In SsgNetworkMemoryCalculationUtils#enrichNetworkMemory, getting PartitionTypes 
is run in a separate loop, which is not friendly to performance. If we want to 
get inputPartitionTypes, a new separate loop may be introduced too. 

It just looks simpler in code, but it will affect the performance. We can get 
all the results through one loop instead of multiple loops, which will be 
faster.


> Optimize the enriching network memory process in 
> SsgNetworkMemoryCalculationUtils
> ---------------------------------------------------------------------------------
>
>                 Key: FLINK-30471
>                 URL: https://issues.apache.org/jira/browse/FLINK-30471
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Network
>    Affects Versions: 1.17.0
>            Reporter: Yuxin Tan
>            Priority: Major
>              Labels: pull-request-available
>
> In SsgNetworkMemoryCalculationUtils#enrichNetworkMemory, the partitionTypes 
> are collected in a separate loop, which hurts performance. If we add 
> inputPartitionTypes in a subsequent PR, yet another separate loop may be 
> introduced, which I think is not a good choice.
> Using a separate loop for each collection only makes the code look simpler, 
> but it costs extra iterations. We can collect both maxSubpartitionNums and 
> partitionTypes in one loop instead of multiple loops, which will be faster. 
> Then, when we add inputPartitionTypes later, no new loop is needed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
