[jira] [Updated] (FLINK-19056) Investigate multipart upload performance regression

Chesnay Schepler (Jira) Wed, 26 Aug 2020 16:37:13 -0700


     [ https://issues.apache.org/jira/browse/FLINK-19056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Chesnay Schepler updated FLINK-19056:
-------------------------------------
    Description: 
When using Netty 4.1.50 the multipart upload of files is more than 100 times 
slower in the {{FileUploadHandlerTest}}.

This test has traditionally been somewhat heavy, since it repeatedly tests the 
upload of 60 MB files.

On my machine this test currently finishes in 2-3 seconds, but with the 
upgraded Netty version it runs for several _minutes_ instead. I have not 
verified yet whether this is purely an issue of the test, but I would consider 
it unlikely.

This would make Flink effectively unusable when uploading larger jars or 
JobGraphs.

 

My theory is that this is due to [this|https://github.com/netty/netty/pull/10226] 
change in Netty.

Before this change, the {{HttpPostMultipartRequestDecoder}} was always creating 
unpooled heap buffers for _something_; after the change the buffer type is 
dependent on the input buffer. The input buffer is a direct one, so my 
conclusion is that with the upgrade we ended up allocating more direct buffers 
than we did previously.
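
To make the heap/direct distinction concrete, here is a minimal sketch using plain {{java.nio.ByteBuffer}} rather than Netty's {{ByteBuf}}. It is only an analogy and not the decoder's actual code: the class {{BufferKinds}} and its methods are hypothetical names. {{copyToHeap}} mimics an "always copy to a heap buffer" policy, while {{copyMatchingKind}} mimics a policy where the new buffer's type depends on the input buffer.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical illustration of heap vs. direct buffers with java.nio (not Netty). */
final class BufferKinds {

    /** Returns "direct" for off-heap buffers and "heap" for array-backed ones. */
    static String kind(ByteBuffer buffer) {
        return buffer.isDirect() ? "direct" : "heap";
    }

    /**
     * Allocates a new buffer of the same kind as the input and copies the
     * input's remaining bytes into it. The input's position is left untouched.
     */
    static ByteBuffer copyMatchingKind(ByteBuffer input) {
        ByteBuffer copy = input.isDirect()
                ? ByteBuffer.allocateDirect(input.remaining())
                : ByteBuffer.allocate(input.remaining());
        // duplicate() shares the content but has its own position and limit
        copy.put(input.duplicate());
        copy.flip();
        return copy;
    }

    /** Always copies the remaining bytes into a heap array, whatever the input kind. */
    static byte[] copyToHeap(ByteBuffer input) {
        byte[] bytes = new byte[input.remaining()];
        input.duplicate().get(bytes);
        return bytes;
    }

    public static void main(String[] args) {
        ByteBuffer direct = ByteBuffer.allocateDirect(16);
        direct.put("multipart".getBytes(StandardCharsets.US_ASCII));
        direct.flip();

        System.out.println("input:              " + kind(direct));
        System.out.println("matching-kind copy: " + kind(copyMatchingKind(direct)));
        System.out.println("heap copy length:   " + copyToHeap(direct).length);
    }
}
```

With a direct input, the matching-kind policy allocates another direct buffer per copy, which is the kind of extra direct allocation suspected above.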

 

One solution I found was to explicitly create an {{UnpooledByteBufAllocator}} 
for the {{RestServerEndpoint}} that prefers heap buffers, which results in the 
input buffer being a heap buffer, so we never allocate direct ones.

However, this should also imply that we are creating more heap buffers than we 
did previously; I don't know how much of a problem that is. Maybe this 
is even a good thing if it means fewer copies from direct to heap memory, but 
[this|https://github.com/netty/netty/blob/8f7ca2b4ef53b94607992494adf17a6237df7356/codec-http/src/main/java/io/netty/handler/codec/http/multipart/HttpPostMultipartRequestDecoder.java#L332]
 comment seems to indicate otherwise.

 

On a somewhat related note, we could think about increasing the chunkSize from 
8 KB to 64 KB to reduce the GC pressure a bit, along with some arenas for the 
REST API.
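
As a rough back-of-the-envelope check of the chunk size idea, the sketch below counts how many chunks a 60 MB upload is split into at each size. It assumes "60 MB" means 60 MiB and that each chunk costs roughly one buffer, which is a simplification; {{ChunkMath}} and {{chunksFor}} are hypothetical names, not Flink or Netty APIs.

```java
/** Hypothetical helper: how many chunks does an upload of a given size need? */
final class ChunkMath {

    /** Number of chunks needed for the upload, rounding the last partial chunk up. */
    static long chunksFor(long uploadBytes, int chunkSizeBytes) {
        if (uploadBytes < 0 || chunkSizeBytes <= 0) {
            throw new IllegalArgumentException("sizes must be non-negative / positive");
        }
        return (uploadBytes + chunkSizeBytes - 1) / chunkSizeBytes;
    }

    public static void main(String[] args) {
        long upload = 60L * 1024 * 1024; // assumption: 60 MB read as 60 MiB

        System.out.println("8 KiB chunks:  " + chunksFor(upload, 8 * 1024));  // 7680
        System.out.println("64 KiB chunks: " + chunksFor(upload, 64 * 1024)); // 960
    }
}
```

Under these assumptions the larger chunk size cuts the number of per-chunk objects by a factor of eight; whether that translates into a noticeable GC improvement would still need to be measured.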

  was:
When using Netty 4.1.50 the multipart upload of files is more than 100 times 
slower in the {{FileUploadHandlerTest}}.

This test has traditionally been somewhat heavy, since it repeatedly tests the 
upload of 60 MB files.

On my machine this test currently finishes in 2-3 seconds, but with the 
upgraded Netty version it runs for several _minutes_ instead. I have not 
verified yet whether this is purely an issue of the test, but I would consider 
it unlikely.

This would make Flink effectively unusable when uploading larger jars or 
JobGraphs.

 

My theory is that this is due to [this|https://github.com/netty/netty/pull/10226] 
change in Netty.

Before this change, the {{HttpPostMultipartRequestDecoder}} was always creating 
unpooled heap buffers for _something_; after the change the buffer type is 
dependent on the input buffer. The input buffer is a direct one, so my 
conclusion is that with the upgrade we ended up allocating more direct buffers 
than we did previously.

 

One solution I found was to explicitly create an {{UnpooledByteBufAllocator}} 
for the {{RestServerEndpoint}} that prefers heap buffers, which results in the 
input buffer being a heap buffer, so we never allocate direct ones.

However, this should also imply that we are creating more heap buffers than we 
did previously; I don't know how much of a problem that is. It would 
seem a reasonable thing to do, since we should at least be able to skip a bunch 
of memory copies?

 

On a somewhat related note, we could think about increasing the chunkSize from 
8 KB to 64 KB to reduce the GC pressure a bit, along with some arenas for the 
REST API.


> Investigate multipart upload performance regression
> ---------------------------------------------------
>
>                 Key: FLINK-19056
>                 URL: https://issues.apache.org/jira/browse/FLINK-19056
>             Project: Flink
>          Issue Type: Task
>          Components: Runtime / REST
>    Affects Versions: 1.12.0
>            Reporter: Chesnay Schepler
>            Priority: Blocker
>             Fix For: 1.12.0
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
