[ https://issues.apache.org/jira/browse/FLINK-19056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chesnay Schepler updated FLINK-19056:
-------------------------------------
    Description: 
When using Netty 4.1.50, the multipart upload of files is more than 100 times slower in the {{FileUploadHandlerTest}}.
This test has traditionally been somewhat heavy, since it repeatedly tests the upload of 60mb files.
On my machine this test currently finishes in 2-3 seconds, but with the upgraded Netty version it runs for several _minutes_ instead. I have not yet verified whether this is purely an issue of the test, but I would consider it unlikely.
This would make Flink effectively unusable when uploading larger jars or JobGraphs.

My theory is that this is due to [this|https://github.com/netty/netty/pull/10226] change in Netty.
Before this change, the {{HttpPostMultipartRequestDecoder}} was always creating unpooled heap buffers for _something_; after the change the buffer type is dependent on the input buffer. The input buffer is a direct one, so my conclusion is that with the upgrade we ended up allocating more direct buffers than we did previously.

One solution I found was to explicitly create an {{UnpooledByteBufAllocator}} for the {{RestServerEndpoint}} that prefers heap buffers, which results in the input buffer being a heap buffer, and thus we never allocate direct ones.
However, this should also imply that we are creating more heap buffers than we did previously; I don't know how much of a problem that is. Maybe this is even a good thing if it means fewer copies from direct to heap memory, but [this|https://github.com/netty/netty/blob/8f7ca2b4ef53b94607992494adf17a6237df7356/codec-http/src/main/java/io/netty/handler/codec/http/multipart/HttpPostMultipartRequestDecoder.java#L332] comment seems to indicate otherwise.

On a somewhat related note, we could think about increasing the chunkSize from 8kb to 64kb to reduce the GC pressure a bit, along with some arenas for the REST API.
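To put rough numbers on the chunkSize suggestion, here is a minimal sketch of how many chunks a single test upload would be split into at each size. It assumes "60mb" means 60 MiB and "8kb"/"64kb" mean 8 KiB/64 KiB, and it assumes each chunk corresponds to roughly one buffer allocation, which is a simplification rather than a measured property of Netty. {{ChunkMath}} and {{chunksFor}} are hypothetical names used only for illustration; they exist in neither Flink nor Netty.

```java
// Hypothetical helper, not part of Flink or Netty: counts how many
// fixed-size chunks are needed to carry a payload of a given size.
final class ChunkMath {

    private ChunkMath() {}

    /** Ceiling division: number of chunks of chunkSizeBytes needed for payloadBytes. */
    static long chunksFor(long payloadBytes, long chunkSizeBytes) {
        if (payloadBytes < 0 || chunkSizeBytes <= 0) {
            throw new IllegalArgumentException(
                    "payloadBytes must be >= 0 and chunkSizeBytes must be > 0");
        }
        return (payloadBytes + chunkSizeBytes - 1) / chunkSizeBytes;
    }

    public static void main(String[] args) {
        long payload = 60L * 1024 * 1024; // assumption: "60mb" read as 60 MiB
        System.out.println(chunksFor(payload, 8 * 1024));  // prints 7680
        System.out.println(chunksFor(payload, 64 * 1024)); // prints 960
    }
}
```

Under these assumptions, moving from 8kb to 64kb chunks cuts the per-upload chunk count by a factor of 8. Whether that translates into a comparable drop in GC pressure would still need to be measured.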
  was:
When using Netty 4.1.50, the multipart upload of files is more than 100 times slower in the {{FileUploadHandlerTest}}.
This test has traditionally been somewhat heavy, since it repeatedly tests the upload of 60mb files.
On my machine this test currently finishes in 2-3 seconds, but with the upgraded Netty version it runs for several _minutes_ instead. I have not yet verified whether this is purely an issue of the test, but I would consider it unlikely.
This would make Flink effectively unusable when uploading larger jars or JobGraphs.

My theory is that this is due to [this|https://github.com/netty/netty/pull/10226] change in Netty.
Before this change, the {{HttpPostMultipartRequestDecoder}} was always creating unpooled heap buffers for _something_; after the change the buffer type is dependent on the input buffer. The input buffer is a direct one, so my conclusion is that with the upgrade we ended up allocating more direct buffers than we did previously.

One solution I found was to explicitly create an {{UnpooledByteBufAllocator}} for the {{RestServerEndpoint}} that prefers heap buffers, which results in the input buffer being a heap buffer, and thus we never allocate direct ones.
However, this should also imply that we are creating more heap buffers than we did previously; I don't know how much of a problem that is. It would seem a reasonable thing to do, since we should at least be able to skip a bunch of memory copies?

On a somewhat related note, we could think about increasing the chunkSize from 8kb to 64kb to reduce the GC pressure a bit, along with some arenas for the REST API.

> Investigate multipart upload performance regression
> ---------------------------------------------------
>
>                 Key: FLINK-19056
>                 URL: https://issues.apache.org/jira/browse/FLINK-19056
>             Project: Flink
>          Issue Type: Task
>          Components: Runtime / REST
>    Affects Versions: 1.12.0
>            Reporter: Chesnay Schepler
>            Priority: Blocker
>             Fix For: 1.12.0
>
>
> When using Netty 4.1.50, the multipart upload of files is more than 100
> times slower in the {{FileUploadHandlerTest}}.
> This test has traditionally been somewhat heavy, since it repeatedly tests
> the upload of 60mb files.
> On my machine this test currently finishes in 2-3 seconds, but with the
> upgraded Netty version it runs for several _minutes_ instead. I have not
> yet verified whether this is purely an issue of the test, but I would
> consider it unlikely.
> This would make Flink effectively unusable when uploading larger jars or
> JobGraphs.
>
> My theory is that this is due to
> [this|https://github.com/netty/netty/pull/10226] change in Netty.
> Before this change, the {{HttpPostMultipartRequestDecoder}} was always
> creating unpooled heap buffers for _something_; after the change the buffer
> type is dependent on the input buffer. The input buffer is a direct one, so
> my conclusion is that with the upgrade we ended up allocating more direct
> buffers than we did previously.
>
> One solution I found was to explicitly create an {{UnpooledByteBufAllocator}}
> for the {{RestServerEndpoint}} that prefers heap buffers, which results in
> the input buffer being a heap buffer, and thus we never allocate direct
> ones.
> However, this should also imply that we are creating more heap buffers than
> we did previously; I don't know how much of a problem that is. Maybe
> this is even a good thing if it means fewer copies from direct to heap memory,
> but
> [this|https://github.com/netty/netty/blob/8f7ca2b4ef53b94607992494adf17a6237df7356/codec-http/src/main/java/io/netty/handler/codec/http/multipart/HttpPostMultipartRequestDecoder.java#L332]
> comment seems to indicate otherwise.
>
> On a somewhat related note, we could think about increasing the chunkSize
> from 8kb to 64kb to reduce the GC pressure a bit, along with some arenas for
> the REST API.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)