[ https://issues.apache.org/jira/browse/FLINK-35739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Piotr Nowojski updated FLINK-35739:
-----------------------------------
    Component/s: Runtime / State Backends

> FLIP-444: Native file copy support
> ----------------------------------
>
>                 Key: FLINK-35739
>                 URL: https://issues.apache.org/jira/browse/FLINK-35739
>             Project: Flink
>          Issue Type: New Feature
>          Components: Connectors / FileSystem, Runtime / State Backends
>            Reporter: Piotr Nowojski
>            Priority: Major
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-444%3A+Native+file+copy+support
>
> State downloading in Flink can be a time- and CPU-consuming operation, which
> is especially visible if CPU resources per task slot are strictly restricted
> to, for example, a single CPU. Downloading 1GB of state can take a
> significant amount of time, and the code doing so is quite inefficient.
>
> Currently, when downloading state files, Flink creates an
> FSDataInputStream from the remote file and copies its bytes to an
> OutputStream pointing to a local file (in the
> RocksDBStateDownloader#downloadDataForStateHandle method). The
> FSDataInputStream is internally wrapped in many layers of abstraction and
> indirection and, what’s worse, every file is copied individually, which leads
> to high overhead for small files. Download times and the CPU efficiency of
> the download process can be significantly improved if we introduce an API
> that allows org.apache.flink.core.fs.FileSystem to copy many files natively
> and all at once.
>
> For S3, there are at least two potential implementations. The first is to
> use AWS SDKv2 directly (Flink currently uses AWS SDKv1 wrapped by
> hadoop/presto) together with Amazon S3 Transfer Manager. The second option is
> to use a third-party tool called s5cmd. It is claimed to be a faster
> alternative to the official AWS clients, which was confirmed by our
> benchmarks.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
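
The contrast described in the issue, copying each file through its own stream versus handing a whole batch to the file system in one call, can be sketched roughly as follows. This is a minimal illustration that uses only the JDK and local files. The names NativeCopySketch, CopyTask and BulkCopier are hypothetical and invented here; they are not the API proposed in FLIP-444, and the local Files.copy call merely stands in for whatever native mechanism (for example S3 Transfer Manager or s5cmd) a real file system might delegate to.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NativeCopySketch {

    /** Hypothetical source/destination pair (illustration only, not a Flink type). */
    static final class CopyTask {
        final Path source;
        final Path destination;

        CopyTask(Path source, Path destination) {
            this.source = source;
            this.destination = destination;
        }
    }

    /** Hypothetical hook: the file system receives the whole batch in a single call. */
    interface BulkCopier {
        void copyFiles(List<CopyTask> tasks) throws IOException;
    }

    /** Per-file approach: open a stream pair for every file and pump bytes by hand. */
    static void copyViaStreams(List<CopyTask> tasks) throws IOException {
        byte[] buffer = new byte[64 * 1024];
        for (CopyTask task : tasks) {
            try (InputStream in = Files.newInputStream(task.source);
                 OutputStream out = Files.newOutputStream(task.destination)) {
                int read;
                while ((read = in.read(buffer)) != -1) {
                    out.write(buffer, 0, read);
                }
            }
        }
    }

    /**
     * Stand-in for a native implementation. Here it only delegates to Files.copy;
     * a remote file system could instead schedule the whole batch at once.
     */
    static final BulkCopier LOCAL_BULK_COPIER = tasks -> {
        for (CopyTask task : tasks) {
            Files.copy(task.source, task.destination, StandardCopyOption.REPLACE_EXISTING);
        }
    };

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("native-copy-sketch");
        List<CopyTask> streamTasks = new ArrayList<>();
        List<CopyTask> bulkTasks = new ArrayList<>();

        // Create a few small files, the case where per-file overhead matters most.
        for (int i = 0; i < 3; i++) {
            Path source = dir.resolve("file-" + i + ".sst");
            Files.write(source, ("state-" + i).getBytes(StandardCharsets.UTF_8));
            streamTasks.add(new CopyTask(source, dir.resolve("file-" + i + ".stream")));
            bulkTasks.add(new CopyTask(source, dir.resolve("file-" + i + ".bulk")));
        }

        copyViaStreams(streamTasks);
        LOCAL_BULK_COPIER.copyFiles(bulkTasks);

        // Both paths should produce identical copies of every source file.
        for (int i = 0; i < streamTasks.size(); i++) {
            byte[] viaStream = Files.readAllBytes(streamTasks.get(i).destination);
            byte[] viaBulk = Files.readAllBytes(bulkTasks.get(i).destination);
            System.out.println("file-" + i + " copies match: " + Arrays.equals(viaStream, viaBulk));
        }
    }
}
```

The point of the batch-shaped interface is that the caller no longer dictates how bytes move: an implementation is free to parallelize transfers or shell out to an external tool, which the per-file stream loop cannot do.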