Hi

Yes, for your use case, if your state size is not large, you can try FsStateBackend.

Best,
Congxian
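For reference, a minimal sketch of that switch (against the Flink 1.10-era API; the checkpoint path and job name below are placeholders, not taken from the thread):

```java
import org.apache.flink.runtime.state.filesystem.FsStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class FsBackendJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // FsStateBackend keeps working state on the TaskManager heap and only
        // writes snapshots to the configured path at checkpoint time, so it
        // opens far fewer HDFS files than RocksDB incremental uploads. It is
        // only suitable when the state fits in memory.
        // The HDFS path below is a placeholder.
        env.setStateBackend(new FsStateBackend("hdfs:///flink/checkpoints"));
        env.enableCheckpointing(60_000); // checkpoint every 60 seconds

        // ... job topology goes here ...

        env.execute("fs-state-backend-job");
    }
}
```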
ysnakie <ysna...@hotmail.com> wrote on Mon, Apr 27, 2020, 3:42 PM:

> Hi
> If I use FsStateBackend instead of RocksDBStateBackend, will the number of
> open files decrease significantly? I don't have a large state size.
>
> Thanks

> On 4/25/2020 13:48, Congxian Qiu <qcx978132...@gmail.com> wrote:
>
> Hi
> If there are indeed that many files to upload to HDFS, then currently we
> do not have any way to limit the number of open files. There is an
> issue [1] that aims to fix this problem, and a PR for it; maybe you can
> try the attached PR to see whether it solves your problem.
>
> [1] https://issues.apache.org/jira/browse/FLINK-11937
>
> Best,
> Congxian
>
> ysnakie <ysna...@hotmail.com> wrote on Fri, Apr 24, 2020, 11:30 PM:
>
>> Hi everyone
>> We have a Flink job that writes files to different HDFS directories. It
>> opens many files because of its high parallelism. I also found that with
>> the RocksDB state backend, even more files are open during checkpointing.
>> We use YARN to schedule the Flink job, but YARN keeps scheduling the
>> TaskManagers onto the same machine and I cannot control it! As a result
>> the DataNode comes under very high pressure and keeps throwing a "bad
>> link" error. We have already increased the xcievers limit of HDFS to
>> 16384.
>>
>> Any idea how to solve this problem? Either reduce the number of open
>> files or control the YARN scheduling to place the TaskManagers on
>> different machines!
>>
>> Thank you very much!
>> Regards,
>>
>> Shengnan
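For contrast, a minimal sketch of the RocksDB backend setup the thread refers to (again Flink 1.10-era API; path and job name are placeholders). With incremental checkpoints enabled, each checkpoint uploads the newly created SST files to HDFS, which is where the large number of concurrently open files comes from:

```java
import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class RocksDbBackendJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // The second constructor argument enables incremental checkpoints:
        // each checkpoint uploads the RocksDB SST files created since the
        // previous one. With high parallelism, many subtasks upload small
        // files to HDFS at the same time (the problem FLINK-11937 targets).
        env.setStateBackend(
                new RocksDBStateBackend("hdfs:///flink/checkpoints", true));
        env.enableCheckpointing(60_000); // checkpoint every 60 seconds

        env.execute("rocksdb-backend-job");
    }
}
```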