Hi

Yes, for your use case, if your state size is not large, you can try FsStateBackend.

Best,
Congxian
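For reference, a minimal sketch of that switch (against the Flink 1.10-era API; the checkpoint path and job name below are placeholders, not taken from the thread):

```java
import org.apache.flink.runtime.state.filesystem.FsStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class FsBackendJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // FsStateBackend keeps working state on the TaskManager heap and only
        // writes snapshots to the configured path at checkpoint time, so it
        // opens far fewer HDFS files than RocksDB incremental uploads. It is
        // only suitable when the state fits in memory.
        // The HDFS path below is a placeholder.
        env.setStateBackend(new FsStateBackend("hdfs:///flink/checkpoints"));
        env.enableCheckpointing(60_000); // checkpoint every 60 seconds

        // ... job topology goes here ...

        env.execute("fs-state-backend-job");
    }
}
```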
ysnakie <ysna...@hotmail.com> wrote on Mon, Apr 27, 2020, 3:42 PM:

> Hi
> If I use FsStateBackend instead of RocksDBStateBackend, will the number of
> open files decrease significantly? I don't have a large state size.
>
> Thanks

> On 4/25/2020 13:48, Congxian Qiu <qcx978132...@gmail.com> wrote:
>
> Hi
> If there are indeed that many files to upload to HDFS, then currently we
> do not have any way to limit the number of open files. There is an
> issue [1] that aims to fix this problem, and a PR for it; maybe you can
> try the attached PR to see whether it solves your problem.
>
> [1] https://issues.apache.org/jira/browse/FLINK-11937
>
> Best,
> Congxian
>
> ysnakie <ysna...@hotmail.com> wrote on Fri, Apr 24, 2020, 11:30 PM:
>
>> Hi everyone
>> We have a Flink job that writes files to different HDFS directories. It
>> opens many files because of its high parallelism. I also found that with
>> the RocksDB state backend, even more files are open during checkpointing.
>> We use YARN to schedule the Flink job, but YARN keeps scheduling the
>> TaskManagers onto the same machine and I cannot control it! As a result
>> the DataNode comes under very high pressure and keeps throwing a "bad
>> link" error. We have already increased the xcievers limit of HDFS to
>> 16384.
>>
>> Any idea how to solve this problem? Either reduce the number of open
>> files or control the YARN scheduling to place the TaskManagers on
>> different machines!
>>
>> Thank you very much!
>> Regards,
>>
>> Shengnan
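For contrast, a minimal sketch of the RocksDB backend setup the thread refers to (again Flink 1.10-era API; path and job name are placeholders). With incremental checkpoints enabled, each checkpoint uploads the newly created SST files to HDFS, which is where the large number of concurrently open files comes from:

```java
import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class RocksDbBackendJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // The second constructor argument enables incremental checkpoints:
        // each checkpoint uploads the RocksDB SST files created since the
        // previous one. With high parallelism, many subtasks upload small
        // files to HDFS at the same time (the problem FLINK-11937 targets).
        env.setStateBackend(
                new RocksDBStateBackend("hdfs:///flink/checkpoints", true));
        env.enableCheckpointing(60_000); // checkpoint every 60 seconds

        env.execute("rocksdb-backend-job");
    }
}
```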