[ https://issues.apache.org/jira/browse/FLINK-6873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Flink Jira Bot updated FLINK-6873: ---------------------------------- Labels: auto-deprioritized-major auto-deprioritized-minor (was: auto-deprioritized-major stale-minor) Priority: Not a Priority (was: Minor) This issue was labeled "stale-minor" 7 days ago and has not received any updates so it is being deprioritized. If this ticket is actually Minor, please raise the priority and ask a committer to assign you the issue or revive the public discussion. > Limit the number of open writers in file system connector > --------------------------------------------------------- > > Key: FLINK-6873 > URL: https://issues.apache.org/jira/browse/FLINK-6873 > Project: Flink > Issue Type: Improvement > Components: Connectors / Common, Connectors / FileSystem, Runtime / > Task > Reporter: Mu Kong > Priority: Not a Priority > Labels: auto-deprioritized-major, auto-deprioritized-minor > > Mail list discuss: > https://mail.google.com/mail/u/1/#label/MailList%2Fflink-dev/15c869b2a5b20d43 > Following exception will occur when Flink is writing to too many files: > {code} > java.lang.OutOfMemoryError: unable to create new native thread > at java.lang.Thread.start0(Native Method) > at java.lang.Thread.start(Thread.java:714) > at org.apache.hadoop.hdfs.DFSOutputStream.start(DFSOutputStream.java:2170) > at > org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1685) > at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1689) > at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1624) > at > org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:448) > at > org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:444) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:459) > at > org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:387) > at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:909) > at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:890) > at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:787) > at > org.apache.flink.streaming.connectors.fs.StreamWriterBase.open(StreamWriterBase.java:120) > at > org.apache.flink.streaming.connectors.fs.StringWriter.open(StringWriter.java:62) > at > org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.openNewPartFile(BucketingSink.java:545) > at > org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.invoke(BucketingSink.java:440) > at > org.apache.flink.streaming.api.operators.StreamSink.processElement(StreamSink.java:41) > at > org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:528) > at > org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:503) > at > org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:483) > at > org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:891) > at > org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:869) > at > org.apache.flink.streaming.api.operators.StreamSourceContexts$NonTimestampContext.collect(StreamSourceContexts.java:103) > at > org.apache.flink.streaming.connectors.kafka.internals.AbstractFetcher.emitRecord(AbstractFetcher.java:230) > at > org.apache.flink.streaming.connectors.kafka.internals.SimpleConsumerThread.run(SimpleConsumerThread.java:379) > {code} > Letting developers decide the max open connections to the open files would be > great. -- This message was sent by Atlassian Jira (v8.20.1#820001)