Hi Robert,

Thank you for confirming that there is an issue. I do not have a solution for it and would like to hear the committers' insight into what exactly is wrong there.
I think there are actually two issues here: the first is that the HBase InputFormat does not close its connection in close(); the second is that DataSourceNode never calls close() on the format it configures. A rough sketch of the cleanup I have in mind is at the bottom of this mail, below the quoted thread.

Cheers,
Mark

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Thursday, August 27, 2020 3:30 PM, Robert Metzger <rmetz...@apache.org> wrote:

> Hi Mark,
>
> Thanks a lot for your message and the good investigation! I believe you've
> found a bug in Flink. I filed an issue for the problem:
> https://issues.apache.org/jira/browse/FLINK-19064.
>
> Would you be interested in opening a pull request to fix this?
> Otherwise, I'm sure a committer will pick up the issue soon.
>
> I'm not aware of a simple workaround for the problem.
>
> Best,
> Robert
>
> On Wed, Aug 26, 2020 at 4:05 PM Mark Davis <moda...@protonmail.com> wrote:
>
>> Hi,
>>
>> I am trying to investigate a problem with non-released resources in my
>> application.
>>
>> I have a stateful application which submits Flink DataSet jobs using code
>> very similar to the code in CliFrontend.
>> I noticed that I am getting a lot of non-closed connections to my data
>> store (HBase in my case). The connections are held by the application,
>> not by the jobs themselves.
>>
>> I am using HBaseRowDataInputFormat, and it seems that HBase connections
>> opened in the configure() method during job graph creation (before the
>> job is executed) are never closed. My search led me to the method
>> DataSourceNode.computeOperatorSpecificDefaultEstimates(DataStatistics),
>> where I see that a format is not closed after being configured.
>>
>> Is that correct? How can I overcome this issue?
>>
>> My application is long-running, which is probably why I observe the
>> resource leak. If I spawned a new JVM to run each job, this problem
>> would not be noticeable.
>>
>> Thank you!
>>
>> Cheers,
>> Mark
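
P.S. For illustration only, here is a rough Java sketch of the cleanup I have in mind: configure the format just for the estimation step and then release it again. The class and method names below are my own hypothetical ones, not the actual DataSourceNode code, and I am assuming close()/closeInputFormat() are safe to call on a format that was never opened; that would need to be verified against the real HBase format.

import java.io.IOException;

import org.apache.flink.api.common.io.InputFormat;
import org.apache.flink.api.common.io.RichInputFormat;
import org.apache.flink.configuration.Configuration;

// Illustration only: configure an InputFormat just for the estimation step and
// make sure it is released again, so that anything opened in configure()
// (an HBase connection in my case) does not leak in a long-running client.
public final class EstimateWithCleanup {

    public static void configureForEstimatesOnly(InputFormat<?, ?> format, Configuration parameters) {
        format.configure(parameters);
        try {
            // ... use the configured format to derive the default size/cardinality
            // estimates, e.g. from format.getStatistics(...) ...
        } finally {
            try {
                if (format instanceof RichInputFormat) {
                    // rich formats may hold resources at the "input format" level
                    ((RichInputFormat<?, ?>) format).closeInputFormat();
                }
                // assumes close() tolerates being called although open() never was
                format.close();
            } catch (IOException e) {
                // best-effort cleanup: the estimation step should not fail because of it
            }
        }
    }
}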