I have come across a few similar issues while (mis)using the HiveCatalog.
My analysis was that the HiveCatalog owns the Hive client connection pool
but shares it with the underlying TableOperations. Depending on the
application the HiveCatalog can be closed (or its finalize method run after
its GC-
It could be that there are two separate flaky-test issues, one in Flink and
one in Spark, each caused by not releasing connections. I don't think that the
HiveCatalog code has been changed much recently, which would point toward
problems elsewhere.
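The ownership concern raised at the top of this thread (a catalog owns the client pool but hands it to table operations that may outlive it) can be sketched roughly as follows. This is only an illustration under stated assumptions: DemoPool, DemoOps and DemoCatalog are hypothetical stand-ins written for this example, not Iceberg or Hive classes, and the real HiveCatalog may behave differently.

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class SharedPoolDemo {
  // Hypothetical pool; stands in for a pool of metastore clients.
  static final class DemoPool implements AutoCloseable {
    private final AtomicBoolean closed = new AtomicBoolean(false);

    String run(String action) {
      if (closed.get()) {
        throw new IllegalStateException("pool is closed");
      }
      return "ok:" + action;
    }

    @Override
    public void close() {
      closed.set(true);
    }
  }

  // Hypothetical table operations that borrow the pool without owning it.
  static final class DemoOps {
    private final DemoPool pool;

    DemoOps(DemoPool pool) {
      this.pool = pool;
    }

    String refresh() {
      return pool.run("refresh");
    }
  }

  // Hypothetical catalog that owns the pool and shares it with every DemoOps.
  static final class DemoCatalog implements AutoCloseable {
    private final DemoPool pool = new DemoPool();

    DemoOps newOps() {
      return new DemoOps(pool);
    }

    @Override
    public void close() {
      pool.close();
    }
  }

  // Refreshes once while the catalog is open, closes the catalog,
  // then reports what happens when the surviving DemoOps refreshes again.
  static String refreshAfterClose() {
    DemoCatalog catalog = new DemoCatalog();
    DemoOps ops = catalog.newOps();
    ops.refresh();   // succeeds: the owner is still open
    catalog.close(); // the owner goes away and takes the shared pool with it
    try {
      return ops.refresh();
    } catch (IllegalStateException e) {
      return "error:" + e.getMessage();
    }
  }

  public static void main(String[] args) {
    System.out.println(refreshAfterClose());
  }
}
```

In this sketch the borrower fails as soon as the owner is closed. One possible way around that, assuming the design allows it, is to keep the owner alive for as long as any borrower is in use.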
I think one good reason to use HiveCatalog is to catch problems like the
I use the try-with-resources pattern in the FLIP-27 dev branch. I saw this
problem in Flink tests with the master branch too (although less likely).
With the FLIP-27 dev branch and an additional DeleteReadTests, it happened
almost 100% of the time.
The Spark module (in the master branch) also has this fl
OK, there's a try-with-resources block to close the TableLoader in
FlinkInputFormat [1], so we don't have to add the extra try-with-resources
in PR 2051 (I will close that).
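For readers less familiar with the pattern, here is a minimal sketch of how try-with-resources releases a connection even when loading fails. DemoLoader is a hypothetical class made up for this example; it is not Iceberg's TableLoader and only mimics an open/close lifecycle with a counter.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class LoaderDemo {
  // Counts hypothetical connections that are currently open.
  static final AtomicInteger OPEN = new AtomicInteger();

  // Hypothetical loader that "opens" a connection when constructed.
  static final class DemoLoader implements AutoCloseable {
    DemoLoader() {
      OPEN.incrementAndGet();
    }

    String load(String name) {
      if (name.isEmpty()) {
        throw new IllegalArgumentException("empty table name");
      }
      return "table:" + name;
    }

    @Override
    public void close() {
      OPEN.decrementAndGet();
    }
  }

  // Returns the loaded name, or "failed" when loading throws.
  // close() runs in both cases before this method returns.
  static String loadSafely(String name) {
    try (DemoLoader loader = new DemoLoader()) {
      return loader.load(name);
    } catch (IllegalArgumentException e) {
      return "failed";
    }
  }

  public static void main(String[] args) {
    System.out.println(loadSafely("db.t"));
    System.out.println(loadSafely(""));
    System.out.println("open connections: " + OPEN.get());
  }
}
```

Both the successful call and the failing call leave the open-connection counter at zero, which is the property a leak-free test would need.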
On my host, I could not reproduce your connection leak issues when
running TestFlinkInputFormatReaderDeletes. Did you have an
> I was able to almost 100% reproduce the HiveMetaStoreClient aborted
> connection problem locally with Flink tests after adding
> another DeleteReadTests for the new FLIP-27 source impl in my dev branch
I think I found the reason it fails so easily. The
TestFlinkInputFormatReaderDeletes will crea
Ryan/OpenInx, thanks a lot for the pointers.
I was able to almost 100% reproduce the HiveMetaStoreClient aborted
connection problem locally with Flink tests after adding
another DeleteReadTests for the new FLIP-27 source impl in my dev branch. I
don't see the problem anymore after switching the Fl
I encountered a similar issue when supporting hive-site.xml for the Flink Hive
catalog. Here is the earlier discussion and solution:
https://github.com/apache/iceberg/pull/1586#discussion_r509453461
It's a connection leak issue.
On Thu, Jan 7, 2021 at 10:06 AM Ryan Blue wrote:
> I've noticed this
I've noticed this too. I haven't had a chance to track down what's causing
it yet. I've seen it in Spark tests, so it looks like there may be a
problem that affects both. Probably a connection leak in the common code.
On Wed, Jan 6, 2021 at 3:44 PM Steven Wu wrote:
> I have noticed some flakines
I have noticed some flakiness with Flink and Spark tests both locally and
in CI checks. @zhangjun0x01 also reported the same problem with
iceberg-spark3-extensions. Below is a full stack trace from a local run
for Flink tests.
The flakiness might be a recent regression, as the tests were stable for