Re: test flakiness with SocketException of broken pipe in HiveMetaStoreClient

2021-01-11 Thread Ryan Murray
I have come across a few similar issues while (mis)using the HiveCatalog. My analysis was that the HiveCatalog owns the hive client connection pool but shares it with the underlying TableOperations. Depending on the application the HiveCatalog can be closed (or its finalize method run after its GC-
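
The ownership problem described in this message can be sketched as below. This is only an illustration under stated assumptions: `SketchClientPool`, `SketchCatalog`, and `SketchTableOps` are hypothetical stand-ins, not Iceberg's classes, and the sketch raises an `IllegalStateException` where the real metastore client would more likely surface a broken-pipe `SocketException`.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a metastore client pool (not Iceberg's class).
final class SketchClientPool implements AutoCloseable {
  private final AtomicBoolean closed = new AtomicBoolean(false);

  // Pretend to run a request on a pooled connection.
  String run(String request) {
    if (closed.get()) {
      // Assumption: a real client would fail with a socket error instead.
      throw new IllegalStateException("pool closed: " + request);
    }
    return "ok: " + request;
  }

  @Override
  public void close() {
    closed.set(true);
  }
}

// Hypothetical table operations that borrow the catalog's pool.
final class SketchTableOps {
  private final SketchClientPool pool;

  SketchTableOps(SketchClientPool pool) {
    this.pool = pool;
  }

  String refresh() {
    return pool.run("refresh");
  }
}

// Hypothetical catalog that owns the pool but shares it with its operations.
final class SketchCatalog implements AutoCloseable {
  private final SketchClientPool pool = new SketchClientPool();

  SketchTableOps newTableOps() {
    return new SketchTableOps(pool);
  }

  // Closing the owner also closes the pool that outstanding operations still hold.
  @Override
  public void close() {
    pool.close();
  }
}
```

In this sketch, a `SketchTableOps` obtained before `SketchCatalog.close()` works until the catalog is closed and fails on its next call afterwards, which is the shape of the failure the message describes.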

Re: test flakiness with SocketException of broken pipe in HiveMetaStoreClient

2021-01-08 Thread Ryan Blue
It could be that there are two separate flaky test issues with not releasing connections in Flink and Spark. I don't think that the HiveCatalog code has been changed much recently, which would point toward problems elsewhere. I think one good reason to use HiveCatalog is to catch problems like the

Re: test flakiness with SocketException of broken pipe in HiveMetaStoreClient

2021-01-08 Thread Steven Wu
I use the try-with-resource pattern in the FLIP-27 dev branch. I saw this problem in Flink tests with the master branch too (although less often). With the FLIP-27 dev branch and an additional DeleteReadTests, it happened almost 100% of the time. The Spark module (in the master branch) also has this fl
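
For readers unfamiliar with the pattern mentioned here, a minimal sketch of try-with-resources follows. `SketchLoader` and `SketchLoaderDemo` are hypothetical names used only for illustration and are not the Flink or Iceberg `TableLoader`; the open-handle counter exists only so the cleanup is observable.

```java
import java.io.Closeable;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical closeable resource that tracks how many instances are still open.
final class SketchLoader implements Closeable {
  static final AtomicInteger OPEN = new AtomicInteger();

  SketchLoader() {
    OPEN.incrementAndGet();
  }

  String loadTable() {
    return "table";
  }

  // Narrowed override: this close() does not throw IOException.
  @Override
  public void close() {
    OPEN.decrementAndGet();
  }
}

final class SketchLoaderDemo {
  // The loader is closed when the try block exits, on both normal and exceptional paths.
  static String readWithCleanup() {
    try (SketchLoader loader = new SketchLoader()) {
      return loader.loadTable();
    }
  }
}
```

After `SketchLoaderDemo.readWithCleanup()` returns, `SketchLoader.OPEN` is back at zero, so no handle is left behind by that call.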

Re: test flakiness with SocketException of broken pipe in HiveMetaStoreClient

2021-01-08 Thread OpenInx
OK, there's a try-with-resource to close the TableLoader in FlinkInputFormat [1], so we don't have to do the extra try-with-resource in PR 2051 (I will close that). On my host, I could not reproduce your connection leak issues when running TestFlinkInputFormatReaderDeletes. Did you have an

Re: test flakiness with SocketException of broken pipe in HiveMetaStoreClient

2021-01-07 Thread OpenInx
> I was able to almost 100% reproduce the HiveMetaStoreClient aborted connection problem locally with Flink tests after adding another DeleteReadTests for the new FLIP-27 source impl in my dev branch
I think I found the reason it fails so easily. The TestFlinkInputFormatReaderDeletes will crea

Re: test flakiness with SocketException of broken pipe in HiveMetaStoreClient

2021-01-07 Thread Steven Wu
Ryan/OpenInx, thanks a lot for the pointers. I was able to almost 100% reproduce the HiveMetaStoreClient aborted connection problem locally with Flink tests after adding another DeleteReadTests for the new FLIP-27 source impl in my dev branch. I don't see the problem anymore after switching the Fl

Re: test flakiness with SocketException of broken pipe in HiveMetaStoreClient

2021-01-06 Thread OpenInx
I encountered a similar issue when supporting hive-site.xml for the Flink Hive catalog. Here is the earlier discussion and solution: https://github.com/apache/iceberg/pull/1586#discussion_r509453461 . It's a connection leak issue.
On Thu, Jan 7, 2021 at 10:06 AM Ryan Blue wrote:
> I've noticed this

Re: test flakiness with SocketException of broken pipe in HiveMetaStoreClient

2021-01-06 Thread Ryan Blue
I've noticed this too. I haven't had a chance to track down what's causing it yet. I've seen it in Spark tests, so it looks like there may be a problem that affects both. Probably a connection leak in the common code.
On Wed, Jan 6, 2021 at 3:44 PM Steven Wu wrote:
> I have noticed some flakines

test flakiness with SocketException of broken pipe in HiveMetaStoreClient

2021-01-06 Thread Steven Wu
I have noticed some flakiness with Flink and Spark tests both locally and in CI checks. @zhangjun0x01 also reported the same problem with iceberg-spark3-extensions. Below is a full stack trace from a local run for Flink tests. The flakiness might be a recent regression, as the tests were stable for