Carl created FLINK-23730:
----------------------------

Summary: Source from hive sink hbase lost data
Key: FLINK-23730
URL: https://issues.apache.org/jira/browse/FLINK-23730
Project: Flink
Issue Type: Bug
Components: Connectors / HBase, Connectors / Hive
Affects Versions: 1.12.1
Reporter: Carl

Our use case is as follows:

# Hive source: create a Hive table whose metadata is stored in HMS.
# Create an HBase table using the HBase shell.
# Flink SQL DDL: create a Flink table mapped to the HBase table.
# Use the Hive catalog and run a Flink SQL INSERT INTO the HBase Flink table.

If I set the table config option table.exec.hive.infer-source-parallelism = false, the program runs with a parallelism of 1, and the number of records in the result is correct.

However, if I set table.exec.hive.infer-source-parallelism = true, the program runs with a parallelism of 20 (the source parallelism is inferred from the number of splits), and the number of records in the result is incorrect.

I repeated the test many times, and no exception was thrown. So I suspect the problem is related to the higher parallelism. Is data being lost because of the high parallelism?
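
For illustration only, here is a minimal Flink SQL sketch of the setup described above. The report does not include the actual statements, so every name here is a placeholder: the catalog and table names (myhive, mydb, hive_source, hbase_sink, my_hbase_table), the columns, the ZooKeeper address, and the choice of the 'hbase-2.2' connector are all assumptions. It also assumes the Hive catalog is already registered in the SQL client under the name myhive.

```sql
-- Hypothetical names throughout; none of them are taken from the report.

-- Switch between the two runs being compared (Flink 1.12 SQL client syntax).
SET table.exec.hive.infer-source-parallelism=false;

-- Flink table mapped onto an existing HBase table.
-- 'hbase-2.2' is an assumption; 'hbase-1.4' is the other connector shipped with 1.12.
CREATE TABLE hbase_sink (
  rowkey STRING,
  cf ROW<q1 STRING>,
  PRIMARY KEY (rowkey) NOT ENFORCED
) WITH (
  'connector' = 'hbase-2.2',
  'table-name' = 'my_hbase_table',
  'zookeeper.quorum' = 'localhost:2181'
);

-- Read from the Hive table (through the Hive catalog) and write to HBase.
INSERT INTO hbase_sink
SELECT id, ROW(q1) FROM myhive.mydb.hive_source;
```

Running the INSERT once with the option set to false and once with it set to true, then counting the rows in the HBase table after each run, corresponds to the comparison described in this report.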