[ https://issues.apache.org/jira/browse/FLINK-23730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Carl updated FLINK-23730:
-------------------------
    Attachment: image-2021-08-26-09-44-20-390.png

> Source from hive sink hbase lost data
> -------------------------------------
>
>                 Key: FLINK-23730
>                 URL: https://issues.apache.org/jira/browse/FLINK-23730
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / HBase, Connectors / Hive
>    Affects Versions: 1.12.1
>            Reporter: Carl
>            Priority: Major
>         Attachments: image-2021-08-26-09-43-39-055.png, image-2021-08-26-09-44-20-390.png
>
>
> Our use case is as follows:
> # Hive source: create a Hive table whose metadata is stored in HMS
> # Create the HBase table using the HBase shell
> # Flink SQL DDL: create the HBase Flink table
> # Use the Hive catalog: use Flink SQL to insert into the HBase Flink table
>
> If I set the table config table.exec.hive.infer-source-parallelism = false,
> the program runs with a parallelism of 1, and the number of result records
> is correct.
> But if I set table.exec.hive.infer-source-parallelism = true,
> the program runs with a parallelism of 20 (the source parallelism is
> inferred from the number of splits), and the number of result records is
> not correct.
>
> The test was repeated many times and no exception occurred.
>
> So I guess it has something to do with high concurrency. Does it lose data
> because of high concurrency?
>
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
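
For reference, the setup described in the report could look roughly like the following Flink SQL. This is only a sketch under stated assumptions: the catalog name, Hive conf path, table names, column names, column family, HBase connector version, and ZooKeeper address are all hypothetical and do not come from the issue. The `SET` syntax shown is the unquoted form used by the 1.12 SQL client; later releases may expect the key and value to be quoted.

```sql
-- Hypothetical names throughout; adjust to the actual environment.

-- Register and use a Hive catalog backed by HMS.
CREATE CATALOG myhive WITH (
  'type' = 'hive',
  'hive-conf-dir' = '/path/to/hive-conf'   -- assumed location
);
USE CATALOG myhive;

-- Toggle the option from the report. With false, the report observed a
-- parallelism of 1 and correct counts; with true, a parallelism of 20
-- and incorrect counts.
SET table.exec.hive.infer-source-parallelism=false;

-- Flink table mapped onto an HBase table created beforehand in the HBase shell.
CREATE TABLE hbase_sink (
  rowkey STRING,
  cf ROW<name STRING>,                     -- assumed column family "cf"
  PRIMARY KEY (rowkey) NOT ENFORCED
) WITH (
  'connector' = 'hbase-1.4',               -- or 'hbase-2.2', depending on the cluster
  'table-name' = 'my_hbase_table',
  'zookeeper.quorum' = 'localhost:2181'
);

-- Copy rows from the (hypothetical) Hive table into HBase.
INSERT INTO hbase_sink
SELECT id, ROW(name) FROM hive_source;
```

Running the same `INSERT` once with the option set to `false` and once with `true`, then comparing the HBase row count against `SELECT COUNT(*) FROM hive_source`, would reproduce the comparison the reporter describes.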