Hi Caizhi,

 

The Flink version is 1.13.2, and yes, I'm writing data to HBase inside a
RichMapFunction. But even if I switch to a sink (a Table API sink?), wouldn't
the enrichment step in the RichMapFunction still be able to skip messages when
there are problems on the HBase side?


From: Caizhi Weng [mailto:tsreape...@gmail.com] 
Sent: Wednesday, November 24, 2021 4:47 AM
To: Anton <anton...@yandex.ru>
Cc: user <user@flink.apache.org>
Subject: Re: Working with HBase inside RichMapFunction

 

Hi!

 

Which Flink version are you using? Are you putting data into HBase inside a
RichMapFunction instead of using an HBase sink? If so, could you share your
user code? Using an HBase sink is recommended because it checks whether any
error occurs during the operation and, if it spots one, fails the job (which
then restarts from a checkpoint), so that no data is lost.
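
To illustrate the same fail-fast idea inside a map function, here is a minimal
sketch in plain Java with no Flink or HBase dependencies. `FailFastEnricher`
and `Lookup` are hypothetical names standing in for the RichMapFunction and the
HBase read. It assumes the current user code catches the HBase client's
exception and carries on. An exception thrown from a Flink user function fails
the task, so with checkpointing enabled the job should restart from a
checkpoint and re-read the Kafka records instead of skipping them.

```java
import java.io.IOException;

/**
 * Sketch of the body of a map() that refuses to skip records.
 * Not Flink or HBase code: all names here are hypothetical stand-ins.
 */
class FailFastEnricher {

    /** Stand-in for the HBase read; a real client call can throw IOException. */
    interface Lookup {
        String get(String key) throws IOException;
    }

    private final Lookup lookup;

    FailFastEnricher(Lookup lookup) {
        this.lookup = lookup;
    }

    /** Enriches one record, or throws so that the caller (the job) fails. */
    String map(String record) throws IOException {
        try {
            String extra = lookup.get(record);
            return record + "|" + extra;
        } catch (IOException e) {
            // Do not log and return here: that is what silently drops the record.
            // Rethrow with context so the failure reaches the framework.
            throw new IOException("Lookup failed for record " + record, e);
        }
    }

    public static void main(String[] args) throws IOException {
        FailFastEnricher healthy = new FailFastEnricher(key -> "enriched:" + key);
        System.out.println(healthy.map("order-1"));

        FailFastEnricher broken = new FailFastEnricher(key -> {
            throw new IOException("retries exhausted");
        });
        try {
            broken.map("order-2");
        } catch (IOException e) {
            // In a running job this exception would not be caught by user code.
            System.out.println("job would fail: " + e.getMessage());
        }
    }
}
```

In a real RichMapFunction, `map` is already declared with `throws Exception`,
so simply not catching the client's exception has the same effect.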

 

Anton <anton...@yandex.ru> wrote on Wed, Nov 24, 2021 at 4:29 AM:

Hi, I’m using a RichMapFunction to enrich data from a stream read from a Kafka
topic and then put the enriched data into HBase. When there is a failure on the
HBase side, I see in Flink’s log that the HBase client tries several times to
get the necessary data from HBase (I believe it tries
`hbase.client.retries.number` times). Once the retry count is exceeded, the
data is simply lost and the Flink job moves on to the next record in the
stream. So the question is: how can I avoid this data loss? I guess simply
increasing `hbase.client.retries.number` is not an ideal solution.
