Hi,

We are running a medium-sized HBase cluster (12 data nodes) with around 200 TB of data (without replication). When a node fails, full recovery takes on the order of 30 minutes. We're looking for ways to reduce this.

Almost two years ago, we 'discovered' the hbase.wal.split.to.hfile setting, but didn't dare turn it on because of data-loss concerns based on some JIRA tickets in this area. Can anyone comment on its current status? Is it safe to use?

Best regards,
Frens Jan